45. Advanced DynamoDB

DynamoDB Accelerator (DAX):
i) A fully managed, highly available, in-memory cache for DynamoDB
ii) Delivers up to a 10x read performance improvement
iii) Reduces request time from milliseconds to microseconds, even under load
iv) Developers do not need to manage caching logic
v) Compatible with DynamoDB API calls
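
The caching logic that DAX takes off the developer's hands can be pictured as a read-through cache. The sketch below is a toy model of that idea only, not DAX or its client SDK: the class name `ReadThroughCache`, the `load_from_table` callback and the `ttl_seconds` value are all hypothetical names chosen for illustration.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class ReadThroughCache:
    """Toy read-through cache illustrating the idea behind DAX (not DAX itself)."""

    def __init__(self, load_from_table: Callable[[Hashable], Any],
                 ttl_seconds: float = 300.0) -> None:
        self._load = load_from_table   # hypothetical function that reads the table
        self._ttl = ttl_seconds        # illustrative freshness window
        self._items: Dict[Hashable, Tuple[float, Any]] = {}
        self.hits = 0
        self.misses = 0

    def get_item(self, key: Hashable) -> Any:
        now = time.monotonic()
        cached = self._items.get(key)
        if cached is not None and now - cached[0] < self._ttl:
            self.hits += 1             # served from memory, table not touched
            return cached[1]
        self.misses += 1               # fall through to the table, then remember it
        value = self._load(key)
        self._items[key] = (now, value)
        return value


if __name__ == "__main__":
    table = {"user#1": {"name": "Ana"}}   # stand-in for a DynamoDB table
    cache = ReadThroughCache(table.get)
    cache.get_item("user#1")              # miss: loaded from the table
    cache.get_item("user#1")              # hit: served from the cache
    print(cache.hits, cache.misses)
```

Only the first read of a key reaches the table; repeated reads of popular items are answered from memory, which is where the latency gain comes from.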

Transactions:
i) Multiple ‘all or nothing’ operations
ii) Use case: financial transactions
iii) Use case: fulfilling orders
iv) Each item is read or written twice under the hood: once to prepare and once to commit
v) Up to 100 items or 4 MB of data per transaction (the item limit was raised from 25 in 2022)
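
As a rough sketch of the order-fulfilment use case, the function below builds the `TransactItems` list that the low-level boto3 DynamoDB client accepts in `transact_write_items`. The table names `Inventory` and `Orders`, the attribute names and the function name are assumptions made up for this example; only the request shape follows the DynamoDB API.

```python
from typing import Any, Dict, List


def build_order_transaction(order_id: str, product_id: str,
                            quantity: int) -> List[Dict[str, Any]]:
    """Build an all-or-nothing request: reserve stock AND record the order."""
    return [
        {
            "Update": {
                "TableName": "Inventory",          # hypothetical table
                "Key": {"ProductId": {"S": product_id}},
                "UpdateExpression": "SET Stock = Stock - :qty",
                # If stock is too low, this condition fails and the whole
                # transaction is cancelled, including the Put below.
                "ConditionExpression": "Stock >= :qty",
                "ExpressionAttributeValues": {":qty": {"N": str(quantity)}},
            }
        },
        {
            "Put": {
                "TableName": "Orders",             # hypothetical table
                "Item": {
                    "OrderId": {"S": order_id},
                    "ProductId": {"S": product_id},
                    "Quantity": {"N": str(quantity)},
                },
                # Refuse to overwrite an order that already exists.
                "ConditionExpression": "attribute_not_exists(OrderId)",
            }
        },
    ]


if __name__ == "__main__":
    items = build_order_transaction("order-1", "product-9", 2)
    print(len(items), "actions in one transaction")
    # With AWS credentials configured, the request could be sent like this:
    #   import boto3
    #   boto3.client("dynamodb").transact_write_items(TransactItems=items)
```

Either both writes succeed or neither does, so an order is never recorded without the matching stock decrement.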

On-demand capacity:
i) Pay-per-request pricing
ii) Balance cost & performance
iii) No minimum capacity
iv) No charge for provisioned read/write capacity; you pay only for the requests you make, plus storage & backups
v) Pay more per request than with provisioned capacity
vi) Use for new product launches
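
A minimal sketch of how on-demand mode is selected when creating a table: the request sets `BillingMode` to `PAY_PER_REQUEST` and leaves out `ProvisionedThroughput`. The helper name, table name and key name below are hypothetical; the parameter names are those of the DynamoDB `CreateTable` API.

```python
from typing import Any, Dict


def on_demand_table_params(table_name: str, partition_key: str) -> Dict[str, Any]:
    """Build CreateTable parameters for an on-demand (pay-per-request) table."""
    return {
        "TableName": table_name,
        "AttributeDefinitions": [
            {"AttributeName": partition_key, "AttributeType": "S"},
        ],
        "KeySchema": [
            {"AttributeName": partition_key, "KeyType": "HASH"},
        ],
        # On-demand mode: no ProvisionedThroughput block, no minimum capacity.
        "BillingMode": "PAY_PER_REQUEST",
    }


if __name__ == "__main__":
    params = on_demand_table_params("LaunchEvents", "EventId")
    print(params["BillingMode"])
    # With AWS credentials configured, the table could be created like this:
    #   import boto3
    #   boto3.client("dynamodb").create_table(**params)
```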

On-demand backup & restore:
i) Full backups at any time
ii) Zero impact on table performance or availability
iii) Consistent within seconds & retained until deleted
iv) Operates within same region as the source table

Point-in-time recovery (PITR):
i) Complementary to on-demand backup & restore
ii) Protects against accidental writes or deletes
iii) Restore data to any point in the last 35 days
iv) Maintains incremental backups
v) PITR is not enabled by default; it has to be turned on manually for each table
vi) Latest restorable timestamp is typically five minutes in the past.
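
The 35-day retention and the roughly five-minute lag together define a restore window. The sketch below works that window out with plain date arithmetic. It is an approximation under the figures quoted above: the function names are hypothetical, and on a real table the window also cannot start before PITR was enabled. The exact bounds are reported by the `DescribeContinuousBackups` API.

```python
from datetime import datetime, timedelta, timezone
from typing import Tuple

RETENTION = timedelta(days=35)       # how far back PITR can restore
TYPICAL_LAG = timedelta(minutes=5)   # latest restorable point trails "now"


def restorable_window(now: datetime) -> Tuple[datetime, datetime]:
    """Return the approximate (earliest, latest) restorable timestamps."""
    return now - RETENTION, now - TYPICAL_LAG


def can_restore_to(target: datetime, now: datetime) -> bool:
    """True if target falls inside the approximate restore window."""
    earliest, latest = restorable_window(now)
    return earliest <= target <= latest


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(can_restore_to(now - timedelta(days=1), now))     # inside the window
    print(can_restore_to(now - timedelta(minutes=1), now))  # too recent
    print(can_restore_to(now - timedelta(days=36), now))    # too old
```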

Streams:
i) A stream is a time-ordered sequence of item-level changes in a DynamoDB table.
ii) Stream records appear in the same sequence as the actual modifications to the item.
iii) Stream records are retained for 24 hours.
iv) The stream captures inserts, updates & deletes made to items in the DynamoDB table.
v) Structure of a stream: A stream consists of stream records. Each stream record represents a single data modification in the DynamoDB table to which the stream belongs. Each stream record is assigned a sequence number reflecting the order in which the record was published to the stream. Stream records are organized into groups, or shards.
vi) A shard acts as a container for multiple stream records and contains the information required for accessing & iterating through these records.
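
To make the record structure concrete, here is a small sketch that walks a batch of stream records in the shape AWS Lambda receives from a DynamoDB stream (`Records`, `eventName`, and a `dynamodb` block with `Keys` and `SequenceNumber`). The function name and the sample event are made up for illustration; a real consumer would act on each change rather than only list it.

```python
from typing import Any, Dict, List, Tuple


def summarize_stream_records(event: Dict[str, Any]) -> List[Tuple[str, str, Dict[str, Any]]]:
    """Return (sequence number, change type, item key) for each stream record."""
    changes = []
    for record in event.get("Records", []):
        change = record["dynamodb"]
        changes.append((
            change["SequenceNumber"],   # order of publication to the stream
            record["eventName"],        # INSERT, MODIFY or REMOVE
            change["Keys"],             # primary key of the modified item
        ))
    return changes


# Hypothetical sample batch: one insert, one update and one delete.
SAMPLE_EVENT = {
    "Records": [
        {"eventName": "INSERT",
         "dynamodb": {"Keys": {"Id": {"N": "101"}}, "SequenceNumber": "111"}},
        {"eventName": "MODIFY",
         "dynamodb": {"Keys": {"Id": {"N": "101"}}, "SequenceNumber": "222"}},
        {"eventName": "REMOVE",
         "dynamodb": {"Keys": {"Id": {"N": "101"}}, "SequenceNumber": "333"}},
    ]
}

if __name__ == "__main__":
    for sequence_number, event_name, keys in summarize_stream_records(SAMPLE_EVENT):
        print(sequence_number, event_name, keys)
```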

Question 1:
An IT company wants to review its security best-practices after an incident was reported where a new developer on the team was assigned full access to DynamoDB. The developer accidentally deleted a couple of tables from the production environment while building out a new feature. Which is the MOST effective way to address this issue so that such incidents do not recur?
Answer: Use permissions boundary to control the maximum permissions employees can grant to the IAM principals.
Explanation: As an IAM administrator, you can define one or more permissions boundaries using managed policies and allow your employee to create a principal with this boundary. The employee can then attach a permissions policy to this principal. However, the effective permissions of the principal are the intersection of the permissions boundary and permissions policy. As a result, the new principal cannot exceed the boundary that you defined.
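
The intersection rule can be illustrated with plain sets. This is a deliberately simplified model: it treats each policy as a flat set of allowed action names and ignores wildcards, resources, conditions and explicit denies, all of which real IAM evaluation takes into account. The function name and the example action sets are hypothetical.

```python
from typing import Set


def effective_permissions(boundary: Set[str], policy: Set[str]) -> Set[str]:
    """Allowed actions = those granted by BOTH the boundary and the policy."""
    return boundary & policy


if __name__ == "__main__":
    # Boundary set by the administrator: no table deletion allowed.
    boundary = {"dynamodb:GetItem", "dynamodb:PutItem", "dynamodb:Query"}
    # Policy attached to the new developer's principal asks for more.
    policy = {"dynamodb:GetItem", "dynamodb:PutItem", "dynamodb:DeleteTable"}
    print(sorted(effective_permissions(boundary, policy)))
```

Even though the attached policy lists `dynamodb:DeleteTable`, the action falls outside the boundary, so the principal cannot use it.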

Question 2:
A retail company has developed a REST API which is deployed in an Auto Scaling group behind an Application Load Balancer. The API stores the user data in DynamoDB and any static content, such as images, is served via S3. On analyzing the usage trends, it is found that 90% of the read requests are for commonly accessed data across all users. As a Solutions Architect, which of the following would you suggest as the MOST efficient solution to improve the application performance?
Answer: Enable DynamoDB Accelerator (DAX) for DynamoDB and CloudFront for S3

Question 3:
A company uses DynamoDB as a data store for various kinds of customer data, such as user profiles, user events, clicks, and visited links. Some of these use-cases require a high request rate (millions of requests per second), low predictable latency, and reliability. The company now wants to add a caching layer to support high read volumes. As a solutions architect, which of the following AWS services would you recommend as a caching layer for this use-case? (Select two)
Answer: a. DynamoDB Accelerator (DAX)
b. ElastiCache

Question 4:
A retail company has developed a REST API which is deployed in an Auto Scaling group behind an Application Load Balancer. The API stores the user data in DynamoDB and any static content, such as images, is served via S3. On analyzing the usage trends, it is found that 90% of the read requests are for commonly accessed data across all users.
As a Solutions Architect, which of the following would you suggest as the MOST efficient solution to improve the application performance?
Options:
A. Enable ElastiCache Redis for DynamoDB and CloudFront for S3
B. Enable DAX for DynamoDB and ElastiCache Memcached for S3
C. Enable DynamoDB Accelerator (DAX) for DynamoDB and CloudFront for S3
D. Enable ElastiCache Redis for DynamoDB and ElastiCache Memcached for S3
Answer: C
Explanation
Correct option:
Enable DynamoDB Accelerator (DAX) for DynamoDB and CloudFront for S3
DynamoDB Accelerator (DAX) is a fully managed, highly available, in-memory cache for Amazon DynamoDB that delivers up to a 10 times performance improvement—from milliseconds to microseconds—even at millions of requests per second.
DAX is tightly integrated with DynamoDB—you simply provision a DAX cluster, use the DAX client SDK to point your existing DynamoDB API calls at the DAX cluster, and let DAX handle the rest. Because DAX is API-compatible with DynamoDB, you don’t have to make any functional application code changes. DAX is used to natively cache DynamoDB reads.
CloudFront is a content delivery network (CDN) service that delivers static and dynamic web content, video streams, and APIs around the world, securely and at scale. By design, delivering data out of CloudFront can be more cost-effective than delivering it from S3 directly to your users.
When a user requests content that you serve with CloudFront, their request is routed to a nearby Edge Location. If CloudFront has a cached copy of the requested file, CloudFront delivers it to the user, providing a fast (low-latency) response. If the file they’ve requested isn’t yet cached, CloudFront retrieves it from your origin – for example, the S3 bucket where you’ve stored your content.
So, you can use CloudFront to improve application performance to serve static content from S3.
Incorrect options:
Enable ElastiCache Redis for DynamoDB and CloudFront for S3
Amazon ElastiCache for Redis is a blazing fast in-memory data store that provides sub-millisecond latency to power internet-scale real-time applications. Amazon ElastiCache for Redis is a great choice for real-time transactional and analytical processing use cases such as caching, chat/messaging, gaming leaderboards, geospatial, machine learning, media streaming, queues, real-time analytics, and session store.
Although you can integrate Redis with DynamoDB, it’s much more involved than using DAX which is a much better fit.
Enable DAX for DynamoDB and ElastiCache Memcached for S3
Enable ElastiCache Redis for DynamoDB and ElastiCache Memcached for S3
Amazon ElastiCache for Memcached is a Memcached-compatible in-memory key-value store service that can be used as a cache or a data store. Amazon ElastiCache for Memcached is a great choice for implementing an in-memory cache to decrease access latency, increase throughput, and ease the load off your relational or NoSQL database.
ElastiCache cannot be used as a cache to serve static content from S3, so both these options are incorrect.

Question 5:
The engineering team at an in-home fitness company is evaluating multiple in-memory data stores with the ability to power its on-demand, live leaderboard. The company’s leaderboard requires high availability, low latency, and real-time processing to deliver customizable user data for the community of users working out together virtually from the comfort of their home.
As a solutions architect, which of the following solutions would you recommend? (Select two)
Options:
A. Power the on-demand, live leaderboard using DynamoDB with DynamoDB Accelerator (DAX) as it meets the in-memory, high availability, low latency requirements
B. Power the on-demand, live leaderboard using AWS Neptune as it meets the in-memory, high availability, low latency requirements
C. Power the on-demand, live leaderboard using DynamoDB as it meets the in-memory, high availability, low latency requirements
D. Power the on-demand, live leaderboard using RDS Aurora as it meets the in-memory, high availability, low latency requirements
E. Power the on-demand, live leaderboard using ElastiCache Redis as it meets the in-memory, high availability, low latency requirements
Answer: A & E
Explanation
Correct options:
Power the on-demand, live leaderboard using ElastiCache Redis as it meets the in-memory, high availability, low latency requirements
Amazon ElastiCache for Redis is a blazing fast in-memory data store that provides sub-millisecond latency to power internet-scale real-time applications. Amazon ElastiCache for Redis is a great choice for real-time transactional and analytical processing use cases such as caching, chat/messaging, gaming leaderboards, geospatial, machine learning, media streaming, queues, real-time analytics, and session store. ElastiCache for Redis can be used to power the live leaderboard, so this option is correct.
Power the on-demand, live leaderboard using DynamoDB with DynamoDB Accelerator (DAX) as it meets the in-memory, high availability, low latency requirements
Amazon DynamoDB is a key-value and document database that delivers single-digit millisecond performance at any scale. It’s a fully managed, multiregion, multimaster, durable database with built-in security, backup and restore, and in-memory caching for internet-scale applications. DAX is a DynamoDB-compatible caching service that enables you to benefit from fast in-memory performance for demanding applications. So DynamoDB with DAX can be used to power the live leaderboard.
Incorrect options:
Power the on-demand, live leaderboard using AWS Neptune as it meets the in-memory, high availability, low latency requirements – Amazon Neptune is a fast, reliable, fully-managed graph database service that makes it easy to build and run applications that work with highly connected datasets. Neptune is not an in-memory database, so this option is not correct.
Power the on-demand, live leaderboard using DynamoDB as it meets the in-memory, high availability, low latency requirements – DynamoDB is not an in-memory database, so this option is not correct.
Power the on-demand, live leaderboard using RDS Aurora as it meets the in-memory, high availability, low latency requirements – Amazon Aurora is a MySQL and PostgreSQL-compatible relational database built for the cloud, that combines the performance and availability of traditional enterprise databases with the simplicity and cost-effectiveness of open source databases. Amazon Aurora features a distributed, fault-tolerant, self-healing storage system that auto-scales up to 64TB per database instance. Aurora is not an in-memory database, so this option is not correct.

Question 6:
A company uses DynamoDB as a data store for various kinds of customer data, such as user profiles, user events, clicks, and visited links. Some of these use-cases require a high request rate (millions of requests per second), low predictable latency, and reliability. The company now wants to add a caching layer to support high read volumes.
As a solutions architect, which of the following AWS services would you recommend as a caching layer for this use-case? (Select two)
Options:
A. DynamoDB Accelerator (DAX)
B. ElastiCache
C. Elasticsearch
D. RDS
E. Redshift
Answer: A & B
Explanation
Correct options:
DynamoDB Accelerator (DAX) – Amazon DynamoDB Accelerator (DAX) is a fully managed, highly available, in-memory cache for DynamoDB that delivers up to a 10x performance improvement – from milliseconds to microseconds – even at millions of requests per second. DAX does all the heavy lifting required to add in-memory acceleration to your DynamoDB tables, without requiring developers to manage cache invalidation, data population, or cluster management. Therefore, this is a correct option.
ElastiCache – Amazon ElastiCache for Memcached is an ideal front-end for data stores like Amazon RDS or Amazon DynamoDB, providing a high-performance middle tier for applications with extremely high request rates and/or low latency requirements. Therefore, this is also a correct option.
Incorrect options:
RDS – Amazon Relational Database Service (Amazon RDS) makes it easy to set up, operate, and scale a relational database in the cloud. It provides cost-efficient and resizable capacity while automating time-consuming administration tasks such as hardware provisioning, database setup, patching, and backups. RDS cannot be used as a caching layer for DynamoDB.
Elasticsearch – Elasticsearch is a search engine based on the Lucene library. It provides a distributed, multitenant-capable full-text search engine with an HTTP web interface and schema-free JSON documents. It cannot be used as a caching layer for DynamoDB.
Redshift – Amazon Redshift is a fully-managed petabyte-scale cloud-based data warehouse product designed for large scale data set storage and analysis. It cannot be used as a caching layer for DynamoDB.