3. S3

S3 = Simple Storage Service
S3 is used to store objects and is one of the main building blocks of AWS.
It is advertised as 'infinitely scaling' storage, meaning we can store as many objects as we want. Many websites use S3 as a backbone, and many AWS services integrate with it as well. For example, EBS snapshots are actually stored in S3, but we don't see them there.

S3 provides developers and IT teams with secure, durable, highly scalable object storage. S3 is easy to use, with a simple web services interface to store and retrieve any amount of data from anywhere on the web. S3 is a safe place to store files and it is object-based storage. The data is spread across multiple devices and facilities.

Basics of S3:
i) S3 is object based – i.e., it allows us to upload files
ii) Files can be from 0 bytes to 5 TB
iii) There is an unlimited storage
iv) Files are stored in Buckets.
v) S3 is a universal namespace, i.e., bucket names must be unique globally. Creating a bucket actually creates a web address, so the names have to be unique. Suppose we create a bucket named 'testbucket' in Northern Virginia (the default AWS region); then we get a web address like https://testbucket.s3.amazonaws.com/ If we create the bucket in Ireland (eu-west-1) instead, the web address would be https://testbucket.s3.eu-west-1.amazonaws.com/
vi) When we upload a file to S3, we receive an HTTP 200 code if the upload was successful (see the sketch below).
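As a concrete illustration of the basics above, here is a minimal sketch using boto3 (the AWS SDK for Python). The bucket name, region, and file contents are placeholders chosen for this example, not values from these notes.

```python
import boto3

s3 = boto3.client("s3", region_name="eu-west-1")

# Outside us-east-1 a LocationConstraint must be supplied when creating the bucket.
s3.create_bucket(
    Bucket="testbucket-example-12345",  # must be globally unique
    CreateBucketConfiguration={"LocationConstraint": "eu-west-1"},
)

# Upload a small object and check for the HTTP 200 code mentioned above.
response = s3.put_object(
    Bucket="testbucket-example-12345",
    Key="hello.txt",
    Body=b"hello from S3",
)
print(response["ResponseMetadata"]["HTTPStatusCode"])  # prints 200 on success
```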

Use cases:
i) We can do backup and storage
ii) We can use it for disaster recovery and capture data on S3 across different regions.
iii) We can also archive data at very low cost (e.g., in S3 Glacier).
iv) We can have hybrid cloud storage
v) Application and Media hosting
vi) Data lakes and big data analytics

S3 Buckets Overview:
i) S3 allows people to store objects(files) in buckets(directories).
ii) Buckets must have a globally unique name (across all regions all accounts)
iii) Buckets are defined at region level
iv) S3 is a global service but buckets are created in a region
v) Follow the naming convention: no uppercase, no underscores, 3-63 characters long, not an IP address, and must start with a lowercase letter or number.

S3 Objects Overview:
Objects consist of the following:
i) Key – This is simply the name of the object (file).
ii) Value – This is simply the data, made up of a sequence of bytes.
iii) Version ID – The version identifier of the object; it matters when versioning is enabled.
iv) Metadata (data about the data you are storing) – For example, that this object belongs to the Finance department or the HR department.
v) Subresources – ACLs (permissions on the object; we can control access to each object individually or to the entire bucket) and Torrent.
vi) The key represents the full path to the objects. Ex: s3://my-bucket/my_object.txt
vii) The key is composed of prefix + object name. Ex: s3://my-bucket/folder1/folder2/my_object.txt
Here folder1/folder2 is prefix and my_object.txt is an object.
viii) There is no concept of directories within buckets
ix) Just keys with very long names that contain slashes (/)
x) Object values are the content of the body
xi) Max object size is 5 TB (5,000 GB). If uploading more than 5 GB, you must use 'multipart upload' (see the sketch after this list).
xii) Tags (Unicode key-value pairs, up to 10 per object) – useful for security/lifecycle.
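Here is a hedged sketch of a multipart upload with boto3, as referenced in item xi above. boto3's upload_file switches to multipart automatically once the file exceeds the configured threshold; the file path, bucket, and key are placeholders, and the key also illustrates the prefix + object-name structure described in items vi-vii.

```python
import boto3
from boto3.s3.transfer import TransferConfig

s3 = boto3.client("s3")

# Use multipart upload for anything larger than 100 MB, sending parts in parallel.
config = TransferConfig(
    multipart_threshold=100 * 1024 * 1024,
    multipart_chunksize=100 * 1024 * 1024,
    max_concurrency=4,
)

# "folder1/folder2/" is the prefix, "big_video.mp4" is the object name.
s3.upload_file(
    Filename="/tmp/big_video.mp4",
    Bucket="my-bucket",
    Key="folder1/folder2/big_video.mp4",
    Config=config,
)
```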

S3 Data Consistency:
i) Since December 2020, S3 delivers strong read-after-write consistency for all requests: after a successful PUT of a new object, an overwrite PUT, or a DELETE, any subsequent read immediately returns the latest state of the object.
ii) Previously, S3 offered read-after-write consistency only for PUTs of new objects, and eventual consistency for overwrite PUTs and DELETEs. Under that older model, if we updated or deleted an existing file and read it immediately, we might get either the old or the new version, because changes could take a little time to propagate. (This change is also covered in Question 12 below.)

S3 guarantees:
i) S3 is built for 99.99% availability. However, the Amazon SLA guarantees 99.9% availability.
ii) Amazon guarantees 99.999999999% durability (11 9's).

S3 Features:
i) Tiered storage — Different storage tiers are available.
ii) Lifecycle Management — We can move objects between tiers based on their age, e.g., if a file is 30 days old move it to one tier, and if it is 90 days old move it to another tier (see the sketch after this list).
iii) Versioning — We can have multiple versions of objects in S3 buckets. We can also encrypt these objects.
iv) MFA Delete — We use MFA for deletion of objects.
v) Secure data using ACLs and Bucket Policies
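The lifecycle rule mentioned in item ii can also be expressed through the API. Below is a minimal sketch with boto3; the bucket name, prefix, day counts, and target classes are illustrative assumptions only.

```python
import boto3

s3 = boto3.client("s3")

# Transition objects under "logs/" to Standard-IA after 30 days and to Glacier
# after 90 days, then expire them after 365 days.
s3.put_bucket_lifecycle_configuration(
    Bucket="my-bucket",
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "tier-and-expire-logs",
                "Filter": {"Prefix": "logs/"},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": 30, "StorageClass": "STANDARD_IA"},
                    {"Days": 90, "StorageClass": "GLACIER"},
                ],
                "Expiration": {"Days": 365},
            }
        ]
    },
)
```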

S3 Storage Classes:

S3 Standard – 99.99% availability and 99.999999999% durability. Data is stored redundantly across multiple devices in multiple facilities and is designed to sustain the loss of 2 facilities concurrently.
S3 Standard-IA (Infrequently Accessed) – For data that is accessed less frequently, but requires rapid access when needed. Lower storage fee than S3 Standard, but you are charged a retrieval fee.
S3 One Zone-IA – For when you want a lower-cost option for infrequently accessed data, but do not require the multi-AZ data resilience. (The older Reduced Redundancy Storage (RRS) class, now deprecated, filled a similar reduced-resilience role.)
S3 Intelligent-Tiering – Designed to optimize costs by automatically moving data to the most cost-effective access tier, without performance impact or operational overhead.
S3 Glacier – Secure, durable and low-cost storage class for data archiving. We can reliably store any amount of data at costs that are cheaper than on-premises solutions. Retrieval times are configurable from minutes to hours.
S3 Glacier Deep Archive – Lowest-cost storage class, for data where a retrieval time of 12 hours is acceptable.

S3 Comparison:

| | S3 Standard | S3 Intelligent-Tiering | S3 Standard-IA | S3 One Zone-IA | S3 Glacier | S3 Glacier Deep Archive |
| --- | --- | --- | --- | --- | --- | --- |
| Designed for durability | 99.999999999% (11 9's) | 11 9's | 11 9's | 11 9's | 11 9's | 11 9's |
| Designed for availability | 99.99% | 99.9% | 99.9% | 99.5% | N/A | N/A |
| Availability SLA | 99.9% | 99% | 99% | 99% | N/A | N/A |
| Availability Zones | >=3 | >=3 | >=3 | 1 | >=3 | >=3 |
| Min capacity charge per object | N/A | N/A | 128 KB | 128 KB | 40 KB | 40 KB |
| Min storage duration charge | N/A | 30 days | 30 days | 30 days | 90 days | 180 days |
| Retrieval fee | N/A | N/A | per GB retrieved | per GB retrieved | per GB retrieved | per GB retrieved |
| First byte latency | milliseconds | milliseconds | milliseconds | milliseconds | minutes or hours | hours |
Example: Public Access – Use Bucket Policy
We have an S3 bucket and a user who is not part of our account.
Anonymous www website visitor — trying to read files from — S3 Bucket
By default, when we try to read a file using a web browser, access is denied. To solve this, we attach an S3 bucket policy to the bucket that allows public read access (a sketch follows).
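A minimal sketch of such a public-read bucket policy applied with boto3; the bucket name is a placeholder, and note that the bucket's Block Public Access settings must also permit public policies for this to take effect.

```python
import json
import boto3

s3 = boto3.client("s3")

# Allow anonymous (public) reads of every object in the bucket.
public_read_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "PublicReadGetObject",
            "Effect": "Allow",
            "Principal": "*",
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::my-public-bucket/*",
        }
    ],
}

s3.put_bucket_policy(Bucket="my-public-bucket", Policy=json.dumps(public_read_policy))
```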

Example: User Access to S3 – IAM Permissions
We have an S3 bucket and a user within our account. We attach an IAM policy to the user stating that the user can access the S3 bucket. Here we don't need an extra bucket policy as we did for public access.

Example: EC2 instance access – Use IAM Roles
If an EC2 instance needs to access S3 buckets, create an EC2 instance role, attach IAM permissions to that role, and the EC2 instance will then be able to access the S3 buckets (see the sketch below).
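On an instance configured this way, the code does not need any access keys; boto3 automatically picks up the temporary credentials provided by the instance role. A short sketch (bucket name is a placeholder):

```python
import boto3

# No credentials in code: boto3 resolves them from the EC2 instance profile.
s3 = boto3.client("s3")

for obj in s3.list_objects_v2(Bucket="my-bucket").get("Contents", []):
    print(obj["Key"], obj["Size"])
```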

Advanced: Cross Account Access – Use Bucket Policy
Use an extra bucket policy for cross-account access. Suppose we have an IAM user in another account (it is common to have multiple accounts) and we want to give this user access to the contents of our S3 bucket. We then create an S3 bucket policy that allows cross-account access, and the user from the other account will be able to access the S3 bucket.

S3 Bucket Policies:
i) They are JSON-based policies, just like IAM policies, and they look very similar. We define Resource – the buckets and objects the policy applies to; Action – the set of API actions to allow or deny; Effect – Allow or Deny; and Principal – the account or user to apply the policy to.
ii) Use an S3 bucket policy to grant public access to the bucket, to force objects to be encrypted at upload, or to grant access to another account (cross-account access). A sketch of an encryption-enforcing policy follows.
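As an example of forcing encryption at upload, here is a hedged sketch of a bucket policy that denies any PutObject request that does not ask for SSE-S3 (AES256) encryption; the bucket name is a placeholder.

```python
import json
import boto3

s3 = boto3.client("s3")

# Deny uploads that do not set the server-side-encryption header to AES256.
enforce_sse_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyUnencryptedUploads",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:PutObject",
            "Resource": "arn:aws:s3:::my-bucket/*",
            "Condition": {
                "StringNotEquals": {"s3:x-amz-server-side-encryption": "AES256"}
            },
        }
    ],
}

s3.put_bucket_policy(Bucket="my-bucket", Policy=json.dumps(enforce_sse_policy))
```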

S3 Websites:
S3 can host static websites and have them accessible on the www
The website URL will be: <bucket-name>.s3-website-<aws-region>.amazonaws.com or <bucket-name>.s3-website.<aws-region>.amazonaws.com
If we do not make the S3 bucket public in the first place, we get a 403 (Forbidden) error. A sketch of enabling website hosting follows.
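A minimal sketch of enabling static website hosting with boto3; the bucket and document names are placeholders, and the index/error documents must already exist in the (publicly readable) bucket.

```python
import boto3

s3 = boto3.client("s3")

# Serve index.html as the site root and error.html for missing pages.
s3.put_bucket_website(
    Bucket="my-website-bucket",
    WebsiteConfiguration={
        "IndexDocument": {"Suffix": "index.html"},
        "ErrorDocument": {"Key": "error.html"},
    },
)
```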

Tips:
i) S3 is object based and allows us to upload files. Objects (files) are stored in buckets.
ii) File size can be from 0 bytes to 5 TB
iii) There is unlimited storage
iv) S3 is a universal namespace. That is, names must be unique globally.
v) In the default region US East (N. Virginia) the domain name will be
https://testbucket.s3.amazonaws.com/
In other regions (Ireland here) it will be https://testbucket.s3.eu-west-1.amazonaws.com/
vi) Not suitable for installing an OS or a DB, as S3 is object based. We need block-based storage for that.
vii) Successful uploads will generate an HTTP 200 status code
viii) We can turn on MFA Delete
ix) Control access to buckets either using a bucket ACL or using bucket policies.

Question 1:
What two statements correctly describe versioning for protecting data at rest on
S3 buckets?
A. enabled by default
B. overwrites most current file version
C. restores deleted files
D. saves multiple versions of a single file
E. disabled by default
Answer (C,E)

Question 2:
You have a requirement to create an index to search customer objects stored in
S3 buckets. The solution should enable you to create a metadata search index for
each object stored to an S3 bucket. Select the most scalable and cost effective
solution?
A. RDS, ElastiCache
B. DynamoDB, Lambda
C. RDS, EMR, ALB
D. RedShift
Answer (B)

Question 3:
A junior scientist working with the Deep Space Research Laboratory at NASA is trying to upload a high-resolution image of a nebula into Amazon S3. The image size is approximately 3GB. The junior scientist is using S3 Transfer Acceleration (S3TA) for faster image upload. It turns out that S3TA did not result in an accelerated transfer. Given this scenario, which of the following is correct regarding the charges for this image transfer?
Answer: Does not need to pay any transfer charges for the image upload.
Explanation: There are no S3 data transfer charges when data is transferred in from the internet. Also, with S3TA you pay only for transfers that are accelerated. Since S3TA did not result in an accelerated transfer, there are no transfer charges.

Question 4:
An audit department generates and accesses the audit reports only twice in a financial year. The department uses AWS Step Functions to orchestrate the report creating process that has failover and retry scenarios built into the solution. The underlying data to create these audit reports is stored on S3, runs into hundreds of Terabytes and should be available with millisecond latency. As a solutions architect, which is the MOST cost-effective storage class that you would recommend to be used for this use-case?
Answer: Amazon S3 Standard-Infrequent Access (S3 Standard-IA)
Explanation: Since the data is accessed only twice in a financial year but needs rapid access when required, the most cost-effective storage class for this use-case is S3 Standard-IA

Question 5:
The IT department at a consulting firm is conducting a training workshop for new developers. As part of an evaluation exercise on Amazon S3, the new developers were asked to identify the invalid storage class lifecycle transitions for objects stored on S3. Can you spot the INVALID lifecycle transitions from the options below? (Select two)
Answer: S3 Intelligent Tiering => S3 Standard and S3 One Zone IA => S3 Standard IA
Explanation:
Unsupported lifecycle transitions for S3 storage classes:
- Any storage class to the S3 Standard storage class
- Any storage class to the Reduced Redundancy storage class
- The S3 Intelligent-Tiering storage class to the S3 Standard-IA storage class
- The S3 One Zone-IA storage class to the S3 Standard-IA or S3 Intelligent-Tiering storage classes
Supported lifecycle transitions for S3 storage classes:
- The S3 Standard storage class to any other storage class
- Any storage class to the S3 Glacier or S3 Glacier Deep Archive storage classes
- The S3 Standard-IA storage class to the S3 Intelligent-Tiering or S3 One Zone-IA storage classes
- The S3 Intelligent-Tiering storage class to the S3 One Zone-IA storage class
- The S3 Glacier storage class to the S3 Glacier Deep Archive storage class

Question 6:
A leading video streaming service delivers billions of hours of content from Amazon S3 to customers around the world. Amazon S3 also serves as the data lake for its big data analytics solution. The data lake has a staging zone where intermediary query results are kept only for 24 hours. These results are also heavily referenced by other parts of the analytics pipeline. Which of the following is the MOST cost-effective strategy for storing this intermediary query data?
Answer: Store the intermediary query results in S3 Standard storage class.
Explanation:
S3 Standard offers high durability, availability, and performance object storage for frequently accessed data. Because it delivers low latency and high throughput, S3 Standard is appropriate for a wide variety of use cases, including cloud applications, dynamic websites, content distribution, mobile and gaming applications, and big data analytics. As there is no minimum storage duration charge and no retrieval fee (remember that intermediary query results are heavily referenced by other parts of the analytics pipeline), this is the MOST cost-effective storage class amongst the given options.
The S3 Intelligent-Tiering storage class is designed to optimize costs by automatically moving data to the most cost-effective access tier, without performance impact or operational overhead. It works by storing objects in two access tiers: one tier that is optimized for frequent access and another lower-cost tier that is optimized for infrequent access. The minimum storage duration charge is 30 days, so this option is NOT cost-effective because intermediary query results need to be kept only for 24 hours. Hence this option is not correct.
S3 Standard-IA is for data that is accessed less frequently but requires rapid access when needed. S3 Standard-IA offers the high durability, high throughput, and low latency of S3 Standard, with a low per GB storage price and per GB retrieval fee. This combination of low cost and high performance makes S3 Standard-IA ideal for long-term storage, backups, and as a data store for disaster recovery files. The minimum storage duration charge is 30 days, so this option is NOT cost-effective because intermediary query results need to be kept only for 24 hours. Hence this option is not correct.
S3 One Zone-IA is for data that is accessed less frequently but requires rapid access when needed. Unlike other S3 Storage Classes which store data in a minimum of three Availability Zones (AZs), S3 One Zone-IA stores data in a single AZ and costs 20% less than S3 Standard-IA. The minimum storage duration charge is 30 days, so this option is NOT cost-effective because intermediary query results need to be kept only for 24 hours. Hence this option is not correct.
To summarize again, S3 Intelligent-Tiering, S3 Standard-IA, and S3 One Zone-IA have a minimum storage duration charge of 30 days (so instead of 24 hours, you end up paying for 30 days). S3 Standard-IA and S3 One Zone-IA also have retrieval charges (as the results are heavily referenced by other parts of the analytics pipeline, so the retrieval costs would be pretty high). Therefore, these 3 storage classes are not cost optimal for the given use-case.

Question 7:
A media agency stores its re-creatable assets on Amazon S3 buckets. The assets are accessed by a large number of users for the first few days and the frequency of access falls drastically after a week. Although the assets would be accessed only occasionally after the first week, they must continue to be immediately accessible when required. The cost of maintaining all the assets on S3 storage is turning out to be very expensive and the agency is looking at reducing costs as much as possible. As a Solutions Architect, can you suggest a way to lower the storage costs while fulfilling the business requirements?
Answer: Configure a lifecycle policy to transition the objects to S3 One Zone-Infrequent Access (S3 One Zone-IA) after 30 days.
Explanation: S3 One Zone-IA is for data that is accessed less frequently, but requires rapid access when needed. Unlike other S3 Storage Classes which store data in a minimum of three Availability Zones (AZs), S3 One Zone-IA stores data in a single AZ and costs 20% less than S3 Standard-IA. S3 One Zone-IA is ideal for customers who want a lower-cost option for infrequently accessed and re-creatable data but do not require the availability and resilience of S3 Standard or S3 Standard-IA. The minimum storage duration is 30 days before you can transition objects from S3 Standard to S3 One Zone-IA. S3 One Zone-IA offers the same high durability, high throughput, and low latency of S3 Standard, with a low per GB storage price and per GB retrieval fee. S3 Storage Classes can be configured at the object level, and a single bucket can contain objects stored across S3 Standard, S3 Intelligent-Tiering, S3 Standard-IA, and S3 One Zone-IA. You can also use S3 Lifecycle policies to automatically transition objects between storage classes without any application changes.

Question 8:
A news network uses Amazon S3 to aggregate the raw video footage from its reporting teams across the US. The news network has recently expanded into new geographies in Europe and Asia. The technical teams at the overseas branch offices have reported huge delays in uploading large video files to the destination S3 bucket. Which of the following are the MOST cost-effective options to improve the file upload speed into S3? (Select two)
Answer: a. Use Amazon S3 Transfer Acceleration to enable faster file uploads into the destination S3 bucket.
b. Use multipart uploads for faster file uploads into the destination S3 buckets.
Explanation: Amazon S3 Transfer Acceleration enables fast, easy, and secure transfers of files over long distances between your client and an S3 bucket. Transfer Acceleration takes advantage of Amazon CloudFront’s globally distributed edge locations. As the data arrives at an edge location, data is routed to Amazon S3 over an optimized network path.
Multipart upload allows you to upload a single object as a set of parts. Each part is a contiguous portion of the object’s data. You can upload these object parts independently and in any order. If transmission of any part fails, you can retransmit that part without affecting other parts. After all parts of your object are uploaded, Amazon S3 assembles these parts and creates the object. In general, when your object size reaches 100 MB, you should consider using multipart uploads instead of uploading the object in a single operation. Multipart upload provides improved throughput, therefore it facilitates faster file uploads.
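For reference, here is a hedged sketch of turning on Transfer Acceleration for a bucket and uploading through the accelerate endpoint with boto3; the bucket and file names are placeholders.

```python
import boto3
from botocore.config import Config

s3 = boto3.client("s3")

# Enable Transfer Acceleration on the destination bucket.
s3.put_bucket_accelerate_configuration(
    Bucket="raw-footage-bucket",
    AccelerateConfiguration={"Status": "Enabled"},
)

# Upload via the accelerate endpoint (routed through CloudFront edge locations).
s3_accel = boto3.client("s3", config=Config(s3={"use_accelerate_endpoint": True}))
s3_accel.upload_file("clip.mp4", "raw-footage-bucket", "incoming/clip.mp4")
```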

Question 9:
Your company is designing a web application that stores static content in an Amazon S3 bucket. As a non-functional requirement of the application, this bucket must handle more than 150 PUT requests per second quickly.
What should you do to ensure optimal performance?
Options:
A. Use a random prefix for object key name
B. Use a prefix such as date for object key name
C. Use a multi part upload
D. Enable S3 lifecycle rule
Answer: B
Explanation:
Option B is the correct answer. Amazon S3 automatically scales so that it can support at least 3,500 requests per second when adding data (PUT/COPY/POST/DELETE) and 5,500 requests per second when retrieving data (GET/HEAD), per prefix. Previously, randomized key prefixes were needed for S3 to deliver this performance, but S3's current design supports these request rates without special prefix schemes, so a readable prefix such as a date is fine.
Option A is incorrect. This was previously recommended, but with improved S3 request rate performance you no longer need to make object prefixes random.
Option C is incorrect. Multipart upload is a feature used when uploading large files to S3 and has no effect on this requirement.
Option D is incorrect. S3 lifecycle rules have nothing to do with improving request performance.

Question 10:
Your company wants to use AWS as a mechanism for managing their documents. Documents stored by your company may be used frequently in the early stages, but after four months they will be used less frequently, so you will need to archive the documents appropriately.
Which AWS service settings do you need to configure to meet this requirement?
Options:
A. Set a life cycle rule to store data in EBS and move to S3 after 4 months
B. Set a life cycle rule to store data in S3 Standard and move to Glacier after 4 months
C. Set a life cycle rule to store data in EFS and move to Glacier after 4 months
D. Set a life cycle rule to store data in S3 RRS and move to Glacier after 4 months
Answer: B
Explanation:
Documents are stored in S3, and the life cycle policy is set to move to a storage type with lower cost.
In the early stages, documents are accessed frequently, so you need a storage type with suitable access efficiency, like S3 Standard. After that, it is common to use Glacier (or Glacier deep archive) as storage for long-term storage.

Question 11:
The following bucket policy sets permissions for S3 buckets.

{
  "Version": "2012-10-17",
  "Id": "S3PolicyId1",
  "Statement": [
    {
      "Sid": "IPAllow",
      "Effect": "Deny",
      "Principal": "*",
      "Action": "s3:*",
      "Resource": "arn:aws:s3:::examplebucket/*",
      "Condition": {
        "NotIpAddress": {"aws:SourceIp": "54.240.143.0/24"}
      }
    }
  ]
}

Select the correct description of this setting.
Options:
A. All actions from the specified IP address range can be performed on this S3 bucket
B. All actions can be performed on this S3 bucket from outside the specified IP address range
C. Access to this S3 bucket from the specified IP address range is denied
D. Access to this S3 bucket from outside of the specified IP address range is denied
Answer: D
Explanation
In this bucket policy, the statement denies all S3 actions on the example bucket for all principals:
"Effect": "Deny",
"Principal": "*",
"Action": "s3:*",
"Resource": "arn:aws:s3:::examplebucket/*",
The Condition block then restricts when the Deny applies: it uses the NotIpAddress condition with aws:SourceIp and specifies 54.240.143.0/24. Because NotIpAddress is used, the policy applies to requests coming from IP addresses other than 54.240.143.0/24.
"Condition": {
"NotIpAddress": {"aws:SourceIp": "54.240.143.0/24"}
}
Therefore, access to objects in this S3 bucket is denied if the request comes from an IP outside of the specified IP address range, so option D is the correct answer.

Question 12:
You have set up S3 for your data management application. This application makes several requests, including read / write and update, on objects in the S3 bucket.
If you update an object with the same key name, how will the updated object be reflected on? E.g will there be any error/discrepancies in the object upon inspection after update?
Options:
A. Since S3 uses an eventual consistency model, there may be differences when the data is reflected
B. Since S3 uses an eventual consistency model, there is no difference when the data is reflected
C. Since S3 uses a strong consistency model, there may be differences when the data is reflected
D. Since S3 uses a strong consistency model, there is no difference when the data is reflected
Answer: D
Explanation
Option D is the correct answer. S3 utilizes a strong consistency model, so there are no errors when the update is reflected. Before December 2020, S3 used an eventual consistency model: if an update was made to an object with the same key name as the original object, a read request immediately afterwards might not reflect the updated object. However, S3 now uses a strong consistency model, so these discrepancies no longer occur.
S3 adopted the strong consistency model for data registration, update, and deletion.
Options A and B are incorrect. S3 used to use an eventual consistency model, but it has since been improved to a strong consistency model.
Option C is incorrect. S3 now utilizes a strong consistency model, which eliminates the possibility of such discrepancies.

Question 13:
Your company uses a business application hosted on AWS to manage records related to daily business. According to industry regulations, recorded data must be retained for 5 years. Most of these archives are rarely accessed, but data must be provided within 24 hours in response to an audit request.
Which of the following storage should you choose as the most cost-effective storage?
Options:
A. Amazon Glacier (standard retrieval)
B. Amazon S3 Glacier Deep Archive
C. S3 Standard
D. S3 One Zone IA
E. S3 Standard IA
Answer: B
Explanation
Option B is the correct answer. In this scenario, the storage requirement is to store data cost-effectively over the medium to long term and retrieve it within 24 hours. The Glacier Deep Archive storage class is designed to offer durable, secure, high-volume data storage at the lowest prices on AWS. Data is stored across three or more AWS Availability Zones and can be retrieved within 12 hours.
Option A is incorrect. Glacier is cheap and suitable for long-term storage of data, but standard retrieval takes several hours; data can be retrieved in about 1 to 5 minutes by using expedited retrieval. However, the Glacier Deep Archive storage class is cheaper than Glacier.
Option C is incorrect. S3 Standard is the most costly data storage class in S3 and does not meet this requirement.
Option D is incorrect. S3 One Zone-IA saves money by storing infrequently accessed data in a single, less resilient Availability Zone. However, the Glacier Deep Archive storage class is cheaper than S3 One Zone-IA.
Option E is incorrect. Standard-IA is for infrequent access but can be read quickly when needed. However, the Glacier Deep Archive storage class is cheaper than Standard-IA.

Question 14:
As a Solutions Architect, you use AWS to build solutions for managing and storing corporate documents. Once the data is saved, it is rarely used, but it is required to be obtained within 10 hours according to the administrator’s instructions if necessary. You have decided to use Amazon Glacier and are considering how to set it up.
How should you set the data acquisition method for Glacier?
Options:
A. Expedited retrievals
B. Standard retrievals
C. Bulk retrievals
D. Vault lock
Answer: B
Explanation
Glacier's standard retrieval is the optimal setting given the requirement to retrieve the data within 10 hours according to the administrator's instructions as needed. With standard retrieval, you can access all archives within 3-5 hours. Therefore, option B is the correct answer.
Option A is incorrect. Glacier's expedited retrievals give you quick access to your data if you need a subset of your archives quickly. For all archives except the largest (250 MB and above), data accessed with expedited retrievals is typically available within 1-5 minutes. However, expedited retrievals are not cost-optimal or needed in this situation.
Option C is incorrect. Bulk retrieval is Glacier's cheapest retrieval option, which allows you to retrieve large amounts of data (including petabytes of data) within a day. Bulk retrievals typically take 5-12 hours, so data retrieval cannot be guaranteed within 10 hours.
Option D is incorrect. Glacier Vault Lock allows you to easily deploy and enforce compliance controls for each Glacier vault using vault lock policies. You specify a control such as write once read many (WORM) in the vault lock policy and lock the policy so that it cannot be edited in the future. This feature is irrelevant to this requirement.

Question 15:
Your company develops and operates an application that provides image data in the public domain. The image data is stored in S3, and the application temporarily displays it in response to the user’s request. This image should be protected so that it is only available to specific users.
What mechanism do I need to use to meet this requirement?
Options:
A. Distribute images with a time-limited pre-signed URL
B. Image distribution by CloudFront distribution
C. Protect your images with an encryption key
D. Limit users by switching to EFS image sharing
Answer: A
Explanation
If you create a pre-signed URL using credentials that have permission to the object, only a user who has the pre-signed URL can access the object. Using this function, the application can grant a specific user access to the target image for a limited time (a sketch of generating such a URL follows this explanation). Therefore, option A is the correct answer.
Option B is incorrect. It is not possible to limit the users to whom images are delivered using CloudFront delivery settings alone; you would need to use signed URLs or signed cookies in CloudFront.
Option C is incorrect. There is no setting that restricts images to a specific user simply by sharing an encryption key.
Option D is incorrect. EFS is storage that allows data sharing between instances, but it cannot be accessed by a third party over the Internet. Therefore, it is more appropriate to use S3 as the storage service for exposing data externally.
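A minimal sketch of generating the time-limited pre-signed URL discussed above, using boto3; the bucket, key, and expiry are placeholders.

```python
import boto3

s3 = boto3.client("s3")

# URL is valid for one hour; only holders of the URL can fetch the private image.
url = s3.generate_presigned_url(
    "get_object",
    Params={"Bucket": "image-bucket", "Key": "nebula.jpg"},
    ExpiresIn=3600,
)
print(url)
```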

Question 16:
Some companies store employee user profiles and access logs in S3. As this data is uploaded and modified on a daily basis, there is a concern that users may accidentally delete objects in their S3 bucket. Therefore, it is necessary to take preventive measures, but it should not affect the business.
Choose the best way to prevent accidental deletion of objects in your S3 bucket (Select two)
Options:
A. Enable the versioning feature on S3 bucket
B. Enable encryption in S3 bucket
C. Enable MFA authentication on S3 bucket
D. Set data deletion not possible for S3 bucket
E. Set deletion refusal by IAM role in S3 bucket
Answer: A & C
Explanation
By enabling MFA Delete on your S3 bucket, users will be required to perform MFA authentication every time they try to delete an object, which helps prevent deletion due to operational mistakes. Furthermore, you can restore deleted files by enabling versioning. Therefore, options A and C are correct.
Option B is incorrect. You can increase data protection by enabling encryption in your S3 bucket, but it does not prevent data loss.
Option D is incorrect. An S3 bucket can be configured so that objects cannot be deleted, but only at initial setup; you can't change that setting on a bucket that is already in use. In addition, there are cases where data deletion operations are required, so this is inappropriate in this case.
Option E is incorrect. Access permissions must be set for the IAM user, not via an IAM role on the bucket.
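For reference, a hedged sketch of enabling versioning and MFA Delete with boto3. The bucket name, MFA device ARN, and token are placeholders; enabling MFA Delete must be done by the bucket owner (root credentials) and requires passing the device serial and current code in the MFA parameter.

```python
import boto3

s3 = boto3.client("s3")

# Enable versioning so deleted or overwritten objects can be recovered.
s3.put_bucket_versioning(
    Bucket="profile-bucket",
    VersioningConfiguration={"Status": "Enabled"},
)

# Additionally require MFA for permanent deletes ("<serial> <code>" format).
s3.put_bucket_versioning(
    Bucket="profile-bucket",
    VersioningConfiguration={"Status": "Enabled", "MFADelete": "Enabled"},
    MFA="arn:aws:iam::123456789012:mfa/root-account-mfa-device 123456",
)
```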

Question 17:
A video production company is planning to move some of its workloads to the AWS Cloud. The company will require around 5 TB of storage for video processing with the maximum possible I/O performance. They also require over 400 TB of extremely durable storage for storing video files and 800 TB of storage for long-term archival.
Which combinations of services should a Solutions Architect use to meet these requirements?
Options:
A. Amazon EC2 instance store for maximum performance, Amazon EFS for durable data storage, and Amazon S3 for archival storage
B. Amazon EC2 instance store for maximum performance, Amazon S3 for durable data storage, and Amazon S3 Glacier for archival storage
C. Amazon EBS for maximum performance, Amazon EFS for durable data storage, and Amazon S3 Glacier for archival storage
D. Amazon EBS for maximum performance, Amazon S3 for durable data storage, and Amazon S3 Glacier for archival storage
Answer: B
Explanation
The best I/O performance can be achieved by using instance store volumes for the video processing. Instance store is safe to use here because the data can be recreated from the source files, so this is a good use case.
For storing data durably Amazon S3 is a good fit as it provides 99.999999999% of durability. For archival the video files can then be moved to Amazon S3 Glacier which is a low cost storage option that is ideal for long-term archival.
CORRECT: “Amazon EC2 instance store for maximum performance, Amazon S3 for durable data storage, and Amazon S3 Glacier for archival storage” is the correct answer.
INCORRECT: “Amazon EBS for maximum performance, Amazon S3 for durable data storage, and Amazon S3 Glacier for archival storage” is incorrect. EBS is not going to provide as much I/O performance as an instance store volume so is not the best choice for this use case.
INCORRECT: “Amazon EC2 instance store for maximum performance, Amazon EFS for durable data storage, and Amazon S3 for archival storage” is incorrect. EFS does not provide as much durability as Amazon S3 and will not be as cost-effective.
INCORRECT: “Amazon EBS for maximum performance, Amazon EFS for durable data storage, and Amazon S3 Glacier for archival storage” is incorrect. EBS and EFS are not the best choices here as described above.

Question 18:
A company has uploaded some highly critical data to an Amazon S3 bucket. Management are concerned about data availability and require that steps are taken to protect the data from accidental deletion. The data should still be accessible, and a user should be able to delete the data intentionally.
Which combination of steps should a solutions architect take to accomplish this? (Select TWO.)
Options:
A. Enable MFA delete on the S3 bucket
B. Create a bucket policy on the S3 bucket
C. Enable default encryption on the S3 bucket
D. Enable versioning on the S3 bucket
E. Create a lifecycle policy for the objects in the S3 bucket
Answer: A & D
Explanation
Multi-factor authentication (MFA) delete adds an additional step before an object can be deleted from a versioning-enabled bucket.
With MFA delete the bucket owner must include the x-amz-mfa request header in requests to permanently delete an object version or change the versioning state of the bucket.
CORRECT: “Enable versioning on the S3 bucket” is a correct answer.
CORRECT: “Enable MFA Delete on the S3 bucket” is also a correct answer.
INCORRECT: “Create a bucket policy on the S3 bucket” is incorrect. A bucket policy is not required to enable MFA delete.
INCORRECT: “Enable default encryption on the S3 bucket” is incorrect. Encryption does not protect against deletion.
INCORRECT: “Create a lifecycle policy for the objects in the S3 bucket” is incorrect. A lifecycle policy will move data to another storage class but does not protect against deletion.

Question 19:
A solutions architect is creating a document submission application for a school. The application will use an Amazon S3 bucket for storage. The solution must prevent accidental deletion of the documents and ensure that all versions of the documents are available. Users must be able to upload and modify the documents.
Which combination of actions should be taken to meet these requirements? (Select TWO.)
Options:
A. Enable MFA delete on the bucket
B. Encrypt the bucket using AWS SSE-S3
C. Set read-only permissions on the bucket
D. Attach an IAM policy to the bucket
E. Enable versioning on the bucket
Answer: A & E
Explanation
None of the options present a good solution for specifying the permissions required to write and modify objects, so that requirement needs to be taken care of separately. The other requirements are to prevent accidental deletion and to ensure that all versions of the documents are available.
The two solutions for these requirements are versioning and MFA delete. Versioning will retain a copy of each version of the document and multi-factor authentication delete (MFA delete) will prevent any accidental deletion as you need to supply a second factor when attempting a delete.
CORRECT: “Enable versioning on the bucket” is a correct answer.
CORRECT: “Enable MFA Delete on the bucket” is also a correct answer.
INCORRECT: “Set read-only permissions on the bucket” is incorrect as this will also prevent any writing to the bucket which is not desired.
INCORRECT: “Attach an IAM policy to the bucket” is incorrect as users need to modify documents which will also allow delete. Therefore, a method must be implemented to just control deletes.
INCORRECT: “Encrypt the bucket using AWS SSE-S3” is incorrect as encryption doesn’t stop you from deleting an object.

Question 20:
A team are planning to run analytics jobs on log files each day and require a storage solution. The size and number of logs is unknown and data will persist for 24 hours only.
What is the MOST cost-effective solution?
Options:
A. Amazon S3 One-Zone Infrequent Access (S3 One Zone-IA)
B. Amazon S3 Standard
C. Amazon S3 Glacier Deep Archive
D. Amazon S3 Intelligent Tiering
Answer: B
Explanation
S3 standard is the best choice in this scenario for a short term storage solution. In this case the size and number of logs is unknown and it would be difficult to fully assess the access patterns at this stage. Therefore, using S3 standard is best as it is cost-effective, provides immediate access, and there are no retrieval fees or minimum capacity charge per object.
CORRECT: “Amazon S3 Standard” is the correct answer.
INCORRECT: “Amazon S3 Intelligent-Tiering” is incorrect as there is an additional fee for using this service and for a short-term requirement it may not be beneficial.
INCORRECT: “Amazon S3 One Zone-Infrequent Access (S3 One Zone-IA)” is incorrect as this storage class has a minimum capacity charge per object (128 KB) and a per GB retrieval fee.
INCORRECT: “Amazon S3 Glacier Deep Archive” is incorrect as this storage class is used for archiving data. There are retrieval fees and it takes hours to retrieve data from an archive.

Question 21:
A solutions architect needs to backup some application log files from an online ecommerce store to Amazon S3. It is unknown how often the logs will be accessed or which logs will be accessed the most. The solutions architect must keep costs as low as possible by using the appropriate S3 storage class.
Which S3 storage class should be implemented to meet these requirements?
Options:
A. S3 Intelligent Tiering
B. S3 One Zone Infrequent Access (S3 One Zone-IA)
C. S3 Glacier
D. S3 Standard-Infrequent Access (S3 Standard-IA)
Answer: A
Explanation
The S3 Intelligent-Tiering storage class is designed to optimize costs by automatically moving data to the most cost-effective access tier, without performance impact or operational overhead.
It works by storing objects in two access tiers: one tier that is optimized for frequent access and another lower-cost tier that is optimized for infrequent access. This is an ideal use case for intelligent-tiering as the access patterns for the log files are not known.
CORRECT: “S3 Intelligent-Tiering” is the correct answer.
INCORRECT: “S3 Standard-Infrequent Access (S3 Standard-IA)” is incorrect as if the data is accessed often retrieval fees could become expensive.
INCORRECT: “S3 One Zone-Infrequent Access (S3 One Zone-IA)” is incorrect as if the data is accessed often retrieval fees could become expensive.
INCORRECT: “S3 Glacier” is incorrect as if the data is accessed often retrieval fees could become expensive. Glacier also requires more work in retrieving the data from the archive and quick access requirements can add further costs.

Question 22:
Which of the following features of an Amazon S3 bucket can only be suspended once they have been enabled?
Options:
A. Static Website Hosting
B. Versioning
C. Server Access Logging
D. Requester Pays
Answer: B
Explanation
Correct option:
Versioning
Once you version-enable a bucket, it can never return to an unversioned state. Versioning can only be suspended once it has been enabled.
Incorrect options:
Server Access Logging
Static Website Hosting
Requester Pays
Server Access Logging, Static Website Hosting and Requester Pays features can be disabled even after they have been enabled.

Question 23:
A healthcare startup needs to enforce compliance and regulatory guidelines for objects stored in Amazon S3. One of the key requirements is to provide adequate protection against accidental deletion of objects.
As a solutions architect, what are your recommendations to address these guidelines? (Select two)
Options:
A. Establish a process to get managerial approval for deleting S3 objects
B. Create an event trigger on deleting any S3 object. The event invokes an SNS notification via email to the IT manager
C. Enable versioning on the bucket
D. Change the configuration on AWS S3 console so that the user needs to provide additional confirmation while deleting any S3 object
E. Enable MFA delete on the bucket
Answer: C & E
Explanation
Correct options:
Enable versioning on the bucket – Versioning is a means of keeping multiple variants of an object in the same bucket. You can use versioning to preserve, retrieve, and restore every version of every object stored in your Amazon S3 bucket. Versioning-enabled buckets enable you to recover objects from accidental deletion or overwrite.
For example:
If you overwrite an object, it results in a new object version in the bucket. You can always restore the previous version. If you delete an object, instead of removing it permanently, Amazon S3 inserts a delete marker, which becomes the current object version. You can always restore the previous version. Hence, this is the correct option.
Enable MFA delete on the bucket – To provide additional protection, multi-factor authentication (MFA) delete can be enabled. MFA delete requires secondary authentication to take place before objects can be permanently deleted from an Amazon S3 bucket. Hence, this is the correct option.
Incorrect options:
Create an event trigger on deleting any S3 object. The event invokes an SNS notification via email to the IT manager – Sending an event trigger after object deletion does not meet the objective of preventing object deletion by mistake because the object has already been deleted. So, this option is incorrect.
Establish a process to get managerial approval for deleting S3 objects – This option for getting managerial approval is just a distractor.
Change the configuration on AWS S3 console so that the user needs to provide additional confirmation while deleting any S3 object – There is no provision to set up S3 configuration to ask for additional confirmation before deleting an object. This option is incorrect.

Question 24:
An audit department generates and accesses the audit reports only twice in a financial year. The department uses AWS Step Functions to orchestrate the report creating process that has failover and retry scenarios built into the solution. The underlying data to create these audit reports is stored on S3, runs into hundreds of Terabytes and should be available with millisecond latency.
As a solutions architect, which is the MOST cost-effective storage class that you would recommend to be used for this use-case?
Options:
A. Amazon S3 Standard-Infrequent Access (S3 Standard-IA)
B. Amazon S3 Glacier (S3 Glacier)
C. Amazon S3 Standard
D. Amazon S3 Intelligent-Tiering (S3 Intelligent-Tiering)
Answer: A
Explanation
Correct option:
Amazon S3 Standard-Infrequent Access (S3 Standard-IA)
Since the data is accessed only twice in a financial year but needs rapid access when required, the most cost-effective storage class for this use-case is S3 Standard-IA. S3 Standard-IA storage class is for data that is accessed less frequently but requires rapid access when needed. S3 Standard-IA matches the high durability, high throughput, and low latency of S3 Standard, with a low per GB storage price and per GB retrieval fee. Standard-IA is designed for 99.9% availability compared to 99.99% availability of S3 Standard. However, the report creation process has failover and retry scenarios built into the workflow, so in case the data is not available owing to the 99.9% availability of S3 Standard-IA, the job will be auto re-invoked till data is successfully retrieved. Therefore this is the correct option.
Incorrect options:
Amazon S3 Standard – S3 Standard offers high durability, availability, and performance object storage for frequently accessed data. As described above, S3 Standard-IA storage is a better fit than S3 Standard, hence using S3 standard is ruled out for the given use-case.
Amazon S3 Intelligent-Tiering (S3 Intelligent-Tiering) – The S3 Intelligent-Tiering storage class is designed to optimize costs by automatically moving data to the most cost-effective access tier, without performance impact or operational overhead. S3 Standard-IA matches the high durability, high throughput, and low latency of S3 Intelligent-Tiering, with a low per GB storage price and per GB retrieval fee. Moreover, Standard-IA has the same availability as that of S3 Intelligent-Tiering. So, it’s cost-efficient to use S3 Standard-IA instead of S3 Intelligent-Tiering.
Amazon S3 Glacier (S3 Glacier) – S3 Glacier on the other hand, is a secure, durable, and low-cost storage class for data archiving. S3 Glacier cannot support millisecond latency, so this option is ruled out.

Question 25:
The IT department at a consulting firm is conducting a training workshop for new developers. As part of an evaluation exercise on Amazon S3, the new developers were asked to identify the invalid storage class lifecycle transitions for objects stored on S3.
Can you spot the INVALID lifecycle transitions from the options below? (Select two)
Options:
A. S3 Intelligent-Tiering => S3 Standard
B. S3 One Zone-IA => S3 Standard-IA
C. S3 Standard => S3 Intelligent-Tiering
D. S3 Standard-IA => S3 Intelligent-Tiering
E. S3 Standard-IA => S3 One Zone-IA
Answer: A & B
Explanation
Correct options:
As the question wants to know about the INVALID lifecycle transitions, the following options are the correct answers –
S3 Intelligent-Tiering => S3 Standard
S3 One Zone-IA => S3 Standard-IA
Following are the unsupported life cycle transitions for S3 storage classes – Any storage class to the S3 Standard storage class. Any storage class to the Reduced Redundancy storage class. The S3 Intelligent-Tiering storage class to the S3 Standard-IA storage class. The S3 One Zone-IA storage class to the S3 Standard-IA or S3 Intelligent-Tiering storage classes.
Incorrect options:
S3 Standard => S3 Intelligent-Tiering
S3 Standard-IA => S3 Intelligent-Tiering
S3 Standard-IA => S3 One Zone-IA
Here are the supported life cycle transitions for S3 storage classes – The S3 Standard storage class to any other storage class. Any storage class to the S3 Glacier or S3 Glacier Deep Archive storage classes. The S3 Standard-IA storage class to the S3 Intelligent-Tiering or S3 One Zone-IA storage classes. The S3 Intelligent-Tiering storage class to the S3 One Zone-IA storage class. The S3 Glacier storage class to the S3 Glacier Deep Archive storage class.

Question 26:
A media agency stores its re-creatable assets on Amazon S3 buckets. The assets are accessed by a large number of users for the first few days and the frequency of access falls drastically after a week. Although the assets would be accessed only occasionally after the first week, they must continue to be immediately accessible when required. The cost of maintaining all the assets on S3 storage is turning out to be very expensive and the agency is looking at reducing costs as much as possible.
As a Solutions Architect, can you suggest a way to lower the storage costs while fulfilling the business requirements?
A. Configure a lifecycle policy to transition the objects to Amazon S3 Standard-Infrequent Access (S3 Standard-IA) after 7 days
B. Configure a lifecycle policy to transition the objects to Amazon S3 One Zone-Infrequent Access (S3 One Zone-IA) after 30 days
C. Configure a lifecycle policy to transition the objects to Amazon S3 One Zone-Infrequent Access (S3 One Zone-IA) after 7 days
D. Configure a lifecycle policy to transition the objects to Amazon S3 Standard-Infrequent Access (S3 Standard-IA) after 30 days
Answer: B
Explanation
Correct option:
Configure a lifecycle policy to transition the objects to Amazon S3 One Zone-Infrequent Access (S3 One Zone-IA) after 30 days – S3 One Zone-IA is for data that is accessed less frequently, but requires rapid access when needed. Unlike other S3 Storage Classes which store data in a minimum of three Availability Zones (AZs), S3 One Zone-IA stores data in a single AZ and costs 20% less than S3 Standard-IA. S3 One Zone-IA is ideal for customers who want a lower-cost option for infrequently accessed and re-creatable data but do not require the availability and resilience of S3 Standard or S3 Standard-IA. The minimum storage duration is 30 days before you can transition objects from S3 Standard to S3 One Zone-IA.
S3 One Zone-IA offers the same high durability, high throughput, and low latency of S3 Standard, with a low per GB storage price and per GB retrieval fee. S3 Storage Classes can be configured at the object level, and a single bucket can contain objects stored across S3 Standard, S3 Intelligent-Tiering, S3 Standard-IA, and S3 One Zone-IA. You can also use S3 Lifecycle policies to automatically transition objects between storage classes without any application changes.
Incorrect options:
Configure a lifecycle policy to transition the objects to Amazon S3 Standard-Infrequent Access (S3 Standard-IA) after 7 days
Configure a lifecycle policy to transition the objects to Amazon S3 One Zone-Infrequent Access (S3 One Zone-IA) after 7 days
As mentioned earlier, the minimum storage duration is 30 days before you can transition objects from S3 Standard to S3 One Zone-IA or S3 Standard-IA, so both these options are added as distractors.
Configure a lifecycle policy to transition the objects to Amazon S3 Standard-Infrequent Access (S3 Standard-IA) after 30 days – S3 Standard-IA is for data that is accessed less frequently, but requires rapid access when needed. S3 Standard-IA offers the high durability, high throughput, and low latency of S3 Standard, with a low per GB storage price and per GB retrieval fee. This combination of low cost and high performance makes S3 Standard-IA ideal for long-term storage, backups, and as a data store for disaster recovery files. But, it costs more than S3 One Zone-IA because of the redundant storage across availability zones. As the data is re-creatable, so you don’t need to incur this additional cost.

Question 27:
A file-hosting service uses Amazon S3 under the hood to power its storage offerings. Currently all the customer files are uploaded directly under a single S3 bucket. The engineering team has started seeing scalability issues where customer file uploads have started failing during the peak access hours with more than 5000 requests per second.
Which of the following is the MOST resource efficient and cost-optimal way of addressing this issue?
A. Change the application architecture to create a new S3 bucket for each customer and then upload each customer’s files directly under the respective buckets
B. Change the application architecture to create customer-specific custom prefixes within the single bucket and then upload the daily files into those prefixed locations
C. Change the application architecture to create a new S3 bucket for each day’s data and then upload the daily files directly under that day’s bucket
D. Change the application architecture to use EFS instead of Amazon S3 for storing the customers’ uploaded files
Answer: B
Explanation
Correct option:
Change the application architecture to create customer-specific custom prefixes within the single bucket and then upload the daily files into those prefixed locations
Amazon Simple Storage Service (Amazon S3) is an object storage service that offers industry-leading scalability, data availability, security, and performance. Your applications can easily achieve thousands of transactions per second in request performance when uploading and retrieving storage from Amazon S3. Amazon S3 automatically scales to high request rates. For example, your application can achieve at least 3,500 PUT/COPY/POST/DELETE or 5,500 GET/HEAD requests per second per prefix in a bucket.
There are no limits to the number of prefixes in a bucket. You can increase your read or write performance by parallelizing reads. For example, if you create 10 prefixes in an Amazon S3 bucket to parallelize reads, you could scale your read performance to 55,000 read requests per second. Please see this example for more clarity on prefixes: if you have a file f1 stored in an S3 object path like so s3://your_bucket_name/folder1/sub_folder_1/f1, then /folder1/sub_folder_1/ becomes the prefix for file f1.
Some data lake applications on Amazon S3 scan millions or billions of objects for queries that run over petabytes of data. These data lake applications achieve single-instance transfer rates that maximize the network interface used for their Amazon EC2 instance, which can be up to 100 Gb/s on a single instance. These applications then aggregate throughput across multiple instances to get multiple terabits per second. Therefore creating customer-specific custom prefixes within the single bucket and then uploading the daily files into those prefixed locations is the BEST solution for the given constraints.
Incorrect options:
Change the application architecture to create a new S3 bucket for each customer and then upload each customer’s files directly under the respective buckets – Creating a new S3 bucket for each new customer is an inefficient way of handling resource availability (S3 buckets need to be globally unique) as some customers may use the service sparingly but the bucket name is locked for them forever. Moreover, this is really not required as we can use S3 prefixes to improve the performance.
Change the application architecture to create a new S3 bucket for each day’s data and then upload the daily files directly under that day’s bucket – Creating a new S3 bucket for each new day’s data is also an inefficient way of handling resource availability (S3 buckets need to be globally unique) as some of the bucket names may not be available for daily data processing. Moreover, this is really not required as we can use S3 prefixes to improve the performance.
Change the application architecture to use EFS instead of Amazon S3 for storing the customers’ uploaded files – EFS is a costlier storage option compared to S3, so it is ruled out.
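A small sketch of the recommended customer-specific prefix scheme with boto3; the bucket name, prefix layout, and date are illustrative assumptions.

```python
import boto3

s3 = boto3.client("s3")

# Keep one bucket and spread load across per-customer prefixes, since S3 scales
# request rates per prefix (e.g. at least 3,500 PUTs per second per prefix).
def upload_daily_file(customer_id: str, local_path: str, file_name: str) -> None:
    key = f"customers/{customer_id}/2024-01-15/{file_name}"  # date is just an example
    s3.upload_file(local_path, "file-hosting-bucket", key)

upload_daily_file("cust-001", "/tmp/report.pdf", "report.pdf")
```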

Question 28:
A leading video streaming service delivers billions of hours of content from Amazon S3 to customers around the world. Amazon S3 also serves as the data lake for its big data analytics solution. The data lake has a staging zone where intermediary query results are kept only for 24 hours. These results are also heavily referenced by other parts of the analytics pipeline.
Which of the following is the MOST cost-effective strategy for storing this intermediary query data?
A. Store the intermediary query results in S3 Intelligent-Tiering storage class
B. Store the intermediary query results in S3 Standard-Infrequent Access storage class
C. Store the intermediary query results in S3 One Zone-Infrequent Access storage class
D. Store the intermediary query results in S3 Standard storage class
Answer: D
Explanation
Correct option:
Store the intermediary query results in S3 Standard storage class
S3 Standard offers high durability, availability, and performance object storage for frequently accessed data. Because it delivers low latency and high throughput, S3 Standard is appropriate for a wide variety of use cases, including cloud applications, dynamic websites, content distribution, mobile and gaming applications, and big data analytics. As there is no minimum storage duration charge and no retrieval fee (remember that intermediary query results are heavily referenced by other parts of the analytics pipeline), this is the MOST cost-effective storage class amongst the given options.
Incorrect options:
Store the intermediary query results in S3 Intelligent-Tiering storage class – The S3 Intelligent-Tiering storage class is designed to optimize costs by automatically moving data to the most cost-effective access tier, without performance impact or operational overhead. It works by storing objects in two access tiers: one tier that is optimized for frequent access and another lower-cost tier that is optimized for infrequent access. The minimum storage duration charge is 30 days, so this option is NOT cost-effective because intermediary query results need to be kept only for 24 hours. Hence this option is not correct.
Store the intermediary query results in S3 Standard-Infrequent Access storage class – S3 Standard-IA is for data that is accessed less frequently but requires rapid access when needed. S3 Standard-IA offers high durability, high throughput, and low latency of S3 Standard, with a low per GB storage price and per GB retrieval fee. This combination of low cost and high performance makes S3 Standard-IA ideal for long-term storage, backups, and as a data store for disaster recovery files. The minimum storage duration charge is 30 days, so this option is NOT cost-effective because intermediary query results need to be kept only for 24 hours. Hence this option is not correct.
Store the intermediary query results in S3 One Zone-Infrequent Access storage class – S3 One Zone-IA is for data that is accessed less frequently but requires rapid access when needed. Unlike other S3 Storage Classes which store data in a minimum of three Availability Zones (AZs), S3 One Zone-IA stores data in a single AZ and costs 20% less than S3 Standard-IA. The minimum storage duration charge is 30 days, so this option is NOT cost-effective because intermediary query results need to be kept only for 24 hours. Hence this option is not correct.
To summarize again, S3 Intelligent-Tiering, S3 Standard-IA, and S3 One Zone-IA have a minimum storage duration charge of 30 days (so instead of 24 hours, you end up paying for 30 days). S3 Standard-IA and S3 One Zone-IA also have retrieval charges (as the results are heavily referenced by other parts of the analytics pipeline, so the retrieval costs would be pretty high). Therefore, these 3 storage classes are not cost optimal for the given use-case.
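As a side note, since lifecycle expiration is specified in whole days, the 24-hour retention of the staging zone could be automated with a one-day expiration rule. A boto3 sketch, assuming (hypothetically) that the intermediary results live under a staging/ prefix in a bucket named analytics-data-lake:

    import boto3

    s3 = boto3.client("s3")

    # Expire objects under the (hypothetical) staging/ prefix one day after creation,
    # so S3 Standard storage cost stops accruing once the results are no longer needed.
    s3.put_bucket_lifecycle_configuration(
        Bucket="analytics-data-lake",      # hypothetical bucket name
        LifecycleConfiguration={
            "Rules": [
                {
                    "ID": "expire-staging-results",
                    "Filter": {"Prefix": "staging/"},
                    "Status": "Enabled",
                    "Expiration": {"Days": 1},
                }
            ]
        },
    )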

Question 29:
A social photo-sharing company uses Amazon S3 to store the images uploaded by its users. These images are kept encrypted in S3 using AWS KMS, and the company manages its own Customer Master Key (CMK) for encryption. A member of the DevOps team accidentally deleted the CMK a day ago, thereby rendering the users’ photo data unrecoverable. The company has contacted you to consult on possible solutions to this crisis.
As a solutions architect, which of the following steps would you recommend to solve this issue?
Options:
A. Contact AWS support to retrieve the CMK from their backup
B. The CMK can be recovered by the AWS root account user
C. The company should issue a notification on its web application informing the users about the loss of their data
D. As the CMK was deleted a day ago, it must be in the ‘pending deletion’ status and hence you can just cancel the CMK deletion and recover the key
Answer: D
Explanation
Correct option:
As the CMK was deleted a day ago, it must be in the ‘pending deletion’ status and hence you can just cancel the CMK deletion and recover the key
AWS Key Management Service (KMS) makes it easy for you to create and manage cryptographic keys and control their use across a wide range of AWS services and in your applications. AWS KMS is a secure and resilient service that uses hardware security modules that have been validated under FIPS 140-2.
Deleting a customer master key (CMK) in AWS Key Management Service (AWS KMS) is destructive and potentially dangerous. Therefore, AWS KMS enforces a waiting period. To delete a CMK in AWS KMS, you schedule key deletion. You can set the waiting period from a minimum of 7 days up to a maximum of 30 days; the default waiting period is 30 days. During the waiting period, the CMK’s key state is Pending deletion. To recover the CMK, you can cancel key deletion before the waiting period ends. After the waiting period ends, you cannot cancel key deletion, and AWS KMS deletes the CMK.
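A minimal boto3 sketch of the recovery (the key ID below is a placeholder); note that a key whose deletion is cancelled comes back in the Disabled state and must be re-enabled before use:

    import boto3

    kms = boto3.client("kms")
    key_id = "1234abcd-12ab-34cd-56ef-1234567890ab"   # placeholder key ID

    # The key state should currently be "PendingDeletion"
    print(kms.describe_key(KeyId=key_id)["KeyMetadata"]["KeyState"])

    kms.cancel_key_deletion(KeyId=key_id)   # stop the scheduled deletion
    kms.enable_key(KeyId=key_id)            # cancelled keys are Disabled until re-enabled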
Incorrect options:
Contact AWS support to retrieve the CMK from their backup
The CMK can be recovered by the AWS root account user
The AWS root account user cannot recover the CMK, and AWS support does not have access to the CMK via any backups. Both of these options just serve as distractors.
The company should issue a notification on its web application informing the users about the loss of their data – This option is not required as the data can be recovered via the cancel key deletion feature.

Question 30:
A company uses Amazon S3 buckets for storing sensitive customer data. The company has defined different retention periods for different objects present in the Amazon S3 buckets, based on compliance requirements. However, the retention rules do not seem to work as expected.
Which of the following options represent a valid configuration for setting up retention periods for objects in Amazon S3 buckets? (Select two)
Options:
A. When you apply a retention period to an object version explicitly, you specify a Retain Until Date for the object version
B. You cannot place a retention period on an object version through a bucket default setting
C. When you use bucket default settings, you specify a Retain Until Date for the object version
D. Different versions of a single object can have different retention modes and periods
E. The bucket default settings will override any explicit retention mode or period you request on an object version
Answer: A & D
Explanation
Correct options:
When you apply a retention period to an object version explicitly, you specify a Retain Until Date for the object version – You can place a retention period on an object version either explicitly or through a bucket default setting. When you apply a retention period to an object version explicitly, you specify a Retain Until Date for the object version. Amazon S3 stores the Retain Until Date setting in the object version’s metadata and protects the object version until the retention period expires.
Different versions of a single object can have different retention modes and periods – Like all other Object Lock settings, retention periods apply to individual object versions. Different versions of a single object can have different retention modes and periods.
For example, suppose that you have an object that is 15 days into a 30-day retention period, and you PUT an object into Amazon S3 with the same name and a 60-day retention period. In this case, your PUT succeeds, and Amazon S3 creates a new version of the object with a 60-day retention period. The older version maintains its original retention period and becomes deletable in 15 days.
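A boto3 sketch of applying an explicit retention period at upload time (the bucket and key are hypothetical, and the bucket is assumed to already have Object Lock enabled):

    from datetime import datetime, timezone
    import boto3

    s3 = boto3.client("s3")

    # Explicit retention: this object version cannot be deleted or overwritten
    # until the Retain Until Date passes.
    s3.put_object(
        Bucket="compliance-records",                 # hypothetical Object Lock-enabled bucket
        Key="customers/record-001.json",             # hypothetical key
        Body=b'{"example": true}',
        ObjectLockMode="COMPLIANCE",
        ObjectLockRetainUntilDate=datetime(2026, 1, 1, tzinfo=timezone.utc),
    )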
Incorrect options:
You cannot place a retention period on an object version through a bucket default setting – You can place a retention period on an object version either explicitly or through a bucket default setting.
When you use bucket default settings, you specify a Retain Until Date for the object version – When you use bucket default settings, you don’t specify a Retain Until Date. Instead, you specify a duration, in either days or years, for which every object version placed in the bucket should be protected.
The bucket default settings will override any explicit retention mode or period you request on an object version – If your request to place an object version in a bucket contains an explicit retention mode and period, those settings override any bucket default settings for that object version.

Question 31:
A data analytics company measures what consumers watch and what advertising they’re exposed to. This real-time data is ingested into its on-premises data center; subsequently, the daily data feed is compressed into a single file and uploaded to Amazon S3 for backup. The typical compressed file size is around 2 GB.
Which of the following is the fastest way to upload the daily compressed file into S3?
Options:
A. Upload the compressed file using multipart upload with S3 transfer acceleration
B. Upload the compressed file in a single operation
C. Upload the compressed file using multipart upload
D. FTP the compressed file into an EC2 instance that runs in the same region as the S3 bucket. Then transfer the file from the EC2 instance into the S3 bucket
Answer: A
Explanation
Correct option:
Upload the compressed file using multipart upload with S3 transfer acceleration
Amazon S3 Transfer Acceleration enables fast, easy, and secure transfers of files over long distances between your client and an S3 bucket. Transfer Acceleration takes advantage of Amazon CloudFront’s globally distributed edge locations. As the data arrives at an edge location, data is routed to Amazon S3 over an optimized network path.
Multipart upload allows you to upload a single object as a set of parts. Each part is a contiguous portion of the object’s data. You can upload these object parts independently and in any order. If transmission of any part fails, you can retransmit that part without affecting other parts. After all parts of your object are uploaded, Amazon S3 assembles these parts and creates the object. If you’re uploading large objects over a stable high-bandwidth network, use multipart uploading to maximize the use of your available bandwidth by uploading object parts in parallel for multi-threaded performance. If you’re uploading over a spotty network, use multipart uploading to increase resiliency to network errors by avoiding upload restarts.
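A boto3 sketch combining the two (the bucket and file names are placeholders, and the bucket is assumed to already have Transfer Acceleration enabled):

    import boto3
    from botocore.config import Config
    from boto3.s3.transfer import TransferConfig

    # Route requests through the S3 accelerate endpoint (CloudFront edge locations)
    s3 = boto3.client("s3", config=Config(s3={"use_accelerate_endpoint": True}))

    # Split the ~2 GB file into parts and upload them in parallel
    transfer_config = TransferConfig(
        multipart_threshold=100 * 1024 * 1024,   # use multipart above 100 MB
        multipart_chunksize=64 * 1024 * 1024,    # 64 MB parts
        max_concurrency=10,
    )

    s3.upload_file(
        Filename="daily-feed.gz",                # hypothetical local file
        Bucket="onprem-backup-bucket",           # hypothetical accelerated bucket
        Key="backups/daily-feed.gz",
        Config=transfer_config,
    )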
Incorrect options:
Upload the compressed file in a single operation – In general, when your object size reaches 100 MB, you should consider using multipart uploads instead of uploading the object in a single operation. Multipart upload provides improved throughput – you can upload parts in parallel to improve throughput. Therefore, this option is not correct.
Upload the compressed file using multipart upload – Although using multipart upload would certainly speed up the process, combining with S3 transfer acceleration would further improve the transfer speed. Therefore just using multipart upload is not the correct option.
FTP the compressed file into an EC2 instance that runs in the same region as the S3 bucket. Then transfer the file from the EC2 instance into the S3 bucket – This is a roundabout process of getting the file into S3 and is added as a distractor. Although it is technically feasible to follow this process, it would involve a lot of scripting and certainly would not be the fastest way to get the file into S3.

Question 32:
A technology blogger wants to write a review on comparative pricing for various storage types available on AWS Cloud. The blogger has created a test file of size 1 GB with some random data. Next, he copies this test file into the S3 Standard storage class, provisions a General Purpose SSD (gp2) EBS volume with 100 GB of provisioned storage and copies the test file onto it, and lastly copies the test file into an EFS Standard storage file system. At the end of the month, he analyses the bill for the costs incurred on the respective storage types for the test file.
What is the correct order of the storage charges incurred for the test file on these three storage types?
Options:
A. Cost of test file storage on S3 Standard < Cost of test file storage on EBS < Cost of test file storage on EFS
B. Cost of test file storage on S3 Standard < Cost of test file storage on EFS < Cost of test file storage on EBS
C. Cost of test file storage on EFS < Cost of test file storage on S3 Standard < Cost of test file storage on EBS
D. Cost of test file storage on EBS < Cost of test file storage on S3 Standard < Cost of test file storage on EFS
Answer: B
Explanation
Correct option:
Cost of test file storage on S3 Standard < Cost of test file storage on EFS < Cost of test file storage on EBS
With Amazon EFS, you pay only for the resources that you use. The EFS Standard Storage pricing is $0.30 per GB per month. Therefore the cost for storing the test file on EFS is $0.30 for the month.
For EBS General Purpose SSD (gp2) volumes, the charges are $0.10 per GB-month of provisioned storage. Therefore, for a provisioned storage of 100GB for this use-case, the monthly cost on EBS is $0.10*100 = $10. This cost is irrespective of how much storage is actually consumed by the test file.
For S3 Standard storage, the pricing is $0.023 per GB per month. Therefore, the monthly storage cost on S3 for the test file is $0.023.
Therefore this is the correct option.
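The same arithmetic as a quick Python check (using the per-GB-month prices quoted above; actual prices vary by region):

    file_size_gb = 1
    provisioned_ebs_gb = 100

    s3_standard_cost = 0.023 * file_size_gb     # billed only on data actually stored
    efs_standard_cost = 0.30 * file_size_gb     # billed only on data actually stored
    ebs_gp2_cost = 0.10 * provisioned_ebs_gb    # billed on provisioned size, used or not

    print(s3_standard_cost, efs_standard_cost, ebs_gp2_cost)   # 0.023 < 0.3 < 10.0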
Incorrect options:
Cost of test file storage on S3 Standard < Cost of test file storage on EBS < Cost of test file storage on EFS
Cost of test file storage on EFS < Cost of test file storage on S3 Standard < Cost of test file storage on EBS
Cost of test file storage on EBS < Cost of test file storage on S3 Standard < Cost of test file storage on EFS
Following the computations shown earlier in the explanation, these three options are incorrect.

Question 33:
An IT company provides S3 bucket access to specific users within the same account for completing project-specific work. With changing business requirements, cross-account S3 access requests are also growing every month. The company is looking for a solution that can offer user-level as well as account-level access permissions for the data stored in its S3 buckets.
As a Solutions Architect, which of the following would you suggest as the MOST optimized way of controlling access for this use-case?
A. Use Security Groups
B. Use Amazon S3 Bucket Policies
C. Use Identity and Access Management (IAM) policies
D. Use Access Control Lists (ACLs)
Answer: B
Explanation
Correct option:
Use Amazon S3 Bucket Policies
Bucket policies in Amazon S3 can be used to add or deny permissions across some or all of the objects within a single bucket. A bucket policy is attached to the bucket itself (not to individual users or groups), enabling centralized management of permissions for everything stored in that bucket. With bucket policies, you can grant users within your AWS Account or other AWS Accounts access to your Amazon S3 resources.
You can further restrict access to specific resources based on certain conditions. For example, you can restrict access based on request time (Date Condition), whether the request was sent using SSL (Boolean Conditions), a requester’s IP address (IP Address Condition), or based on the requester’s client application (String Conditions). To identify these conditions, you use policy keys.
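An illustrative bucket policy applied with boto3 (the account IDs, user name, and bucket name are made up): a single statement grants read access both to a specific user in the bucket owner’s account (user-level) and to another AWS account (account-level):

    import json
    import boto3

    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "UserAndCrossAccountRead",
                "Effect": "Allow",
                "Principal": {
                    "AWS": [
                        "arn:aws:iam::111122223333:user/project-analyst",  # user-level (same account)
                        "arn:aws:iam::444455556666:root",                  # account-level (cross-account)
                    ]
                },
                "Action": ["s3:GetObject", "s3:ListBucket"],
                "Resource": [
                    "arn:aws:s3:::example-project-bucket",
                    "arn:aws:s3:::example-project-bucket/*",
                ],
            }
        ],
    }

    boto3.client("s3").put_bucket_policy(
        Bucket="example-project-bucket",
        Policy=json.dumps(policy),
    )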
Incorrect options:
Use Identity and Access Management (IAM) policies – AWS IAM enables organizations with many employees to create and manage multiple users under a single AWS account. IAM policies are attached to the users, enabling centralized control of permissions for users under your AWS Account to access buckets or objects. With IAM policies, you can only grant users within your own AWS account permission to access your Amazon S3 resources. So, this is not the right choice for the current requirement.
Use Access Control Lists (ACLs) – Within Amazon S3, you can use ACLs to give read or write access on buckets or objects to groups of users. With ACLs, you can only grant other AWS accounts (not specific users) access to your Amazon S3 resources. So, this is not the right choice for the current requirement.
Use Security Groups – A security group acts as a virtual firewall for EC2 instances to control incoming and outgoing traffic. S3 does not support security groups; this option just acts as a distractor.