36. EFS

EFS = Elastic File System
One of the storage options for EC2 instances. EFS is a ‘Managed Network File System’ that can be mounted on hundreds of EC2 instances at a time. An EBS volume is normally attached to only one EC2 instance, whereas an EFS file system can be mounted by hundreds of EC2 instances. EFS works only with Linux EC2 instances and works across multiple AZs. EFS is highly available and scalable, costs more than EBS (roughly 3x gp2), is billed per use, and needs no capacity planning.

Amazon EFS is a regional service storing data within and across multiple Availability Zones (AZs) for high availability and durability. Amazon EC2 instances can access your file system across AZs, regions, and VPCs, while on-premises servers can access it using AWS Direct Connect or AWS VPN.

AZ >> Security Group >> EFS

EBS vs EFS
EBS: AZ >> EC2 Instance >> EBS Volume
EBS volumes are bound to a specific AZ. To move an EBS volume from one AZ to another, create a snapshot and restore it in the target AZ.
EFS: with many EC2 instances in AZ1 and many instances in AZ2, mount targets let all of the instances mount the same file system at the same time.
i. File storage service for EC2 instances.
ii. EFS is easy to use and provides a simple interface for creating and configuring file systems quickly.
iii. EFS storage capacity is elastic. It grows and shrinks automatically when we add or remove files, so the applications have the storage they need, when they need it.
iv. EFS supports NFS v4 protocol
v. With EFS we only pay for the storage we use (no pre-provisioning required)
vi. Can scale up to petabytes
vii. Can support thousands of concurrent NFS connections
viii. Data is stored across multiple AZs within a region
ix. Read after write consistency
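
To make the NFS v4 point concrete, here is a minimal Python sketch that only builds (and does not run) the mount command an instance would use with the plain NFS client. The file system ID `fs-12345678`, the region and the mount point `/mnt/efs` are made-up values for illustration; the option string follows the NFS options commonly recommended for EFS, so check the current EFS documentation before relying on it.

```python
import shlex

# NFS client options commonly recommended for EFS (assumption: verify against current docs).
NFS_OPTIONS = "nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport"


def efs_dns_name(file_system_id: str, region: str) -> str:
    """Return the regional DNS name of an EFS file system."""
    return f"{file_system_id}.efs.{region}.amazonaws.com"


def efs_mount_command(file_system_id: str, region: str, mount_point: str) -> str:
    """Build the shell command for an NFSv4.1 mount; nothing is executed here."""
    source = f"{efs_dns_name(file_system_id, region)}:/"
    args = ["sudo", "mount", "-t", "nfs4", "-o", NFS_OPTIONS, source, mount_point]
    return " ".join(shlex.quote(arg) for arg in args)


if __name__ == "__main__":
    # Hypothetical file system ID and mount point.
    print(efs_mount_command("fs-12345678", "us-east-1", "/mnt/efs"))
```

Every instance that runs the same command against the same file system sees the same files, which is the main difference from an EBS volume.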

Question 1:
The sourcing team at the US headquarters of a global e-commerce company is preparing a spreadsheet of the new product catalog. The spreadsheet is saved on an EFS file system created in us-east-1 region. The sourcing team counterparts from other AWS regions such as Asia Pacific and Europe also want to collaborate on this spreadsheet. As a solutions architect, what is your recommendation to enable this collaboration with the LEAST amount of operational overhead?
Answer: The spreadsheet on the EFS file system can be accessed from EC2 instances running in other AWS regions by using an inter-region VPC peering connection. Copying the spreadsheet into S3 or an RDS database is not the correct solution, as it involves a lot of operational overhead. For RDS, one would need to write custom code to replicate the spreadsheet functionality. S3 does not allow in-place edits of an object and is not POSIX compliant, so one would need to develop a custom application to “simulate in-place edits” to support the collaboration this use case requires. Creating copies of the spreadsheet in EFS file systems in other AWS regions would make collaboration impossible: each team would work on “its own file” instead of a single file accessed and updated by all teams.

Question 2:
A company has moved its business critical data to Amazon EFS file system which will be accessed by multiple EC2 instances. As an AWS Certified Solutions Architect Associate, which of the following would you recommend to exercise access control such that only the permitted EC2 instances can read from the EFS file system? (Select three)
Answer: a. Use VPC security groups to control the network traffic to and from your file system
b. Use EFS access points to manage application access
c. Attach an IAM policy to your file system to control clients who can mount your file system with the required permissions.
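
As an illustration of option c, the sketch below builds a file system policy document as a Python dict. It grants only `elasticfilesystem:ClientMount` (read-only mount) to one role and leaves out `elasticfilesystem:ClientWrite`. The account ID, role name and file system ID are hypothetical, and the policy is one possible shape under these assumptions, not the only valid one.

```python
import json


def read_only_file_system_policy(file_system_arn: str, role_arn: str) -> dict:
    """Build an EFS file system policy that lets one role mount read-only over TLS."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowReadOnlyMountForAppRole",
                "Effect": "Allow",
                "Principal": {"AWS": role_arn},
                # ClientMount alone gives read-only access; ClientWrite is deliberately absent.
                "Action": ["elasticfilesystem:ClientMount"],
                "Resource": file_system_arn,
                "Condition": {"Bool": {"aws:SecureTransport": "true"}},
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical ARNs for illustration only.
    policy = read_only_file_system_policy(
        "arn:aws:elasticfilesystem:us-east-1:111122223333:file-system/fs-12345678",
        "arn:aws:iam::111122223333:role/ReportReaderRole",
    )
    print(json.dumps(policy, indent=2))
```

Security groups (option a) still decide which instances can reach the mount targets on the network; the policy only decides what an authorized client may do once connected.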

Question 3:
A company is building file storage on AWS. The storage must support data transfer over the NFSv4 protocol.
Choose the storage type that meets this requirement.
Options:
A. Amazon FSx
B. EBS
C. EFS
D. S3 Standard
Answer: C
Explanation
Amazon EFS provides a file permission model based on the NFSv4 protocol, file locking, and a hierarchical directory structure, enabling secure access from thousands of EC2 instances and on-premises servers. Therefore, option C is the correct answer.
Option A is incorrect. Amazon FSx here refers to FSx for Windows File Server, an NTFS file system that is accessible from up to thousands of compute instances over the SMB protocol.
Option B is incorrect. Amazon Elastic Block Store (EBS) is block storage and does not use the NFSv4 protocol.
Option D is incorrect. S3 is object storage accessed through an API. Files can be transferred to it over Secure File Transfer Protocol (SFTP) using AWS Transfer Family, but it is not mounted over NFSv4.

Question 4:
Your company operates a set of EC2 instances hosted on AWS. These are all Linux-based instances and require access to shared data via a standard file interface. Since it is used by multiple instances, the storage where the data is stored requires strong integrity and file locking. So, as a Solutions Architect, you are looking for the best storage option.
Choose the best storage option that meets this requirement.
Options:
A. EFS
B. S3
C. EBS
D. Glacier
Answer: A
Explanation
Option A is the correct answer. EFS allows multiple EC2 instances to access the file system and share data at the same time. EFS provides a file system interface and file system access semantics (such as strong consistency and file locks) that allow simultaneous access from up to thousands of Amazon EC2 instances.
Option B is incorrect. S3 is an object storage service whose data can be accessed from anywhere through an Internet API. It can be used from multiple instances, but it cannot meet all requirements, such as file locks.
Option C is incorrect. Amazon EBS is a block-level storage service dedicated to Amazon EC2. Apart from the limited Multi-Attach case, an EBS volume cannot be shared between EC2 instances, so it does not meet the requirements.
Option D is incorrect. Glacier is meant for medium- to long-term archival storage and cannot be used for frequently accessed data.
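
The file locking mentioned above can be sketched with Python's standard `fcntl` module, which takes POSIX advisory locks of the kind NFSv4 clients can use. This is only an illustration under assumptions: a temporary directory stands in for the EFS mount point, and the file name `catalog.log` is made up. On a real setup the path would sit under the mounted file system.

```python
import fcntl
import os
import tempfile


def append_with_lock(path: str, line: str) -> None:
    """Append one line while holding an exclusive advisory lock on the file."""
    with open(path, "a") as f:
        fcntl.lockf(f, fcntl.LOCK_EX)  # blocks while another process holds the lock
        try:
            f.write(line + "\n")
            f.flush()
            os.fsync(f.fileno())  # push the data out before releasing the lock
        finally:
            fcntl.lockf(f, fcntl.LOCK_UN)


if __name__ == "__main__":
    # A temporary directory stands in for the shared mount point.
    with tempfile.TemporaryDirectory() as shared_dir:
        log_path = os.path.join(shared_dir, "catalog.log")
        append_with_lock(log_path, "instance-a wrote this")
        append_with_lock(log_path, "instance-b wrote this")
        with open(log_path) as f:
            print(f.read(), end="")
```

S3 has no equivalent of this call, which is why it fails the locking requirement even though many instances can read from it.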

Question 5:
A company is deploying a fleet of Amazon EC2 instances running Linux across multiple Availability Zones within an AWS Region. The application requires a data storage solution that can be accessed by all of the EC2 instances simultaneously. The solution must be highly scalable and easy to implement. The storage must be mounted using the NFS protocol.
Which solution meets these requirements?
Options:
A. Create an Amazon RDS database and store the data in a BLOB format. Point the application instances to the RDS endpoint
B. Create an Amazon EFS file system with mount targets in each Availability Zone. Configure the application instances to mount the file system
C. Create an Amazon EBS volume and use EBS Multi-Attach to mount the volume to all EC2 instances across each Availability Zone
D. Create an Amazon S3 bucket and create an S3 gateway endpoint to allow access to the file system using the NFS protocol
Answer: B
Explanation
Amazon EFS provides scalable file storage for use with Amazon EC2. You can use an EFS file system as a common data source for workloads and applications running on multiple instances. The EC2 instances can run in multiple AZs within a Region and the NFS protocol is used to mount the file system.
With EFS you can create mount targets in each AZ for lower latency. The application instances in each AZ will mount the file system using the local mount target.
CORRECT: “Create an Amazon EFS file system with mount targets in each Availability Zone. Configure the application instances to mount the file system” is the correct answer.
INCORRECT: “Create an Amazon S3 bucket and create an S3 gateway endpoint to allow access to the file system using the NFS protocol” is incorrect. You cannot use NFS with S3 or with gateway endpoints.
INCORRECT: “Create an Amazon EBS volume and use EBS Multi-Attach to mount the volume to all EC2 instances across each Availability Zone” is incorrect. You cannot use Amazon EBS Multi-Attach across multiple AZs.
INCORRECT: “Create an Amazon RDS database and store the data in a BLOB format. Point the application instances to the RDS endpoint” is incorrect. This is not a suitable storage solution for a file system that is mounted over NFS.
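
The "mount targets in each Availability Zone" part of the correct answer can be sketched as a small planning function. It only builds the parameter sets (one per AZ) and calls no AWS API. The key names follow the `create_mount_target` parameters of the boto3 EFS client as I understand them, and all IDs are hypothetical.

```python
def plan_mount_targets(file_system_id: str, subnets_by_az: dict, security_group_id: str) -> list:
    """Return one set of mount-target parameters per Availability Zone."""
    return [
        {
            "FileSystemId": file_system_id,
            "SubnetId": subnet_id,
            "SecurityGroups": [security_group_id],
        }
        # Sorted by AZ name so the plan is deterministic.
        for _az, subnet_id in sorted(subnets_by_az.items())
    ]


if __name__ == "__main__":
    # Hypothetical subnet and security group IDs.
    plan = plan_mount_targets(
        "fs-12345678",
        {"us-east-1b": "subnet-bbbb2222", "us-east-1a": "subnet-aaaa1111"},
        "sg-0123456789abcdef0",
    )
    for params in plan:
        print(params)
```

Each instance then mounts through the mount target in its own AZ, which is what keeps latency low in the scenario.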

Question 6:
An application is being created that will use Amazon EC2 instances to generate and store data. Another set of EC2 instances will then analyze and modify the data. Storage requirements will be significant and will continue to grow over time. The application architects require a storage solution.
Which actions would meet these needs?
Options:
A. Store the data in an Amazon EBS volume. Mount the EBS volume on the application instances
B. Store the data in Amazon S3 Glacier. Update the vault policy to allow access to the application instances
C. Store the data in an Amazon EFS filesystem. Mount the file system on the application instances
D. Store the data in AWS Storage Gateway. Setup AWS Direct Connect between the Gateway appliance and the EC2 instances
Answer: C
Explanation
Amazon Elastic File System (Amazon EFS) provides a simple, scalable, fully managed elastic NFS file system for use with AWS Cloud services and on-premises resources. It is built to scale on-demand to petabytes without disrupting applications, growing and shrinking automatically as you add and remove files, eliminating the need to provision and manage capacity to accommodate growth.
Amazon EFS supports the Network File System version 4 (NFSv4.1 and NFSv4.0) protocol. Multiple Amazon EC2 instances can access an Amazon EFS file system at the same time, providing a common data source for workloads and applications running on more than one instance or server.
For this scenario, EFS is a great choice as it will provide a scalable file system that can be mounted by multiple EC2 instances and accessed simultaneously.
CORRECT: “Store the data in an Amazon EFS filesystem. Mount the file system on the application instances” is the correct answer.
INCORRECT: “Store the data in an Amazon EBS volume. Mount the EBS volume on the application instances” is incorrect. Although EBS Multi-Attach allows a volume to be attached to multiple Nitro-based instances, it is not on the exam yet and has some specific constraints.
INCORRECT: “Store the data in Amazon S3 Glacier. Update the vault policy to allow access to the application instances” is incorrect as S3 Glacier is not a suitable storage location for live access to data, it is used for archival.
INCORRECT: “Store the data in AWS Storage Gateway. Setup AWS Direct Connect between the Gateway appliance and the EC2 instances” is incorrect. There is no reason to store the data on-premises in a Storage Gateway, using EFS is a much better solution.

Question 7:
You would like to mount a network file system on Linux instances, where files will be stored and accessed frequently at first, and then infrequently. What solution is the MOST cost-effective?
• S3 Intelligent Tiering
• Glacier Deep Archive
• EFS IA (Correct)
• FSx for Lustre
Explanation
Correct option:
EFS IA
Amazon Elastic File System (Amazon EFS) provides a simple, scalable, fully managed elastic NFS file system for use with AWS Cloud services and on-premises resources. Amazon EFS is a regional service storing data within and across multiple Availability Zones (AZs) for high availability and durability.
Amazon EFS Infrequent Access (EFS IA) is a storage class that is cost-optimized for files not accessed every day, with storage prices up to 92% lower than Amazon EFS Standard. Therefore, this is the correct option.
Incorrect options:
S3 Intelligent Tiering – Amazon Simple Storage Service (Amazon S3) is an object storage service that offers industry-leading scalability, data availability, security, and performance. The S3 Intelligent-Tiering storage class is designed to optimize costs by automatically moving data to the most cost-effective access tier, without performance impact or operational overhead. It works by storing objects in two access tiers: one tier that is optimized for frequent access and another lower-cost tier that is optimized for infrequent access.
You can’t mount a network file system on S3 Intelligent Tiering as it’s an object storage service, so this option is incorrect.
Glacier Deep Archive – Amazon S3 Glacier and S3 Glacier Deep Archive are secure, durable, and extremely low-cost Amazon S3 cloud storage classes for data archiving and long-term backup. They are designed to deliver 99.999999999% durability, and provide comprehensive security and compliance capabilities that can help meet even the most stringent regulatory requirements.
You can’t mount a network file system on Glacier Deep Archive as it’s an object storage/archival service, so this option is incorrect.
FSx for Lustre – Amazon FSx for Lustre makes it easy and cost-effective to launch and run the world’s most popular high-performance file system. It is used for workloads such as machine learning, high-performance computing (HPC), video processing, and financial modeling. Amazon FSx enables you to use Lustre file systems for any workload where storage speed matters.
FSx for Lustre is a file system better suited to distributed high-performance computing (HPC) workloads and is very expensive, so it is not the most cost-effective option here.
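
The "frequently at first, then infrequently" pattern is what EFS lifecycle management handles. Below is a rough Python sketch: `LIFECYCLE_POLICIES` uses the shape accepted by the EFS `put_lifecycle_configuration` API as I understand it, and `storage_class` is a deliberately simplified local model (assumed names, not AWS code) of which class a file would sit in after a given idle period.

```python
from datetime import date, timedelta

# Shape of a lifecycle configuration: move files to IA after 30 idle days,
# and back to the primary class on first access.
LIFECYCLE_POLICIES = [
    {"TransitionToIA": "AFTER_30_DAYS"},
    {"TransitionToPrimaryStorageClass": "AFTER_1_ACCESS"},
]


def storage_class(last_access: date, today: date, ia_after_days: int = 30) -> str:
    """Simplified model: a file idle for at least ia_after_days is in IA."""
    idle_days = (today - last_access).days
    return "IA" if idle_days >= ia_after_days else "Standard"


if __name__ == "__main__":
    today = date(2024, 6, 30)
    print(storage_class(today - timedelta(days=3), today))   # recently used file
    print(storage_class(today - timedelta(days=45), today))  # file idle for 45 days
```

The file stays reachable through the same mount point in either class; only the storage price and per-access charge differ.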

Question 8:
A startup has just developed a video backup service hosted on a fleet of EC2 instances. The EC2 instances are behind an Application Load Balancer and the instances are using EBS volumes for storage. The service provides authenticated users the ability to upload videos that are then saved on the EBS volume attached to a given instance. On the first day of the beta launch, users start complaining that they can see only some of the videos in their uploaded videos backup. Every time the users log into the website, they claim to see a different subset of their uploaded videos.
Which of the following is the MOST optimal solution to make sure that users can view all the uploaded videos? (Select two)
A• Mount EFS on all EC2 instances. Write a one time job to copy the videos from all EBS volumes to EFS. Modify the application to use EFS for storing the videos
B• Write a one time job to copy the videos from all EBS volumes to S3 Glacier Deep Archive and then modify the application to use S3 Glacier Deep Archive for storing the videos
C• Write a one time job to copy the videos from all EBS volumes to S3 and then modify the application to use Amazon S3 standard for storing the videos
D• Write a one time job to copy the videos from all EBS volumes to DynamoDB and then modify the application to use DynamoDB for storing the videos
E• Write a one time job to copy the videos from all EBS volumes to RDS and then modify the application to use RDS for storing the videos
Answer: A & C
Explanation
Correct options:
Write a one time job to copy the videos from all EBS volumes to S3 and then modify the application to use Amazon S3 standard for storing the videos
Mount EFS on all EC2 instances. Write a one time job to copy the videos from all EBS volumes to EFS. Modify the application to use EFS for storing the videos
Amazon Elastic Block Store (EBS) is an easy to use, high-performance block storage service designed for use with Amazon Elastic Compute Cloud (EC2) for both throughput and transaction-intensive workloads at any scale.
Amazon Elastic File System (Amazon EFS) provides a simple, scalable, fully managed elastic NFS file system for use with AWS Cloud services and on-premises resources. It is built to scale on-demand to petabytes without disrupting applications, growing and shrinking automatically as you add and remove files, eliminating the need to provision and manage capacity to accommodate growth.
Amazon Simple Storage Service (Amazon S3) is an object storage service that offers industry-leading scalability, data availability, security, and performance.
Because EBS volumes are attached locally to individual EC2 instances, the uploaded videos are tied to specific instances. Every time the user logs in, the load balancer may direct them to a different instance, so their videos end up dispersed across multiple EBS volumes. The correct solution is to use either S3 or EFS to store the user videos.
Incorrect options:
Write a one time job to copy the videos from all EBS volumes to S3 Glacier Deep Archive and then modify the application to use S3 Glacier Deep Archive for storing the videos – Glacier Deep Archive is meant to be used for long term data archival. It cannot be used to serve static content such as videos or images via a web application. So this option is incorrect.
Write a one time job to copy the videos from all EBS volumes to RDS and then modify the application to use RDS for storing the videos – RDS is a relational database and not the right candidate for storing videos.
Write a one time job to copy the videos from all EBS volumes to DynamoDB and then modify the application to use DynamoDB for storing the videos – DynamoDB is a NoSQL database and not the right candidate for storing videos.
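
The symptom in this question can be reproduced with a toy simulation. The sketch below assumes a round-robin load balancer and models storage as plain Python sets; the function name `simulate` and the file names are invented for illustration.

```python
from itertools import cycle


def simulate(uploads: list, instance_count: int, shared: bool) -> list:
    """Return the videos each instance can list after round-robin uploads."""
    if shared:
        store = set()
        stores = [store] * instance_count  # every instance sees the same storage (EFS or S3)
    else:
        stores = [set() for _ in range(instance_count)]  # one local EBS volume per instance
    targets = cycle(range(instance_count))
    for video in uploads:
        stores[next(targets)].add(video)  # the load balancer picks the next instance
    return [sorted(s) for s in stores]


if __name__ == "__main__":
    videos = ["a.mp4", "b.mp4", "c.mp4", "d.mp4"]
    print("EBS per instance:", simulate(videos, 2, shared=False))
    print("Shared storage:  ", simulate(videos, 2, shared=True))
```

With per-instance volumes each instance lists only the videos it happened to receive, which matches the "different subset on every login" complaint; with shared storage every instance lists all four.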

Question 9:
You are building an application that will be deployed on 10 EC2 instances using Amazon Linux 2 AMI. The application needs access to a shared network file system that is POSIX compliant.
What do you recommend?
A• EFS
B• S3
C• Instance Store
D• EBS
Answer: A
Explanation
Correct option:
EFS
Amazon Elastic File System (Amazon EFS) provides a simple, scalable, fully managed elastic NFS file system for use with AWS Cloud services and on-premises resources. Amazon EFS is a regional service storing data within and across multiple Availability Zones (AZs) for high availability and durability. Here, EFS is the perfect solution for this use case.
Incorrect options:
EBS – Amazon Elastic Block Store (EBS) is an easy to use, high-performance block storage service designed for use with Amazon Elastic Compute Cloud (EC2) for both throughput and transaction-intensive workloads at any scale. A broad range of workloads, such as relational and non-relational databases, enterprise applications, containerized applications, big data analytics engines, file systems, and media workflows are widely deployed on Amazon EBS.
EBS is POSIX compliant but is not shared storage. EBS volumes are locked to an AZ and mounted on one EC2 instance at a time.
S3 – Amazon Simple Storage Service (Amazon S3) is an object storage service that offers industry-leading scalability, data availability, security, and performance.
S3 is shared storage but is not POSIX compliant. It cannot be mounted as a file system with the standard tools.
Instance Store – An Instance Store provides temporary block-level storage for your instance. This storage is located on disks that are physically attached to the host computer. Instance store is ideal for the temporary storage of information that changes frequently, such as buffers, caches, scratch data, and other temporary content, or for data that is replicated across a fleet of instances, such as a load-balanced pool of web servers. Instance Store is POSIX compliant but is not shared storage.

Question 10:
A company is looking at storing their less frequently accessed files on AWS that can be concurrently accessed by hundreds of EC2 instances. The company needs the most cost-effective file storage service that provides immediate access to data whenever needed.
Which of the following options represents the best solution for the given requirements?
A• Amazon Elastic File System (EFS) Standard–IA storage class
B• Amazon Elastic File System (EFS) Standard storage class
C• Amazon S3 Standard-Infrequent Access (S3 Standard-IA) storage class
D• Amazon Elastic Block Store (EBS)
Answer: A
Explanation
Correct option:
Amazon Elastic File System (EFS) Standard–IA storage class – Amazon EFS is a file storage service for use with Amazon compute (EC2, containers, serverless) and on-premises servers. Amazon EFS provides a file system interface, file system access semantics (such as strong consistency and file locking), and concurrently accessible storage for up to thousands of Amazon EC2 instances.
The Standard–IA storage class reduces storage costs for files that are not accessed every day. It does this without sacrificing the high availability, high durability, elasticity, and POSIX file system access that Amazon EFS provides. AWS recommends Standard-IA storage if you need your full dataset to be readily accessible and want to automatically save on storage costs for files that are less frequently accessed.
Incorrect options:
Amazon S3 Standard-Infrequent Access (S3 Standard-IA) storage class – Amazon S3 is an object storage service. Amazon S3 makes data available through an Internet API that can be accessed anywhere. It is not a file storage service, as is needed in the use case.
Amazon Elastic File System (EFS) Standard storage class – Amazon EFS Standard storage classes are ideal for workloads that require the highest levels of durability and availability. The EFS Standard storage class is used for frequently accessed files. It is the storage class to which customer data is initially written for Standard storage classes. The company is also looking at cutting costs by optimally storing the infrequently accessed data. Hence, EFS standard storage class is not the right solution for the given use case.
Amazon Elastic Block Store (EBS) – Amazon EBS is a block-level storage service for use with Amazon EC2. Amazon EBS can deliver performance for workloads that require the lowest latency access to data from a single EC2 instance. EBS volume cannot be accessed by hundreds of EC2 instances concurrently. It is not a file storage service, as is needed in the use case.

Question 11:
An analytics company wants to improve the performance of its big data processing workflows running on Amazon EFS. Which of the following performance modes should be used for EFS to address this requirement?
A• Provisioned Throughput
B• General Purpose
C• Max I/O
D• Bursting Throughput
Answer: C
Explanation
Correct option:
Max I/O
Max I/O performance mode is used to scale to higher levels of aggregate throughput and operations per second. This scaling is done with a tradeoff of slightly higher latencies for file metadata operations. Highly parallelized applications and workloads, such as big data analysis, media processing, and genomic analysis, can benefit from this mode.
Incorrect options:
Provisioned Throughput
Bursting Throughput
These two options have been added as distractors as these refer to the throughput mode of EFS and not the performance mode. There are two throughput modes to choose from for your file system, Bursting Throughput and Provisioned Throughput. With Bursting Throughput mode, throughput on Amazon EFS scales as the size of your file system in the standard storage class grows. With Provisioned Throughput mode, you can instantly provision the throughput of your file system (in MiB/s) independent of the amount of data stored.
General Purpose – General Purpose performance mode is ideal for latency-sensitive use cases, like web serving environments, content management systems, home directories, and general file serving. If you don’t choose a performance mode when you create your file system, Amazon EFS selects the General Purpose mode for you by default.
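
Since the distractors in this question mix up two separate settings, a small validator can make the distinction explicit. This sketch covers only the two performance modes and two throughput modes named in these notes. The key names mirror the `create_file_system` parameters of the boto3 EFS client as I understand them, and the function itself is a hypothetical helper that calls no AWS API.

```python
PERFORMANCE_MODES = {"generalPurpose", "maxIO"}  # latency vs aggregate scale
THROUGHPUT_MODES = {"bursting", "provisioned"}   # how throughput is allotted


def file_system_settings(performance_mode: str, throughput_mode: str,
                         provisioned_mibps: float = None) -> dict:
    """Validate the two independent settings and return them as parameters."""
    if performance_mode not in PERFORMANCE_MODES:
        raise ValueError(f"not a performance mode: {performance_mode}")
    if throughput_mode not in THROUGHPUT_MODES:
        raise ValueError(f"not a throughput mode: {throughput_mode}")
    settings = {"PerformanceMode": performance_mode, "ThroughputMode": throughput_mode}
    if throughput_mode == "provisioned":
        if provisioned_mibps is None:
            raise ValueError("provisioned throughput needs a MiB/s value")
        settings["ProvisionedThroughputInMibps"] = float(provisioned_mibps)
    return settings


if __name__ == "__main__":
    # Big data workload from the question: Max I/O performance mode.
    print(file_system_settings("maxIO", "bursting"))
```

Passing "provisioned" or "bursting" as the performance mode raises an error here, which is the same reason options A and D are wrong answers.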