49. Database Migration Services (DMS)

i. AWS DMS is a cloud service that makes it easy to migrate relational DBs, data warehouses, no SQL DBs and other types of data stores. We can use AWS DMS to migrate our data into AWS cloud between on-premises instances (through an AWS cloud setup) or between combinations of cloud & on premises setup.
ii. DMS work: AWS DMS is a server in the AWS cloud that runs replication software. We create a source and target connection to tell DMS where to extract from and load to. Then we schedule a task that runs on the server to move data. AWS DMS creates the tables and associated primary key if they do not exist in the target. We can pre-create the target tables manually or we can use AWS schema conversion tool (SCT) to create some or all of the target tables, indexes, views, triggers etc.

Types of DMS:
i. Homogeneous migrations – Oracle (on-premise) –> Oracle (AWS Cloud)
ii. Heterogeneous migrations – Microsoft SQL Server (on-premise) –> Amazon Aurora (AWS Cloud)

Sources and Targets:

Sources Targets
On-premises & EC2 instances DBs. Oracle, Microsoft SQL server, MySQL, Maria DB, Postgresql, SAP, Mongo DB, DB2 On-premises & EC2 instances DBs. Oracle, Microsoft SQL server, MySQL, Maria DB, Postgresql, SAP
Azure SQL DB RDS
Amazon RDS (including Aurora) Redshift
Amazon S3 Dynamo DB
S3
Elastisearch
Kinesis data streams
Document DB
i. Homogeneous Migrations: on-premises DB —> EC2 instance running DMS —> RDS
ii. Heterogeneous Migrations: on-premises DB —> EC2 instance running DMS + SCT —> RDS
iii. We do not need SCT (Schema Conversion Tool) in homogeneous migrations (identical DBs). We need SCT for heterogeneous migrations.
iv. DMS allows to migrate DBs from one source to AWS
iv. The source can either be on-premises or inside AWS itself or another cloud provides such as Microsoft Azure

Question 1:
The database tier of a web application is running on a Windows server on-premises. The database is a Microsoft SQL Server database. The application owner would like to migrate the database to an Amazon RDS instance.
How can the migration be executed with minimal administrative effort and downtime?
Options:
A. Use the AWS Server Migration Service (SMS) to migrate the server to Amazon EC2. Use AWS Database Migration Service (DMS) to migrate the database to RDS
B. Use the AWS Database Migration Service (DMS) to directly migrate the database to RDS
C. Use AWS DataSync to migrate the data from the database to Amazon S3. Use AWS Database Migration Service (DMS) to migrate the database to RDS
D. Use the AWS Database Migration Service (DMS) to directly migrate the database to RDS. Use the Schema Conversion Tool (SCT) to enable conversion from Microsoft SQL Server to Amazon RDS
Answer: B
Explanation
You can directly migrate Microsoft SQL Server from an on-premises server into Amazon RDS using the Microsoft SQL Server database engine. This can be achieved using the native Microsoft SQL Server tools, or using AWS DMS.
CORRECT: “Use the AWS Database Migration Service (DMS) to directly migrate the database to RDS” is the correct answer.
INCORRECT: “Use the AWS Server Migration Service (SMS) to migrate the server to Amazon EC2. Use AWS Database Migration Service (DMS) to migrate the database to RDS” is incorrect. You do not need to use the AWS SMS service to migrate the server into EC2 first. You can directly migrate the database online with minimal downtime.
INCORRECT: “Use AWS DataSync to migrate the data from the database to Amazon S3. Use AWS Database Migration Service (DMS) to migrate the database to RDS” is incorrect. AWS DataSync is used for migrating data, not databases.
INCORRECT: “Use the AWS Database Migration Service (DMS) to directly migrate the database to RDS. Use the Schema Conversion Tool (SCT) to enable conversion from Microsoft SQL Server to Amazon RDS” is incorrect. You do not need to use the SCT as you are migrating into the same destination database engine (RDS is just the platform).

Question 2:
The business analytics team at a company has been running ad-hoc queries on Oracle and PostgreSQL services on Amazon RDS to prepare daily reports for senior management. To facilitate the business analytics reporting, the engineering team now wants to continuously replicate this data and consolidate these databases into a petabyte-scale data warehouse by streaming data to Amazon Redshift.
As a solutions architect, which of the following would you recommend as the MOST resource-efficient solution that requires the LEAST amount of development time without the need to manage the underlying infrastructure?
• Use AWS EMR to replicate the data from the databases into Amazon Redshift
• Use Amazon Kinesis Data Streams to replicate the data from the databases into Amazon Redshift
• Use AWS Glue to replicate the data from the databases into Amazon Redshift
• Use AWS Database Migration Service to replicate the data from the databases into Amazon Redshift
Answer: D
Explanation
Correct option:
Use AWS Database Migration Service to replicate the data from the databases into Amazon Redshift
AWS Database Migration Service helps you migrate databases to AWS quickly and securely. The source database remains fully operational during the migration, minimizing downtime to applications that rely on the database. With AWS Database Migration Service, you can continuously replicate your data with high availability and consolidate databases into a petabyte-scale data warehouse by streaming data to Amazon Redshift and Amazon S3.
Continuous Data Replication via – https://aws.amazon.com/dms/
You can migrate data to Amazon Redshift databases using AWS Database Migration Service. Amazon Redshift is a fully managed, petabyte-scale data warehouse service in the cloud. With an Amazon Redshift database as a target, you can migrate data from all of the other supported source databases.
The Amazon Redshift cluster must be in the same AWS account and the same AWS Region as the replication instance. During a database migration to Amazon Redshift, AWS DMS first moves data to an Amazon S3 bucket. When the files reside in an Amazon S3 bucket, AWS DMS then transfers them to the proper tables in the Amazon Redshift data warehouse. AWS DMS creates the S3 bucket in the same AWS Region as the Amazon Redshift database. The AWS DMS replication instance must be located in that same region.
Incorrect options:
Use AWS Glue to replicate the data from the databases into Amazon Redshift – AWS Glue is a fully managed extract, transform, and load (ETL) service that makes it easy for customers to prepare and load their data for analytics. AWS Glue job is meant to be used for batch ETL data processing.
Using AWS Glue involves significant development efforts to write custom migration scripts to copy the database data into Redshift.
Use AWS EMR to replicate the data from the databases into Amazon Redshift – Amazon EMR is the industry-leading cloud big data platform for processing vast amounts of data using open source tools such as Apache Spark, Apache Hive, Apache HBase, Apache Flink, Apache Hudi, and Presto. With EMR you can run Petabyte-scale analysis at less than half of the cost of traditional on-premises solutions and over 3x faster than standard Apache Spark. For short-running jobs, you can spin up and spin down clusters and pay per second for the instances used. For long-running workloads, you can create highly available clusters that automatically scale to meet demand. Amazon EMR uses Hadoop, an open-source framework, to distribute your data and processing across a resizable cluster of Amazon EC2 instances.
Using EMR involves significant infrastructure management efforts to set up and maintain the EMR cluster. Additionally this option involves a major development effort to write custom migration jobs to copy the database data into Redshift.
Use Amazon Kinesis Data Streams to replicate the data from the databases into Amazon Redshift – Amazon Kinesis Data Streams (KDS) is a massively scalable and durable real-time data streaming service. KDS can continuously capture gigabytes of data per second from hundreds of thousands of sources such as website clickstreams, database event streams, financial transactions, social media feeds, IT logs, and location-tracking events.
However, the user is expected to manually provision an appropriate number of shards to process the expected volume of the incoming data stream. The throughput of an Amazon Kinesis data stream is designed to scale without limits via increasing the number of shards within a data stream. Therefore Kinesis Data Streams is not the right fit for this use-case.

Question 3:
A company wants to migrate its on-premises databases to AWS Cloud. The CTO at the company wants a solution that can handle complex database configurations such as secondary indexes, foreign keys, and stored procedures.
As a solutions architect, which of the following AWS services should be combined to handle this use-case? (Select two)
• AWS Snowball Edge
• AWS Database Migration Service
• AWS Schema Conversion Tool
• AWS Glue
• Basic Schema Copy
Answer: B & C
Explanation
Correct options:
AWS Schema Conversion Tool
AWS Database Migration Service
AWS Database Migration Service helps you migrate databases to AWS quickly and securely. The source database remains fully operational during the migration, minimizing downtime to applications that rely on the database. AWS Database Migration Service supports homogeneous migrations such as Oracle to Oracle, as well as heterogeneous migrations between different database platforms, such as Oracle or Microsoft SQL Server to Amazon Aurora.
Given the use-case where the CTO at the company wants to move away from license-based expensive, legacy commercial database solutions deployed at the on-premises data center to more efficient, open-source, and cost-effective options on AWS Cloud, this is an example of heterogeneous database migrations.
For such a scenario, the source and target databases engines are different, like in the case of Oracle to Amazon Aurora, Oracle to PostgreSQL, or Microsoft SQL Server to MySQL migrations. In this case, the schema structure, data types, and database code of source and target databases can be quite different, requiring a schema and code transformation before the data migration starts.
That makes heterogeneous migrations a two-step process. First use the AWS Schema Conversion Tool to convert the source schema and code to match that of the target database, and then use the AWS Database Migration Service to migrate data from the source database to the target database. All the required data type conversions will automatically be done by the AWS Database Migration Service during the migration. The source database can be located on your on-premises environment outside of AWS, running on an Amazon EC2 instance, or it can be an Amazon RDS database. The target can be a database in Amazon EC2 or Amazon RDS.
Heterogeneous Database Migrations via – https://aws.amazon.com/dms/
Incorrect options:
AWS Snowball Edge – Snowball Edge Storage Optimized is the optimal choice if you need to securely and quickly transfer dozens of terabytes to petabytes of data to AWS. It provides up to 80 TB of usable HDD storage, 40 vCPUs, 1 TB of SATA SSD storage, and up to 40 Gb network connectivity to address large scale data transfer and pre-processing use cases. As each Snowball Edge Storage Optimized device can handle 80TB of data, you can order 10 such devices to take care of the data transfer for all applications. The original Snowball devices were transitioned out of service and Snowball Edge Storage Optimized are now the primary devices used for data transfer. You may see the Snowball device on the exam, just remember that the original Snowball device had 80TB of storage space. AWS Snowball Edge cannot be used for database migrations.
AWS Glue – AWS Glue is a fully managed extract, transform, and load (ETL) service that makes it easy for customers to prepare and load their data for analytics. AWS Glue job is meant to be used for batch ETL data processing. Therefore, it cannot be used for database migrations.
Basic Schema Copy – To quickly migrate a database schema to your target instance you can rely on the Basic Schema Copy feature of AWS Database Migration Service. Basic Schema Copy will automatically create tables and primary keys in the target instance if the target does not already contain tables with the same names. Basic Schema Copy is great for doing a test migration, or when you are migrating databases heterogeneously e.g. Oracle to MySQL or SQL Server to Oracle. Basic Schema Copy will not migrate secondary indexes, foreign keys or stored procedures. When you need to use a more customizable schema migration process (e.g. when you are migrating your production database and need to move your stored procedures and secondary database objects), you must use the AWS Schema Conversion Tool.