From MySQL to MySQL(RDS) | Using AWS DMS to migrate your databases to AWS via Terraform

Moosa Khalid
13 min read · Oct 7, 2021

TL;DR: GitHub repository for Migrating MySQL to RDS using Terraform via AWS DMS.

Quick note(important)

Please note: a successful deployment of the Terraform code in this tutorial only sets up the infrastructure and DMS components needed for the migration; it does NOT actually kick off the migration task. This is intentional, and left as an exercise so that you can invoke the replication task yourself via the console and see its details in person (look at the end of the tutorial for how to kick off replication, which is just logging into the console, clicking a button, and watching everything in action). Don't forget to destroy all the infrastructure that you'll spin up in this tutorial, lest you be surprise-billed by AWS... Let's go!

Did someone order a Database migration?

So everybody loves using the word “migration”. Migrate this, migrate that. Oh just dump it out and migrate it over to that <insert-cool-new-cloud-acronym>. Is it that simple though? Well let’s find out.

If you work with databases, moving those 1's and 0's near and dear to your organization, stored inside a database server such as MySQL, is an inevitable part of the job. Whether you're replicating the data or simply moving it to a cloud provider, it's essential that you do it securely while maintaining the integrity and continuity of your application.

Pre-requisites

This blog post will take you through setting up and using AWS Database Migration Service, otherwise known as DMS, to migrate a VM-hosted MySQL database to Amazon RDS. In this case the VM is an EC2 instance.

What we will not be discussing or going into details of:
1. MySQL commands

2. MySQL User access and permissions.

I won't be explaining the MySQL prompt commands used in this code in depth, since that could open up a whole new can of worms and distract from the task at hand. I will, however, briefly explain what these commands do.

You are expected to have associate to intermediate level knowledge of AWS and Terraform, including but not limited to:

1. AWS IAM (Roles, Policies)

2. Familiarity with MySQL(RDBMS)

3. Terraform know-how i.e. init, validate, plan, apply & destroy

4. Patience and faith to see it all through.

General Overview (Diagram)

Database migration process overview (via AWS DMS)

The diagram above may seem a little confusing at first, but as you start understanding it from the inside out, i.e. from the middle, where the heart of the process lies, things will start to make more sense.

Amazon's Database Migration Service depends on various components which work in tandem to migrate data from a source database to a target database.

We create two endpoints, namely source and target, which contain the connectivity details and parameters the DMS replication instance needs to reach the respective databases. Through these endpoints we define the username, password and IP/hostname for database connectivity. The replication instance then uses these endpoints to make the connections to both the source and target databases.

The DMS replication task (which we'll see in a bit), just like its name implies, is the actual trigger defining WHAT needs to be migrated; for example, it may specify which database schema and which tables to migrate from the source DB to the target DB. Haven't I said source and target enough times already!!! Well, there's more where that came from.

The DMS IAM roles created during this tutorial are essential to the whole process. These roles allow the DMS service to create the required components, such as the network interfaces used to connect to the databases, and to assume an IAM role via STS (Security Token Service, a service which issues temporary tokens for AWS resource access). We won't be going into the details of how STS works.

Without these roles and the IAM permissions granted through them, DMS would not be able to work at all!

What WE are going to do (This is what we’ll be deploying)

Our specific scenario: MySQL to MySQL (RDS) migration using Terraform & AWS DMS

You'll notice that the only thing we're doing differently is simulating our private data center's source MySQL server inside AWS itself. This makes it easy to demonstrate the concept without having to set up VPNs, port-forwarding or any of those other networking shenanigans; we'll leave that to the infrastructure networking teams in real life. We just want to focus on migration using DMS, and for all intents and purposes our source MySQL could be anywhere, as long as we can allow access to it. In our case, we're attaching a public IP to the MySQL instance running on EC2, within AWS.

Terraform & Database Migration Service (AWS)

So here’s the GitHub Repo which contains all the Terraform HCL code for this scenario:

Now let’s break it down for everyone!

The Database to be migrated

We’ll be bootstrapping the following data onto the EC2 instance that we spin up as part of our Terraform code:
1. MySQL Server (mariadb — the underlying OS will be Amazon Linux 2)

2. A sample database from the official MySQL website. This database will be downloaded and ingested into the MySQL (mariadb) server (here's a link to the test DB we'll be downloading, in case you want to take a look at it):

https://downloads.mysql.com/docs/sakila-db.tar.gz

3. We'll be creating an appropriate MySQL user (called "dms") on the source EC2 MySQL DB, which we'll pass to the source endpoint to make the connection to the EC2 MySQL.

Here's the EC2 bootstrap script (ec2_mysql_bootstrap):

#!/bin/bash
yum -y install mariadb-server
sleep 2
systemctl start mariadb
sleep 2
mysqladmin -uroot password '${password}'
mysql -uroot -p${password} -e "CREATE USER 'dms'@'%' IDENTIFIED BY '${password}';"
mysql -uroot -p${password} -e "GRANT ALL PRIVILEGES ON *.* TO 'dms'@'%' WITH GRANT OPTION;FLUSH PRIVILEGES;"
wget -c https://downloads.mysql.com/docs/sakila-db.tar.gz
tar -xvzf sakila-db.tar.gz -C /home/ec2-user/
mysql -uroot -p${password} -e "source /home/ec2-user/sakila-db/sakila-schema.sql;"
mysql -uroot -p${password} -e "source /home/ec2-user/sakila-db/sakila-data.sql;"

It's basically a bash script with a sequential set of commands. Without going into too much detail, it:

  1. Installs MySQL (mariadb-server)
  2. Starts up the MySQL server and sets the root MySQL user's password
  3. Creates a MySQL user 'dms' and grants it privileges on all databases and their tables within the MySQL server. (Note: this is absolutely NOT a secure thing to do, DO NOT DO THIS IN PRODUCTION; it's only acceptable here because we're simulating a test on a sample public DB.)
  4. Downloads the tarball of the test sakila DB from the MySQL website.
  5. Ingests the sakila DB schema and data into the MySQL server.

And just like that, we're done with creating the MySQL database server that will act as our source database.

Now let's move on to the actual Infrastructure as Code side of things, which will spin up literally everything, including the EC2 instance for the source MySQL. I'll explain the important components spun up by each file in the GitHub project.

main.tf

In this template I’m basically doing the following:

1. Declaring Terraform-specific settings, like pinning a certain version, and defining the AWS provider.

2. Fetching details of the already existing default VPC and subnet, as well as fetching the AMI ID of Amazon Linux 2 using the public SSM Parameter Store endpoint. (Quite a cool way to fetch AMIs; check out the tutorial here: https://aws.amazon.com/blogs/compute/query-for-the-latest-amazon-linux-ami-ids-using-aws-systems-manager-parameter-store/)

3. Creating security groups and their rules for access to the EC2 instance, the EC2 instance itself (bootstrapped with MySQL via the script shown earlier in the blog), and an RDS MySQL DB instance.

You'll also notice that I've used the Terraform "template_file" data source to embed the password into the bootstrap script of the MySQL EC2 instance; it's a pretty cool templating feature in Terraform. (Read more about it here: https://registry.terraform.io/providers/hashicorp/template/latest/docs/data-sources/file)
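
To make this concrete, here's a minimal, illustrative sketch of the SSM Parameter Store AMI lookup and the template_file data source. The resource and variable names are mine for illustration; the exact names in the repo may differ.

# Look up the latest Amazon Linux 2 AMI ID from the public SSM parameter
data "aws_ssm_parameter" "al2_ami" {
  name = "/aws/service/ami-amazon-linux-latest/amzn2-ami-hvm-x86_64-gp2"
}

# Render the EC2 bootstrap script, injecting the MySQL password variable
data "template_file" "mysql_bootstrap" {
  template = file("${path.module}/ec2_mysql_bootstrap")
  vars = {
    password = var.password
  }
}

# The AMI ID and rendered script are then wired into the EC2 instance
resource "aws_instance" "source_mysql" {
  ami       = data.aws_ssm_parameter.al2_ami.value
  user_data = data.template_file.mysql_bootstrap.rendered
  # ... instance type, key pair, subnet and security group omitted
}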

dms.tf

In this file, we're creating the components of the DMS service, namely:

  • Source endpoint (defines connectivity to the source DB)
  • Target endpoint (defines connectivity to the target DB)
  • DMS replication instance (the instance to which the source and target endpoints are attached)

"Connectivity" here means the port, hostname or IP address, username and password for the database in question.
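
As a rough sketch of what that looks like in dms.tf (attribute values and resource names here are illustrative, not copied from the repo):

# The worker instance that actually moves the data
resource "aws_dms_replication_instance" "dms" {
  replication_instance_id    = "dms-replication-instance"
  replication_instance_class = "dms.t3.micro"
  allocated_storage          = 20
}

# Connectivity details for the EC2-hosted MySQL we're migrating FROM
resource "aws_dms_endpoint" "source" {
  endpoint_id   = "source-endpoint-mysql-ec2"
  endpoint_type = "source"
  engine_name   = "mysql"
  server_name   = aws_instance.source_mysql.public_ip
  port          = 3306
  username      = "dms"
  password      = var.password
}

# Connectivity details for the RDS MySQL instance we're migrating TO
resource "aws_dms_endpoint" "target" {
  endpoint_id   = "target-endpoint-mysql-rds"
  endpoint_type = "target"
  engine_name   = "mysql"
  server_name   = aws_db_instance.target_rds.address
  port          = 3306
  username      = "admin"
  password      = var.password
}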

Let's look at this snippet of code from the aws_dms_replication_task.dms-task Terraform resource (with the escape backslashes that you may notice in the original code removed, for brevity):

table_mappings = "{
  "rules": [
    {
      "rule-type": "selection",
      "rule-id": "1",
      "rule-name": "1",
      "object-locator": {
        "schema-name": "sakila",
        "table-name": "%"
      },
      "rule-action": "include"
    }
  ]
}"

The DMS replication task defines the table mappings for the migration, controlling which database schemas and tables you want to migrate from the source DB to the target DB. Notice how we've provided our DB/schema name of sakila, and the "%" means that ALL tables under the sakila DB will be migrated. This way you can choose to move certain tables and not others.
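
Putting it together, the replication task ties the table mappings to the replication instance and the two endpoints. Here's a minimal sketch, assuming the resources from the earlier sketch (the repo embeds the mapping JSON as an escaped string rather than using jsonencode, and the migration type shown is an assumption consistent with the one-time load described in this post):

resource "aws_dms_replication_task" "dms-task" {
  replication_task_id      = "replication-task-dms"
  migration_type           = "full-load"
  replication_instance_arn = aws_dms_replication_instance.dms.replication_instance_arn
  source_endpoint_arn      = aws_dms_endpoint.source.endpoint_arn
  target_endpoint_arn      = aws_dms_endpoint.target.endpoint_arn

  # Same selection rule as above: every table in the sakila schema
  table_mappings = jsonencode({
    rules = [{
      "rule-type"      = "selection"
      "rule-id"        = "1"
      "rule-name"      = "1"
      "object-locator" = { "schema-name" = "sakila", "table-name" = "%" }
      "rule-action"    = "include"
    }]
  })
}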

roles.tf & dms_policy.json

The DMS service requires permissions to carry out some actions in the backend as it works, e.g. creating network interfaces (a.k.a. Elastic Network Interfaces) inside VPCs for communication. For logging purposes we also allow the DMS service to send logs to CloudWatch.

To carry out these actions, the DMS service assumes temporary credentials via the STS service (a.k.a. Security Token Service).

The dms_policy.json file contains an IAM policy, which we attach to the IAM role created in roles.tf for the DMS service to assume via STS. The IAM policy holds the granular permissions that DMS needs. Without the necessary permissions, the DMS service will either not work at all or fail to work properly.
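
For a sense of what roles.tf is doing, here is a minimal sketch of a role that trusts the DMS service and carries the policy from dms_policy.json. The names are illustrative; the actual definitions (and the workaround noted below) live in the repo.

# Role that the DMS service is allowed to assume via STS
resource "aws_iam_role" "dms_vpc_role" {
  name = "dms-vpc-role"
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Action    = "sts:AssumeRole"
      Principal = { Service = "dms.amazonaws.com" }
    }]
  })
}

# Attach the granular permissions defined in dms_policy.json
resource "aws_iam_role_policy" "dms_permissions" {
  name   = "dms-permissions"
  role   = aws_iam_role.dms_vpc_role.id
  policy = file("${path.module}/dms_policy.json")
}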

Note: there's a glitch I found in Terraform's handling of some resources DMS creates indirectly, as well as some timing issues during creation of the IAM role for DMS. I have documented the bug and the workaround in roles.tf if you want to give it a further read, but I won't discuss it here. Hopefully it'll be fixed soon.

variables.tf

The variables.tf file parameterizes some inputs to make the Terraform code more flexible. The variables declared are:

  1. aws_profile -> the AWS profile to use for connecting to AWS.
  2. password -> the password to set for the databases.
  3. external_ip -> the IP to allow through to the EC2 instance's security group.
  4. private-key-file -> the SSH private key to add to the EC2 for SSH login.

The "external_ip" variable in variables.tf does not have a default value set, which means you'll either need to set one or pass a value via the CLI invocation of Terraform. This is for secure, restricted access to the EC2 instance that gets spun up. You'll notice a regex in the variable's validation condition stopping you from simply putting in "0.0.0.0/0" in an attempt to open up access to the whole internet. To find your external IP, use Google or a command such as "curl ifconfig.me". Remember to add "/32" (without the double quotes) to the end of your external IP, otherwise you won't be able to log into the EC2 instance.
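
As an illustration of that validation (the regex in the repo may be stricter than this sketch), the variable declaration looks roughly like:

variable "external_ip" {
  description = "Your public IP in /32 CIDR form, e.g. 203.0.113.10/32"
  type        = string

  validation {
    # Require a single-host /32 and refuse the wide-open range
    condition     = can(regex("/32$", var.external_ip)) && var.external_ip != "0.0.0.0/0"
    error_message = "external_ip must be a single /32 CIDR, not an open range like 0.0.0.0/0."
  }
}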

The "private-key-file" variable is the SSH private key file from your system that is passed to the EC2 instance upon its creation, for passwordless SSH into the EC2. It does have a default value, which expects the SSH private key file to exist under the ".ssh" directory in your user's home directory. If it exists there, all's well; if it doesn't, you can run the command "ssh-keygen -t rsa" and hit Enter through all the interactive prompts for your system to generate the SSH key in the default location the code expects. Either way, without this SSH private key the Terraform run will fail.

output.tf

The output.tf file outputs useful information back to the user after terraform apply completes. Think of it as the return statement in a programming language. The file uses Terraform output blocks to return specific attributes of the resources created during the run, such as the source MySQL's (running on EC2) IP and the RDS instance's hostname for connectivity, along with the usernames for both databases.
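
A minimal sketch of what such output blocks look like (the output names below mirror the labels referenced later in this post, but the attribute references are assumptions and may differ from the repo):

output "Source-MySQL-IP" {
  description = "Public IP of the EC2 instance hosting the source MySQL"
  value       = aws_instance.source_mysql.public_ip
}

output "RDS-Endpoint-Hostname" {
  description = "Hostname of the target RDS MySQL instance"
  value       = aws_db_instance.target_rds.address
}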

Run the code

Quick instructions and requirements for running this code:

  • Terraform 0.13 or above needs to be installed
  • AWS CLI needs to be configured (install it via pip if you need to)
  • The code expects a default VPC with at least 1 subnet to exist!
  • Appropriate access to all IAM permissions required by this deployment, such as permissions to create EC2 instances, IAM roles, the RDS database, security groups, etc.

Instructions:

  1. Clone repo

git clone https://github.com/moosakhalid/aws.git

2. Change into the directory where the code resides (run this command in the same directory where you ran the git clone command).

cd aws/terraform_aws_dms

3. Optional: generate an SSH key if one doesn't exist, hitting Enter through all the interactive prompts.

ssh-keygen -t rsa

4. terraform init #initialize Terraform, download required providers

5. terraform validate #to validate code for syntax and required files

6. terraform plan -var external_ip="<your-ip>/32" #see what Terraform "plans" to deploy

7. terraform apply -var external_ip="<your-ip>/32" #execute/create the infrastructure; type "yes" and hit Enter when prompted to proceed.

You can add the "-auto-approve" flag to the end of the terraform apply command in step 7 to stop it from prompting you for an explicit "yes" before actually deploying the infrastructure.

Hopefully you've breezed through the deployment, but the actual migration task hasn't migrated anything yet. It has not been kicked off; we've just set up all the infrastructure (to simulate the migration) and the DMS components.

Although there are ways to tell DMS to start the migration via API calls, I wanted users to do this part via the AWS console, as it offers a better view to help you see and experience the migration in a more user-friendly way. Seeing is believing, right? (Pssst, it is.)

In some cases, it may take well over 10 minutes for terraform apply to complete; spinning up EC2 and RDS instances is a time-consuming task!

So after your Terraform code is done running, and before you kick off the replication task via the console, you can optionally check the contents of both databases, i.e. the EC2 MySQL and RDS, by logging into each of them.
After your terraform apply is successful you'll get output similar to the screenshot below. Please note that your "Source-MySQL-IP" and "RDS-Endpoint-Hostname" WILL BE DIFFERENT, so don't try connecting to the values shown in the screenshot.

To check the EC2 MySQL database, log into the EC2 instance, then into the MySQL database running on it, and ensure that the "sakila" database exists:

From your system, where you ran the Terraform commands, run:

ssh ec2-user@<IP-of-EC2-MySQL>

mysql -uroot -p #the command will prompt for a password (look at variables.tf for the password variable's default value)

MariaDB [(none)]> show databases;

The output of "show databases" should include a "sakila" DB, which is the database we're trying to migrate from this source.

Now quit out of the database prompt by hitting Ctrl + D or typing "quit".

Now, staying logged into the same EC2 instance, run the following command to log into the MySQL prompt of the RDS database:

mysql -uadmin -p -h<RDS-Hostname-returned-by-Terraform> #you'll be prompted for a password; enter the one set as the default for the password variable in variables.tf (it's the same for both DBs)

show databases;

Since we haven’t kicked off the DMS replication task, you won’t see the “sakila” database in the RDS yet.

Now, log into your AWS console and head over to the “Database Migration Service” console page. From the left options pane click on “Database migration tasks”:

You’ll see the replication task “replication-task-dms” in “Ready” state. If it’s not in “Ready” state but “Creating”, just wait for a bit.

Select the “replication-task-dms” task and click on the “Actions” dropdown button on the top right and then click on “Resume/Restart” to start the actual replication(copying over of data).

If all goes well, the task status will move to "Running" and finally to "Load complete". "Load complete" means that data replication between the databases is complete.

Click on the "replication-task-dms" hyperlink to head over and look at the details of the task. On the task's details page, click on "Table statistics" to see the tables migrated, along with row counts and other metadata for the replication.

You can also verify via the CLI, by logging back into the RDS MySQL prompt using the commands provided earlier, that the "sakila" database has been successfully migrated to RDS.

If everything worked out well, CONGRATS! This is not a straightforward task; in real life, migrating databases never is. I hope this tutorial helped clarify how the DMS service works.

But wait a moment, you're NOT done. Who's going to delete all the resources you've spun up? Of course YOU are! So let's go ahead and run the "terraform destroy" command and get it over with.

Run this command in the same directory where you ran the Terraform apply command:

terraform destroy -var external_ip=<your-external-ip/32>

Type “yes” when prompted and hit enter for destroy to proceed.

The external IP should be the same one you used when running the plan and apply commands. To find it, use the command "curl ifconfig.me" or Google your external IP. :)

The destroy command can also take over 10 minutes, so wait and let Terraform work its magic; hopefully you'll see a successful destroy message like the one below.
Look at it and rejoice; there aren't many times in life you'll be happy destroying things, unless you're in certain professions or an avid user of Terraform!

And we are done. I'd sincerely like to thank you for sticking with it till the end. If you were not able to make it to the end due to annoying errors, weird issues or things you couldn't work around, please feel free to reach out to me and I'll try my best to help and improve this tutorial.

I can be reached via LinkedIn: https://linkedin.com/in/moosa-khalid; this is where I'm most active and responsive, time permitting.

Happy Migrating on AWS folks, stay safe and healthy!
