How to set up a Docker container for your MySQL server

Learn how to create a Docker container for your MySQL or MariaDB database server so you can extract and query data locally without impacting production.

How to set up a Docker container for your MySQL server
8 minutes to read

Like most people who work in ecommerce data science, I regularly need to access data stored in a database - usually MySQL or MariaDB, but sometimes also MSSQL. Although it is possible to connect to these database servers directly and run SQL queries in situ, this can be risky. Write your query in the wrong way and you could place the production server under severe load, potentially impacting customers.

The solution, of course, is to transfer this data to a data warehouse, data lake, or local server and examine it there instead. Docker makes it fairly straightforward to set up a MySQL or MariaDB environment on your local machine upon which you can install one or many databases.

In this tutorial, I’ll show you how to set up a Docker container for MySQL or MariaDB, how to import a database dump to create a local copy of your production database, and finally how to connect to the Docker MySQL server using a data manager like Beekeeper Studio or Data Grip, and how to analyse the data using Jupyter Notebooks and SQLAlchemy.

Find the right Docker container

MySQL comes in a few different flavours now. To minimise any potential compatibility issues, I am going to use the exact same version that’s running on the production server used to generate the MySQL dump. This is MariaDB version 10.1. If you visit Docker Hub, you’ll find pre-built Docker containers for the official version of MySQL and the MariaDB version. If you browse the tags section, you should be able to find a version number which matches the one used in your MySQL dump. By passing this tag value to the docker pull command, you can ensure you download the exact same version, or one very close to it.

Install MySQL in a Docker container

First we’ll use docker pull mariadb:10.1 to download and install the official MariaDB container from Docker Hub. This will install version 10.1 of the MariaDB version of MySQL and a compact Linux environment in a container on your machine. You’ll be able to use this for a single project or multiple projects, with everything neatly self-contained.

docker pull mariadb:10.1

Set up MySQL on the Docker container

Now that Docker is installed, we will run the container give it a name docker_mysql and set a root password so we can login. Finally, we define the MariaDB version using a tag.

docker run --name docker_mysql -e MYSQL_ROOT_PASSWORD=SecretPasswordHere -d mariadb:10.1

Login to your MySQL container

To access your Docker container using the Bash shell you can use docker exec, define the Docker container name and tell it to initialise Bash. This will give you a command prompt to issue commands on the Docker container itself. Once you have shell access you can then login to MySQL using the password you added in the previous step.

docker exec -it docker_mysql bash
mysql -uroot -p

Import a MySQL dump into your Docker container

Now we have a working database server we can install an existing MySQL database from a dump. If you’re not already logged in, repeat the steps above to open a Bash shell on the Docker container and then login to MySQL. Then create a new MySQL database with the name specified in your dump with create database your_db_name;. If you don’t know the database name, open a terminal and type head /path/to/dump.sql and you’ll see the database name in the code, without the need to load what might be a huge file.

docker exec -it docker_mysql bash
mysql -uroot -p
create database your_db_name;

Once you’ve created the database, you can import it using docker exec. This will take a while if you’ve taken a massive dump…

docker exec -i docker_mysql mysql -uroot -pSecretPasswordHere your_db_name < dump.sql

Accessing the database using an SQL manager

Most people who work with databases a lot prefer to use an SQL manager application for browsing, administering, and querying their database, rather than doing everything from the command line. I tend to use DBeaver on Linux most of the time, but Beekeeper Studio and DataGrip are also very good. All of them can be configured to access the Docker container running your MySQL server.

First, find the enter docker ps to view the running Docker containers and check the PORTS column of the output. By default, you’ll usually see 3306/tcp on your MySQL or MariaDB containers, as this is the default port.

docker ps
CONTAINER ID        IMAGE                         COMMAND                  CREATED             STATUS              PORTS                    NAMES
cfe480cf160a        mariadb:10.1                  "docker-entrypoint.s…"   8 minutes ago       Up 8 minutes        3306/tcp                 docker_mysql
01012c4f4759        jupyter/tensorflow-notebook   "tini -g -- start-no…"   About an hour ago   Up About an hour    0.0.0.0:8888->8888/tcp   pensive_grothendieck

Next, enter docker inspect docker_mysql and you’ll get back a JSON object containing all of the data on your MySQL container. Scroll through this and look for the IPAddress field. This is the IP of your Docker container. Mine was 172.17.0.3.

docker inspect docker_mysql

Finally, open up your preferred database manager - I’m using Beekeeper Studio, which is free and open source. Select the connection type as MariaDB (or MySQL if you installed that), enter the IP address in the Host field and ensure the Port value matches the one you saw above (usually 3306), then enter your username and password and test the connection. If you only have one database, you can set it as the default, or you can select it from the menu manually.

Connecting via a Jupyter notebook

One really neat trick you can use with MySQL and other databases is performing your Exploratory Data Analysis inside Jupyter. The SQLAlchemy package makes this very easy to do. Install SQLAlchemy if you don’t already have it, and the PyMySQL engine.

!pip3 install sqlalchemy
!pip3 install pymysql
Requirement already satisfied: sqlalchemy in /opt/conda/lib/python3.8/site-packages (1.3.19)
Requirement already satisfied: pymysql in /opt/conda/lib/python3.8/site-packages (0.10.0)

Next, load up Pandas and use the create_engine feature to connect to the MySQL database using a connection string like the one below. This passes in the username, password, IP address, port number and database name and opens a connection.

import pandas as pd
from sqlalchemy import create_engine
engine = create_engine('mysql+pymysql://root:SecretPasswordHere@172.17.0.3:3306/your_db_name')
query = """
SELECT * FROM orders
"""
df = pd.read_sql(query, con=engine)
df.head()

Matt Clarke, Wednesday, March 03, 2021

Matt Clarke Matt is an Ecommerce and Marketing Director who uses data science to help in his work. Matt has a Master's degree in Internet Retailing (plus two other Master's degrees in different fields) and specialises in the technical side of ecommerce and marketing.