How to import a MySQL database

Learn how to import a MySQL database onto a MySQL Docker container and query the database using Python and PyMySQL.

How to import a MySQL database
Picture by Brett Sayles, Pexels.
6 minutes to read

SQL is one of the most widely used languages in data science so it’s important to know at least the basics required in order to fetch the data you’ll need to use in analyses or models.

In this example, I’ll show you how you can import a MySQL database into a MySQL Docker container and create a connection that allows you to start learning the SQL queries you’ll need to know for basic data science projects.

Login to MySQL

For this tutorial, I’ll assume you already have MySQL installed and setup. If you’ve not done this yet, please follow my simple step-by-step guide to getting MySQL up and running on Docker in under five minutes. It’s really easy to follow.

Use the docker exec -it docker_mysql bash command to open up the Bash shell on your Docker machine in your terminal, where docker_mysql is the name you gave your container. Once you’ve got your shell, enter mysql -uroot -p and then login to MySQL using the password you created when you set it up.

docker exec -it docker_mysql bash
mysql -uroot -p

Create an empty MySQL database

Next, in your MySQL terminal enter CREATE DATABASE ecommerce; and hit enter. This will create a new MySQL database called ecommerce into which we can now import our database. If the command works, you’ll get a message back saying Query OK, 1 row affected (0.00 sec), which means you now have an empty database.

CREATE DATABASE ecommerce;

Download a database import file

You can create your own database tables if you wish, but a more common approach is to import an existing SQL database from an SQL database dump file. I’ve created a simple SQL database from the Online Retail Dataset that is ideal for learning SQL queries, so we’ll import that next.

First, download the file from my GitHub repository, then unzip it. You’ll find inside a file called ecommerce.sql that includes the required queries to create the SQL database tables and insert some test data. Use the cd command to change the directory to Downloads (or wherever you saved the file to) and use ls to list the files to check that ecommerce.sql is present.

cd Downloads
ls

Import the SQL dump into your MySQL database

Next we’ll use the docker exec command to run a command from our terminal on the docker_mysql container. We’ll login to docker_mysql, then login to mysql with our password, tell it to use the ecommerce database and then tell it to import the ecommerce.sql file. After a few seconds of crunching the data, you should see a message telling you the import was successful.

docker exec -i docker_mysql mysql -uroot -pPASSWORD ecommerce < ecommerce.sql

Access the new database

To access the database we need to know it’s IP address so we can configure a database connection. Enter the command docker inspect docker_mysql in your terminal and Docker will return a huge amount of variables on your Docker container and environment. The one you need to find is called IPAddress and is located near the bottom of the file. My container has the IP address 172.17.0.3. Find yours and make a note of it.

docker inspect docker_mysql                    "Gateway": "172.17.0.1",
                    "IPAddress": "172.17.0.3",
                    "IPPrefixLen": 16,
                    "IPv6Gateway": "",
                    "GlobalIPv6Address": "",
                    "GlobalIPv6PrefixLen": 0,
                    "MacAddress": "02:42:ac:11:00:03",
                    "DriverOpts": null

Open a Jupyter notebook

Finally, open a Jupyter notebook and install the sqlalchemy and pymysql packages using the Pip package manager Then import pandas and the create_engine module from sqlalchemy. These will be used to connect to the MySQL server running on your Docker container.

!pip3 install sqlalchemy
!pip3 install pymysql
import pandas as pd
from sqlalchemy import create_engine

Next we’ll create a connection to the MySQL Docker using create_engine(). We’ll need to tell it that the database is mysql and we want to use pymysql for the connection, then we need to provide the username and password as root:PASSWORD where PASSWORD is the password you created. That’s followed by the IP address we obtained above, the standard port 3306 MySQL uses, and finally the name of the database ecommerce.

engine = create_engine('mysql+pymysql://root:********@172.17.0.3:3306/ecommerce')

You can then use the Pandas read_sql() function to run an SQL query on the database using the connection above that we’ve stored in engine.

df = pd.read_sql("SELECT * FROM orders", con=engine)
df.head()
invoice revenue skus items
0 536365 139.12 7 40
1 536366 22.20 2 12
2 536367 278.73 12 83
3 536368 70.05 4 15
4 536369 17.85 1 3

Matt Clarke, Tuesday, April 12, 2022

Matt Clarke Matt is an Ecommerce and Marketing Director who uses data science to help in his work. Matt has a Master's degree in Internet Retailing (plus two other Master's degrees in different fields) and specialises in the technical side of ecommerce and marketing.