Many MySQL databases are configured to accept connections from other servers on the local network and will reject connections from remote machines. Ordinarily, you could work around this by creating an SSH tunnel connection to login to the machine and then open a database connection and map the database port to the one on your local machine. But how can you do that from a Python script?
Well, thanks to the Python packages
pymysql you can create an SSH connection inside your Python script and connect to a remote MySQL server securely and return your query results in a Pandas dataframe. Here’s how it’s done.
Open a Python document or Jupyter notebook and load the
sshtunnel packages, including the
SSHTunnelForwarder module from
sshtunnel. You can install these using
pip3 install package-name if they’re not already on your machine.
import pandas as pd import pymysql import logging import sshtunnel from sshtunnel import SSHTunnelForwarder
Next, create some configuration variables to hold your credentials. You’ll need the IP address of your SSH server, plus the username and password. You’ll also need the MySQL username, password, and database name.
ssh_host = '188.8.131.52' ssh_username = 'bob' ssh_password = 'SecretPassword' database_username = 'bob_smith' database_password = 'SecretDBPassword' database_name = 'ecommerce' localhost = '127.0.0.1'
To make the final code cleaner and allow us to re-use it, we’ll create some functions to handle each task. Our first function uses the
SSHTunnelForwarder module to create an SSH tunnel and port forwards it to a port on the local machine, then it starts the connection. You can use the
verbose=True flag to turn on debugging to check everything is working.
def open_ssh_tunnel(verbose=False): """Open an SSH tunnel and connect using a username and password. :param verbose: Set to True to show logging :return tunnel: Global SSH tunnel connection """ if verbose: sshtunnel.DEFAULT_LOGLEVEL = logging.DEBUG global tunnel tunnel = SSHTunnelForwarder( (ssh_host, 22), ssh_username = ssh_username, ssh_password = ssh_password, remote_bind_address = ('127.0.0.1', 3306) ) tunnel.start()
The next function uses PyMySQL to create a connection to the MySQL server. This will be run after the SSH tunnel has
been opened. The MySQL port will be mapped to the
local_bind_port from the
tunnel object and stored in the global
def mysql_connect(): """Connect to a MySQL server using the SSH tunnel connection :return connection: Global MySQL database connection """ global connection connection = pymysql.connect( host='127.0.0.1', user=database_username, passwd=database_password, db=database_name, port=tunnel.local_bind_port )
To run a database query, we can create a wrapper to the Pandas
read_sql_query() function. This takes a SQL query
and the global
connection and returns the MySQL result set in a Pandas dataframe.
def run_query(sql): """Runs a given SQL query via the global database connection. :param sql: MySQL query :return: Pandas dataframe containing results """ return pd.read_sql_query(sql, connection)
Finally, we need a couple of functions to tidy up when we’re done. The first one disconnects from the MySQL server by using the
close() function, while the second one closes the SSH tunnel.
def mysql_disconnect(): """Closes the MySQL database connection. """ connection.close()
def close_ssh_tunnel(): """Closes the SSH tunnel connection. """ tunnel.close
Finally, we can run the commands in the following order. First, we’ll connect to the remote server via SSH and create an SSH tunnel, then we’ll connect to the MySQL database on the remote server using the tunnel, then we’ll run the query and return a Pandas dataframe containing the result.
When that’s run, we’ll then disconnect from MySQL and close the SSH tunnel. You can also use similar code to insert data into a MySQL database using PyMySQL.
open_ssh_tunnel() mysql_connect() df = run_query("SELECT * FROM orders ORDER BY id DESC LIMIT 100") df.head()
Matt Clarke, Thursday, March 04, 2021