SQLite is a relational database management system (RDBMS) that is easy to access within Python and other languages. Unlike MySQL, PostgreSQL, and other databases, SQLite uses a serverless design, so doesn’t carry the administrative overhead of larger systems.
While it’s not designed for massive databases, it is ideal for smaller projects. It may lack the fuller features of the larger databases, but it’s ideal for many projects, and uses a simple syntax that is easy to learn. Here are the basics you need to know to get started with SQLite in Python.
For this project we’re going to use pandas
for displaying and manipulating data and the sqlite3
package. This is pre-installed in Python, so all you need to do is import the package.
import pandas as pd
import sqlite3
To create a connection to SQLite you use the connect()
function and pass in the name of the SQLite database you want to connect to. If the database doesn’t already exist, SQLite will create it for you. We’ll create a database called alerts.db
in which to store our tables.
connection = sqlite3.connect('alerts.db')
Next, we create a “cursor” connection to the SQLite database using cursor()
. We’ll use this connection in the subsequent functions.
cursor = connection.cursor()
To store our data we’ll need to create a table on the alerts.db
database. This can be created by constructing a CREATE TABLE
command and running it using cursor.execute()
. We’ll create three columns alert_date
which is set to a DATE
type, and alert_metric
and alert_segment
which are both going to hold TEXT
data. If you don’t specify an ID column, SQLite with add a rowid
column which increments automatically.
cursor.execute("""
CREATE TABLE IF NOT EXISTS logs(
alert_date DATE,
alert_metric TEXT,
alert_segment TEXT);
""")
connection.commit()
To insert data into SQLite we can create an INSERT
statement and then use cursor.execute()
and connection.commit()
to run it. Here’s a simple example. However, obviously we need something more dynamic that allows us to bind parameters into the query.
cursor.execute("""
INSERT INTO logs
(alert_date, alert_metric, alert_segment)
VALUES ('2021-01-05', 'sessions', 'Organic')
""")
connection.commit()
Instead of creating a custom INSERT
statement every time we want to add data to our table we can use bound parameters. Here add ?
characters to the VALUES
part of the query, and then pass in a list
containing the values in the correct order. By passing in the name of the list as the second argument on our execute()
function, SQLite will automatically replace each ?
with the corresponding value from the list.
log = ['2021-01-05', 'transactions', 'Social']
cursor.execute("""
INSERT INTO logs
(alert_date, alert_metric, alert_segment)
VALUES (?, ?, ?)
""", log)
connection.commit()
If you have multiple INSERT
statements to perform, they can all be added to a single list and inserted as a bulk insert by substituting the execute()
function for executemany()
. This is much quicker if you have numerous rows to insert.
logs = [
['2021-01-05', 'transactions', 'Social'],
['2021-01-05', 'sessions', 'Organic'],
['2021-01-05', 'bounceRate', 'Paid search']
]
cursor.executemany("""
INSERT INTO logs
(alert_date, alert_metric, alert_segment)
VALUES (?, ?, ?)
""", logs)
connection.commit()
SELECT
statements in SQLite are much the same as those in other languages, however, there are three different functions you can use for returning results. The fetchall()
function returns all of the results from a query, placing each row within a tuple, and wrapping everything in a list.
cursor.execute("""
SELECT
rowid,
alert_date,
alert_metric,
alert_segment
FROM logs
""")
results = cursor.fetchall()
results
[(1, '2021-01-05', 'sessions', 'Organic'),
(2, '2021-01-05', 'transactions', 'Social'),
(3, '2021-01-05', 'transactions', 'Social'),
(4, '2021-01-05', 'sessions', 'Organic'),
(5, '2021-01-05', 'bounceRate', 'Paid search'),
(6, '2021-01-05', 'sessions', 'Organic'),
(7, '2021-01-05', 'transactions', 'Social'),
(8, '2021-01-05', 'transactions', 'Social'),
(9, '2021-01-05', 'sessions', 'Organic'),
(10, '2021-01-05', 'bounceRate', 'Paid search'),
(11, '2021-01-05', 'sessions', 'Organic'),
(12, '2021-01-05', 'transactions', 'Social'),
(13, '2021-01-05', 'transactions', 'Social'),
(14, '2021-01-05', 'sessions', 'Organic'),
(15, '2021-01-05', 'bounceRate', 'Paid search'),
(16, '2021-01-05', 'sessions', 'Organic'),
(17, '2021-01-05', 'transactions', 'Social'),
(18, '2021-01-05', 'transactions', 'Social'),
(19, '2021-01-05', 'sessions', 'Organic'),
(20, '2021-01-05', 'bounceRate', 'Paid search'),
(21, '2021-01-05', 'sessions', 'Organic'),
(22, '2021-01-05', 'transactions', 'Social'),
(23, '2021-01-05', 'transactions', 'Social'),
(24, '2021-01-05', 'sessions', 'Organic'),
(25, '2021-01-05', 'bounceRate', 'Paid search'),
(26, '2021-01-05', 'sessions', 'Organic'),
(27, '2021-01-05', 'transactions', 'Social'),
(28, '2021-01-05', 'transactions', 'Social'),
(29, '2021-01-05', 'sessions', 'Organic'),
(30, '2021-01-05', 'bounceRate', 'Paid search'),
(31, '2021-01-05', 'sessions', 'Organic'),
(32, '2021-01-05', 'sessions', 'Organic'),
(33, '2021-01-05', 'transactions', 'Social'),
(34, '2021-01-05', 'transactions', 'Social'),
(35, '2021-01-05', 'sessions', 'Organic'),
(36, '2021-01-05', 'bounceRate', 'Paid search')]
The fetchone()
function fetches the most recent row for the query.
cursor.execute("""
SELECT
rowid,
alert_date,
alert_metric,
alert_segment
FROM logs
""")
results = cursor.fetchone()
results
(1, '2021-01-05', 'sessions', 'Organic')
The fetchmany()
function returns a number of rows, based on the integer argument passed in to the function.
cursor.execute("""
SELECT
rowid,
alert_date,
alert_metric,
alert_segment
FROM logs
""")
results = cursor.fetchmany(3)
results
[(1, '2021-01-05', 'sessions', 'Organic'),
(2, '2021-01-05', 'transactions', 'Social'),
(3, '2021-01-05', 'transactions', 'Social')]
One thing you’re likely to want to do when working with SQLite is save the output of your query to a Pandas dataframe. There are a couple of ways to do this. You can use SQLAlchemy, or you can just use regular SQLite and the Pandas from_records()
function. First, use execute()
to run your SELECT
statement, then use [column[0] for column in query.description]
to return a list of all of the column names used in the query.
query = cursor.execute("""
SELECT
rowid,
alert_date,
alert_metric,
alert_segment
FROM logs
""")
columns = [column[0] for column in query.description]
columns
['rowid', 'alert_date', 'alert_metric', 'alert_segment']
Finally, you can create a Pandas dataframe using pd.DataFrame
and use from_records()
to take the contents of your list and insert it into the dataframe. The first argument, data
, can take the query results from fetchall()
(or another of the fetching functions), and the columns
argument takes the list containing the column names.
df = pd.DataFrame.from_records(data=query.fetchall(), columns=columns)
df.head()
rowid | alert_date | alert_metric | alert_segment | |
---|---|---|---|---|
0 | 1 | 2021-01-05 | sessions | Organic |
1 | 2 | 2021-01-05 | transactions | Social |
2 | 3 | 2021-01-05 | transactions | Social |
3 | 4 | 2021-01-05 | sessions | Organic |
4 | 5 | 2021-01-05 | bounceRate | Paid search |
The process for updating records is much the same. We create our SQL statement, then use execute()
to execute it, and then commit()
to finalise things. Here, we’re re-setting the alert_segment
parameter to Direct
based on the rowid = 1
.
cursor.execute("""
UPDATE logs
SET alert_segment = 'Direct'
WHERE rowid = 1
""")
connection.commit()
cursor.execute("""
SELECT
rowid,
alert_date,
alert_metric,
alert_segment
FROM logs WHERE rowid = 1
""")
results = cursor.fetchone()
results
(1, '2021-01-05', 'sessions', 'Direct')
To delete a record from a SQLite table you construct a DELETE
statement in the usual way and then run it through execute()
and then use commit()
.
cursor.execute("""
DELETE FROM logs
WHERE rowid = 1
""")
connection.commit()
To truncate a table in SQLite, and delete its entire contents, you use DELETE FROM tablename
syntax, rather than the TRUNCATE TABLE tablename
approached used in other SQL databases. Simply issue this command with execute()
and then commit()
and the contents will be irreversibly deleted.
cursor.execute("""
DELETE FROM logs;
""")
connection.commit()
Matt Clarke, Monday, March 08, 2021