How to use ORDER BY to sort an SQL result set

The SQL ORDER BY clause is used to sort the results of a SELECT statement so you can place column values in ascending or descending order. Here's how it works.

How to use ORDER BY to sort an SQL result set
Picture by Pixabay, Pexels.
9 minutes to read

When you create an SQL SELECT statement, the data probably won’t be returned or sorted in the order you want, so the ORDER BY clause is used to control the sort order of the results returned by the query.

The ORDER BY clause is really easy to use and can allow you to sort an SQL resultset in ascending or descending order, using one or more columns. Here’s a quick guide to using it.

I’m connecting to a MySQL Docker container into which I’ve imported a MySQL SQL database dump based on the Online Retail Dataset. We’ll be using SQLAlchemy and PyMYSQL to connect to the database using Pandas.

Install and load the packages

Since we’re querying MySQL from within a Jupyter notebook, as many data scientists do in their daily work, you’ll first need to open Jupyter and install the required pymysql and sqlalchemy packages using the Pip Python package manager. Then, you’ll need to import pandas and the create_engine module from sqlalchemy.

!pip3 install pymysql
!pip3 install sqlalchemy
import pandas as pd
from sqlalchemy import create_engine

Create a database connection

Next we’ll construct a database connection string to pass to create_engine(). This tells SQLAlchemy we’re connecting to a MySQL database using the PyMYSQL driver, and are using the username root and the password ***** and are connecting to the MySQL server on port 3306 of IP address 172.17.0.3. The final argument tells create_engine() to connect to the ecommerce database.

engine = create_engine('mysql+pymysql://root:*****@172.17.0.3:3306/ecommerce')

SELECT FROM an SQL table

First we’ll create a simple unordered query using a regular SQL SELECT statement. This orders the data according to whatever index is present on the database, which might not match our preferred formatting.

df = pd.read_sql_query("SELECT * FROM order_items", con=engine)
df.head()
invoice sku description qty order_date unit_price customer_id country line_price
0 536365 85123A WHITE HANGING HEART T-LIGHT HOLDER 6 2010-12-01 08:26:00 2.55 17850.0 United Kingdom 15.30
1 536365 71053 WHITE METAL LANTERN 6 2010-12-01 08:26:00 3.39 17850.0 United Kingdom 20.34
2 536365 84406B CREAM CUPID HEARTS COAT HANGER 8 2010-12-01 08:26:00 2.75 17850.0 United Kingdom 22.00
3 536365 84029G KNITTED UNION FLAG HOT WATER BOTTLE 6 2010-12-01 08:26:00 3.39 17850.0 United Kingdom 20.34
4 536365 84029E RED WOOLLY HOTTIE WHITE HEART. 6 2010-12-01 08:26:00 3.39 17850.0 United Kingdom 20.34

Use ORDER BY to sort by a single column in descending order

To sort the SQL resultset we can append the ORDER BY clause to the end of the SELECT statement and define the column by which we want to sort and the sort order, which can either by DESC for descending order, or ASC for ascending order.

Our query SELECT * FROM order_items ORDER BY qty DESC will sort the resultset by the qty column in descending order, so the highest quantities are shown first and the values decrease down the list.

df = pd.read_sql_query("SELECT * FROM order_items ORDER BY qty DESC", con=engine)
df.head()
invoice sku description qty order_date unit_price customer_id country line_price
0 581483 23843 PAPER CRAFT , LITTLE BIRDIE 80995 2011-12-09 09:15:00 2.08 16446.0 United Kingdom 168469.6
1 541431 23166 MEDIUM CERAMIC TOP STORAGE JAR 74215 2011-01-18 10:01:00 1.04 12346.0 United Kingdom 77183.6
2 578841 84826 ASSTD DESIGN 3D PAPER STICKERS 12540 2011-11-25 15:57:00 0.00 13256.0 United Kingdom 0.0
3 542504 37413 None 5568 2011-01-28 12:03:00 0.00 NaN United Kingdom 0.0
4 573008 84077 WORLD WAR 2 GLIDERS ASSTD DESIGNS 4800 2011-10-27 12:26:00 0.21 12901.0 United Kingdom 1008.0

Use ORDER BY to sort by a single column in ascending order

To sort items in ascending order, you simply replace the DESC argument with ASC. Our query below will sort items from the lowest unit_price to the highest.

df = pd.read_sql_query("SELECT * FROM order_items ORDER BY unit_price ASC", con=engine)
df.head()
invoice sku description qty order_date unit_price customer_id country line_price
0 A563186 B Adjust bad debt 1 2011-08-12 14:51:00 -11062.06 NaN United Kingdom -11062.06
1 A563187 B Adjust bad debt 1 2011-08-12 14:52:00 -11062.06 NaN United Kingdom -11062.06
2 536414 22139 None 56 2010-12-01 11:52:00 0.00 NaN United Kingdom 0.00
3 536545 21134 None 1 2010-12-01 14:32:00 0.00 NaN United Kingdom 0.00
4 536546 22145 None 1 2010-12-01 14:33:00 0.00 NaN United Kingdom 0.00

Using ORDER BY to sort a column in alphabetical order

The ORDER BY column_name ASC statement also works for non-numeric or categorical data, such as strings, allowing you to put them in alphabetical order from A to Z, or vice versa if you use DESC.

df = pd.read_sql_query("SELECT * FROM order_items ORDER BY description ASC", con=engine)
df.head()
invoice sku description qty order_date unit_price customer_id country line_price
0 536414 22139 None 56 2010-12-01 11:52:00 0.0 NaN United Kingdom 0.0
1 536545 21134 None 1 2010-12-01 14:32:00 0.0 NaN United Kingdom 0.0
2 536546 22145 None 1 2010-12-01 14:33:00 0.0 NaN United Kingdom 0.0
3 536547 37509 None 1 2010-12-01 14:33:00 0.0 NaN United Kingdom 0.0
4 536549 85226A None 1 2010-12-01 14:34:00 0.0 NaN United Kingdom 0.0

Using ORDER BY to sort by multiple columns

Finally, you can also sort by multiple columns by passing them as a list of comma separated values. The below query will sort the resultset first in ascending order of the description name (i.e. in alphabetical order) and then in ascending order of the order_date, so the earliest purchases are first.

query = """
SELECT 
    * FROM order_items 
ORDER BY 
    description ASC,
    order_date ASC
"""
df = pd.read_sql_query(query, con=engine)
df.head()
invoice sku description qty order_date unit_price customer_id country line_price
0 536414 22139 None 56 2010-12-01 11:52:00 0.0 NaN United Kingdom 0.0
1 536545 21134 None 1 2010-12-01 14:32:00 0.0 NaN United Kingdom 0.0
2 536546 22145 None 1 2010-12-01 14:33:00 0.0 NaN United Kingdom 0.0
3 536547 37509 None 1 2010-12-01 14:33:00 0.0 NaN United Kingdom 0.0
4 536549 85226A None 1 2010-12-01 14:34:00 0.0 NaN United Kingdom 0.0

Matt Clarke, Thursday, April 14, 2022

Matt Clarke Matt is an Ecommerce and Marketing Director who uses data science to help in his work. Matt has a Master's degree in Internet Retailing (plus two other Master's degrees in different fields) and specialises in the technical side of ecommerce and marketing.