How to identify and count unique values in Pandas

Learn how to use the Pandas unique() and nunique() methods to identify and count unique values in a Pandas DataFrame.

Picture by Tim Gouw, Pexels.

When working with Pandas, you’ll often need to identify and count the unique values in a DataFrame column. This is a common task in data science, and Pandas provides two methods for it: unique(), which returns the distinct values, and nunique(), which returns how many distinct values there are. In this quick tutorial, you’ll learn how to use both methods on a Pandas DataFrame.

Import the packages

To get started, open a Jupyter notebook and import the Pandas package.

import pandas as pd

Create a dataframe containing some duplicate values

Next, either load the data you want to examine into a Pandas dataframe, or create a new dataframe containing some duplicate values.

data = [{'species': 'Esox lucius', 'length': 120, 'weight': 8.1, 'age': 3}, 
        {'species': 'Esox lucius', 'length': 100, 'weight': 7.7, 'age': 2},
        {'species': 'Esox lucius', 'length': 110, 'weight': 7.9, 'age': 2},
        {'species': 'Cyprinus carpio', 'length': 56, 'weight': 8.3, 'age': 13},
        {'species': 'Cyprinus carpio', 'length': 36, 'weight': 7.9, 'age': 23},
        {'species': 'Cyprinus carpio', 'length': 46, 'weight': 8.1, 'age': 13},
        {'species': 'Salmo trutta', 'length': 40, 'weight': 7.5, 'age': 5},
        {'species': 'Salmo trutta', 'length': 38, 'weight': 7.4, 'age': 4},
        {'species': 'Oncorhynchus mykiss', 'length': 42, 'weight': 7.6, 'age': 5},
        {'species': 'Salmo salar', 'length': 44, 'weight': 7.7, 'age': 5}]

df = pd.DataFrame(data)
df
               species  length  weight  age
0          Esox lucius     120     8.1    3
1          Esox lucius     100     7.7    2
2          Esox lucius     110     7.9    2
3      Cyprinus carpio      56     8.3   13
4      Cyprinus carpio      36     7.9   23
5      Cyprinus carpio      46     8.1   13
6         Salmo trutta      40     7.5    5
7         Salmo trutta      38     7.4    4
8  Oncorhynchus mykiss      42     7.6    5
9          Salmo salar      44     7.7    5
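
If the data lives in a CSV file rather than a list of dictionaries, pd.read_csv() builds the same kind of dataframe. The following is a minimal sketch, assuming a file with the same four columns. An in-memory string stands in for the file so the example runs on its own, and the names csv_text and df_csv are only illustrative.

```python
import io

import pandas as pd

# Illustrative data only: a few rows in CSV form, standing in for a real file
csv_text = """species,length,weight,age
Esox lucius,120,8.1,3
Esox lucius,100,7.7,2
Salmo trutta,40,7.5,5
"""

# With a real file you would pass its path instead of the StringIO object
df_csv = pd.read_csv(io.StringIO(csv_text))
print(df_csv)
```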

Select unique values from a specific column

To select the unique values from a specific column in a Pandas dataframe you can use the unique() method. You call it on the selected column, e.g. df['column_name'].unique(), and it returns the unique values as a NumPy array (not a Python list), in the order in which they first appear.

# Select unique values from the species column
df['species'].unique()
array(['Esox lucius', 'Cyprinus carpio', 'Salmo trutta',
       'Oncorhynchus mykiss', 'Salmo salar'], dtype=object)
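
Because unique() returns an array rather than a list, you may want to convert or sort the result before using it elsewhere. The sketch below uses a small stand-in for the species column (the variable names are only illustrative) and shows one way to get a plain Python list and an alphabetically sorted version.

```python
import pandas as pd

# Illustrative data only: a small Series standing in for df['species']
species = pd.Series(['Esox lucius', 'Esox lucius', 'Cyprinus carpio',
                     'Salmo trutta', 'Cyprinus carpio'])

# unique() keeps the order in which values first appear
unique_species = species.unique()

# Convert to a plain Python list, then sort alphabetically if needed
species_list = list(unique_species)
species_sorted = sorted(species_list)

print(species_list)
print(species_sorted)
```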

Count the number of unique values in a specific column

To count the number of unique values in a specific column in a Pandas dataframe you can use the nunique() method. As with unique(), you call it on the selected column, e.g. df['column_name'].nunique(), and it returns an integer representing the number of unique values. By default, missing values (NaN) are not counted.

# Count the number of unique values in the species column
df['species'].nunique()
5
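
nunique() can also be called on the whole dataframe, in which case it returns one count per column. The sketch below uses a cut-down, made-up version of the fish data with one missing age (the name df_small is only illustrative) to show the per-column counts and the effect of the dropna argument.

```python
import numpy as np
import pandas as pd

# Illustrative data only: a cut-down dataframe with one missing age
df_small = pd.DataFrame({
    'species': ['Esox lucius', 'Esox lucius', 'Salmo trutta', 'Salmo salar'],
    'age': [3, 2, np.nan, 2],
})

# One unique count per column; NaN is excluded by default
per_column = df_small.nunique()

# Pass dropna=False to count NaN as a value in its own right
age_with_nan = df_small['age'].nunique(dropna=False)

print(per_column)
print(age_with_nan)
```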

Matt Clarke, Saturday, November 12, 2022

Matt Clarke is an Ecommerce and Marketing Director who uses data science to help in his work. Matt has a Master's degree in Internet Retailing (plus two other Master's degrees in different fields) and specialises in the technical side of ecommerce and marketing.