How to use lists in Python

Picture by Katerina Holmes, Pexels.

Lists are one of the most widely used data types in Python and are used throughout data science packages. Along with the dictionary, tuple, and set, the list allows you to store data and retrieve it quickly using a variety of techniques. Here are the basics you need to know.

Creating a list

Creating a list in Python is very simple. You choose a variable name, such as companies below, then assign it a list of comma-separated values within a pair of square brackets. String values need to be enclosed in double or single quotes.

companies = ['Hooli', 'Pied Piper', 'Aviato']
print(companies)

['Hooli', 'Pied Piper', 'Aviato']

Numeric values, such as int or float values, don’t require any quote encapsulation. You can include as many values as you like, and are free to repeat values and mix types if you wish.

numbers = [2, 6, 56, 4, 2.99, 53]
print(numbers)

[2, 6, 56, 4, 2.99, 53]
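As a small illustration of mixing types, the sketch below builds a list holding a string, an integer, a float, and a boolean. The variable names and values are made up for this example.

```python
# Hypothetical example values, not taken from the lists above
mixed = ['Hooli', 42, 2.99, True]

# Each element keeps its own type inside the list
types = [type(value).__name__ for value in mixed]
print(types)  # ['str', 'int', 'float', 'bool']

# An empty list is just a pair of square brackets
empty = []
print(len(empty))  # 0
```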

Adding an item to a list

To add an item to the end of an existing list, call the append() method on the list, i.e. companies.append(), and pass the value you wish to add. companies.append('Sliceline') will add the string “Sliceline” to our companies list.

companies.append('Sliceline')
print(companies)

['Hooli', 'Pied Piper', 'Aviato', 'Sliceline']
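One thing worth knowing is that append() always adds exactly one item. The sketch below shows that passing a list to append() nests it as a single element rather than adding its contents. The extra company names are invented for illustration.

```python
companies = ['Hooli', 'Pied Piper', 'Aviato']

# append() changes the list in place and returns None
result = companies.append('Sliceline')
print(result)         # None
print(companies[-1])  # Sliceline

# Appending a list adds it as one nested item (names are hypothetical)
companies.append(['Raviga', 'Endframe'])
print(len(companies))  # 5
print(companies[-1])   # ['Raviga', 'Endframe']
```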

Removing an item from a list by name

The process for removing an item from a list is much the same. You just need to replace append() with remove(), and the first item matching the value you pass will be removed from the list. If the value isn’t in the list, remove() raises a ValueError.

companies.remove('Sliceline')
companies

['Hooli', 'Pied Piper', 'Aviato']
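Here is a minimal sketch of how remove() behaves with duplicates and with values that are missing. The list contents are illustrative only.

```python
names = ['Bob', 'Mike', 'Bob']

# Only the first matching item is removed
names.remove('Bob')
print(names)  # ['Mike', 'Bob']

# Removing a value that isn't present raises ValueError
try:
    names.remove('Phil')
    removed = True
except ValueError:
    removed = False
print(removed)  # False
```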

Removing a list of items from another list

If you want to remove a list of items from another list, you can create a for loop that iterates over the items to remove, then run remove() on each one that is present in the target. The little function below can then be used to remove the list ['Erlich', 'Jian Yang'] assigned to remove_list from the target_list. Note that it modifies target_list in place and returns that same list.

def remove_from_list(target_list, remove_list):
    for item in remove_list:
        if item in target_list:
            target_list.remove(item)
    
    return target_list


target_list = ['Richard', 'Gilfoyle', 'Dinesh', 'Jared', 'Erlich', 'Jian Yang']
remove_list = ['Erlich', 'Jian Yang']
new_list = remove_from_list(target_list, remove_list)
new_list

['Richard', 'Gilfoyle', 'Dinesh', 'Jared']
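If you would rather leave the original list untouched, one alternative (not the function above, just a sketch) is a list comprehension that builds a new list. The name remaining is made up for this example.

```python
target_list = ['Richard', 'Gilfoyle', 'Dinesh', 'Jared', 'Erlich', 'Jian Yang']
remove_list = ['Erlich', 'Jian Yang']

# Build a new list containing only the items we want to keep
remaining = [name for name in target_list if name not in remove_list]
print(remaining)         # ['Richard', 'Gilfoyle', 'Dinesh', 'Jared']

# The original list is unchanged
print(len(target_list))  # 6
```

Unlike repeated calls to remove(), this also drops every occurrence of a repeated value, not just the first.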

Merging or concatenating lists

To merge or concatenate two or more lists together, you can simply use the + operator. Here we’ll create a new list called technologies and will create it by concatenating the languages and databases lists.

languages = ['Python', 'PHP', 'JavaScript']
databases = ['MySQL', 'PostgreSQL', 'SQL Server']
technologies = languages + databases
technologies

['Python', 'PHP', 'JavaScript', 'MySQL', 'PostgreSQL', 'SQL Server']
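The + operator builds a brand new list and leaves its inputs alone. The short sketch below uses hypothetical variable names to show this, and that you can chain more than two lists.

```python
first = ['Python', 'PHP']
second = ['MySQL']
third = ['Pandas']

# Chaining + joins any number of lists into a new one
combined = first + second + third
print(combined)  # ['Python', 'PHP', 'MySQL', 'Pandas']

# The original lists are not modified
print(first)     # ['Python', 'PHP']
```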

Extending a list

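The other way to join lists together is to use the extend() method. Unlike the + operator, extend() modifies the existing list in place rather than creating a new one. Below we’re passing the visualisation_packages list as an argument to extend() and using it to add its items onto the end of the existing packages list.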

packages = ['Pandas', 'NumPy']
visualisation_packages = ['Matplotlib', 'Seaborn']
packages.extend(visualisation_packages)
packages

['Pandas', 'NumPy', 'Matplotlib', 'Seaborn']
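Because extend() works in place, it returns None, so you shouldn't assign its result back to a variable. The sketch below also shows that extend() accepts any iterable, such as a tuple. 'SciPy' is just an example value.

```python
packages = ['Pandas', 'NumPy']

# extend() modifies packages and returns None
result = packages.extend(['Matplotlib', 'Seaborn'])
print(result)    # None
print(packages)  # ['Pandas', 'NumPy', 'Matplotlib', 'Seaborn']

# Any iterable works, for example a tuple
packages.extend(('SciPy',))
print(packages[-1])  # SciPy
```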

Iterating over a list

To iterate over a list, either to print the values, or perform some kind of function or calculation on the contents, you can use a for loop. Below we’re iterating over the technologies list with a for loop and then printing the value assigned to technology.

for technology in technologies: 
    print(technology)

Python
PHP
JavaScript
MySQL
PostgreSQL
SQL Server
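If you also need the position of each item while looping, the built-in enumerate() function gives you both. This is a small sketch using a shortened, illustrative list.

```python
technologies = ['Python', 'PHP', 'MySQL']

# enumerate() yields (position, value) pairs, starting at 0
pairs = []
for position, technology in enumerate(technologies):
    pairs.append((position, technology))
    print(position, technology)

# Printed output:
# 0 Python
# 1 PHP
# 2 MySQL
```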

Sorting a list

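To sort a list you can use the sorted() function, which returns a new sorted list and leaves the original unchanged. By default, this places strings in alphabetical order (uppercase letters sort before lowercase ones) and numbers in ascending order.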

sorted_technologies = sorted(technologies)
sorted_technologies

['JavaScript', 'MySQL', 'PHP', 'PostgreSQL', 'Python', 'SQL Server']
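sorted() also takes optional reverse and key arguments, and lists have their own sort() method that sorts in place. The sketch below uses a made-up list with one lowercase entry to show how case affects the default order.

```python
technologies = ['Python', 'PHP', 'mysql']

# Default order: uppercase letters come before lowercase ones
print(sorted(technologies))                 # ['PHP', 'Python', 'mysql']

# key=str.lower compares items case-insensitively
print(sorted(technologies, key=str.lower))  # ['mysql', 'PHP', 'Python']

# reverse=True gives descending order
print(sorted(technologies, reverse=True))   # ['mysql', 'Python', 'PHP']

# sort() changes the list itself and returns None
result = technologies.sort()
print(result)        # None
print(technologies)  # ['PHP', 'Python', 'mysql']
```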

Reversing a list

To reverse the order of a list you can use the reverse() method, which reverses the list in place. Applying reverse() to our list containing ['A', 'B', 'C'] will flip the order of the items so they become ['C', 'B', 'A'].

letters = ['A', 'B', 'C']
letters.reverse()
letters

['C', 'B', 'A']
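If you want a reversed copy without changing the original list, two common options are slicing with a step of -1 and the built-in reversed() function. The variable names below are illustrative.

```python
letters = ['A', 'B', 'C']

# A slice with step -1 returns a new, reversed list
flipped = letters[::-1]

# reversed() returns an iterator, so wrap it in list()
also_flipped = list(reversed(letters))

print(flipped)       # ['C', 'B', 'A']
print(also_flipped)  # ['C', 'B', 'A']
print(letters)       # ['A', 'B', 'C']
```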

Checking to see if an item is in a list or not

To check whether a given value is present in a list or not you can use an if statement with an in clause or a not in clause.

if 'Python' in technologies:
    print("Python found!")

Python found!

if 'Go' not in technologies:
    print("Go not found!")

Go not found!
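The in and not in operators return a plain True or False, and string comparisons are case-sensitive. The sketch below shows one possible way to do a case-insensitive check, assuming every item in the list is a string.

```python
technologies = ['Python', 'PHP', 'MySQL']

exact = 'Python' in technologies
wrong_case = 'python' in technologies
print(exact)       # True
print(wrong_case)  # False

# Lowercase every item first for a case-insensitive check
lowered = [technology.lower() for technology in technologies]
print('python' in lowered)  # True
```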

Calculating the length of a list

To find the length of a list, or the total number of items present, you can use len() and pass the list variable as the argument.

names = ['Bob', 'Mike', 'Phil', 'Bob']
len(names)

4

Counting the number of times an item occurs in a list

To find the number of times a given item occurs in a list, you can use count() and pass the item value as the argument. Below names.count('Bob') returns 2 because Bob is present twice.

names = ['Bob', 'Mike', 'Phil', 'Bob']
names.count('Bob')

2
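As a quick sketch of how counting and length relate, the example below also counts a value that isn't present, counts the unique values with set(), and tallies every item at once with collections.Counter from the standard library. 'Alice' is a made-up value.

```python
from collections import Counter

names = ['Bob', 'Mike', 'Phil', 'Bob']

print(len(names))            # 4 items in total
print(names.count('Bob'))    # 2
print(names.count('Alice'))  # 0, no error for a missing value

# Number of unique values
unique_total = len(set(names))
print(unique_total)          # 3

# Counter tallies every item in one pass
counts = Counter(names)
print(counts['Bob'])         # 2
```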

Duplicating or copying a list

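To copy or duplicate a list you can call its copy() method. This returns a new list (a shallow copy), so adding or removing items in the copy doesn’t affect the original.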

names_again = names.copy()
names_again

['Bob', 'Mike', 'Phil', 'Bob']
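The difference between copying and plain assignment is easy to miss. In the sketch below, alias and duplicate are hypothetical names: the first is just another name for the same list, while the second is an independent copy.

```python
names = ['Bob', 'Mike']

alias = names             # same list object, no copy made
duplicate = names.copy()  # new list with the same items

names.append('Phil')

print(alias)      # ['Bob', 'Mike', 'Phil']
print(duplicate)  # ['Bob', 'Mike']
```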

Clearing or emptying a list

To clear or empty a list you can call its clear() method. This removes every item from the list in place, leaving an empty list; clear() itself returns None.

names_again.clear()
names_again

[]
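Emptying a list with clear() is not quite the same as assigning a new empty list to the variable. The sketch below, using made-up variable names, shows the difference when a second name refers to the same list.

```python
# clear() empties the one shared list object
names = ['Bob', 'Mike']
alias = names
names.clear()
print(alias)  # []

# Reassigning only points the name at a new list
others = ['Bob', 'Mike']
other_alias = others
others = []
print(other_alias)  # ['Bob', 'Mike']
```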

Finding the index of a list item

Although you can’t see it when you view a list, each item has a numeric index. As in most programming languages, indices in Python start at zero, so the first item in the list below, “MySQL”, occupies index 0, not index 1. You can return the index of the first occurrence of a given value using index(), which raises a ValueError if the value isn’t in the list.

databases = ['MySQL', 'PostgreSQL', 'SQL Server']
index = databases.index('PostgreSQL')
index

1
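Since index() raises an error for a missing value, one simple approach (a sketch, not the only option) is to check with in first. 'Oracle' is a made-up value that isn't in the list.

```python
databases = ['MySQL', 'PostgreSQL', 'SQL Server']

found = databases.index('PostgreSQL')
print(found)  # 1

# Guard against ValueError by checking membership first
value = 'Oracle'
if value in databases:
    position = databases.index(value)
else:
    position = None
print(position)  # None
```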

Accessing a list item via its index

Once you’ve got the index for a list item, you can then access it directly by passing the index value to the list name in square brackets. So databases[1] will return “PostgreSQL”, as this occupies index 1 in our list.

databases[1]

'PostgreSQL'
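Indexing also supports negative numbers, which count back from the end of the list, and slices, which return a new list containing a range of items. The sketch below shows both, plus the IndexError you get when an index is out of range.

```python
databases = ['MySQL', 'PostgreSQL', 'SQL Server']

print(databases[0])    # MySQL
print(databases[-1])   # SQL Server (last item)

# A slice returns items from index 0 up to, but not including, index 2
first_two = databases[0:2]
print(first_two)       # ['MySQL', 'PostgreSQL']

# An index past the end of the list raises IndexError
try:
    databases[3]
    in_range = True
except IndexError:
    in_range = False
print(in_range)        # False
```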

Matt Clarke, Monday, March 08, 2021

Matt Clarke Matt is an Ecommerce and Marketing Director who uses data science to help in his work. Matt has a Master's degree in Internet Retailing (plus two other Master's degrees in different fields) and specialises in the technical side of ecommerce and marketing.