How to perform facial recognition in Python

Facial recognition is now very effective and has become part of everyday life. Here's how to use Python to perform facial detection and facial recognition.

Images thanks to Chuttersnap, Christian Buehener, Bruce Mars, Reynardo Etenia Wongso, and Christian Campbell via Unsplash.

Facial recognition algorithms have made giant steps in the past decade and have become commonplace in everything from social networks and mobile phone camera software, to surveillance systems. They make it possible to not just detect where faces exist within images or video footage, but also, when trained, who the faces belong to.

Dlib is one of the leading software libraries for facial recognition. Written in C++, this open source computer vision library includes a pre-trained ResNet model which scores 99.38% accuracy on the Labeled Faces in the Wild (LFW) face recognition benchmark, making it pretty much state-of-the-art. Thanks to developer Adam Geitgey, the Dlib model is also available for use within Python via the excellent Face Recognition package.

Face Recognition handles not just face detection and facial recognition, but also facial feature detection, and it can be used on both images and video.

Install the required software

For this project you’ll need to install the face_recognition and Pillow packages from the Python Package Index, PyPI. To get the installation of face_recognition to work, you’ll first need to ensure that the CMake build tool is installed on your Linux machine, since the underlying Dlib library is compiled from source during installation.

You can do this in Ubuntu by entering sudo apt install cmake -y. Once that’s installed, you can install face_recognition and Pillow using pip, which will set everything up and ensure all of the dependencies are present on your system.

sudo apt install cmake -y
pip3 install face_recognition
pip3 install Pillow
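
If you want to check the installation worked, a quick import is enough (the version number shown is just an example and will vary):

import face_recognition
print(face_recognition.__version__)  # e.g. 1.3.0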

Load the libraries

The face_recognition package can not only detect faces within images, it can also be used to extract them, draw boxes around them, and recognise known faces once it’s been shown labelled examples. For our first test, we’ll import the Image class from PIL and then import the face_recognition Python package.

from PIL import Image
import face_recognition

Load an image

Next, find an image containing one or more faces. You can use the Pillow Image.open() function to load this and then display the output by passing the returned variable to the display() function (available in Jupyter and IPython; in a plain Python script you can use the image’s show() method instead). Here’s our test image.

woman = Image.open("woman.jpg")
display(woman)


Next, we’ll use the face_recognition package’s load_image_file() function to load an image containing a face and get the model to identify its location within the image.

image_array = face_recognition.load_image_file("woman.jpg")

Since such models require numerical data, the load_image_file() function converts the image to a NumPy array containing the RGB values of every pixel in the image. If you print the image_array variable you can see the numeric representation of the image.

image_array
array([[[187, 219, 242],
        [185, 220, 242],
        [185, 220, 242],
        ...,
        [177, 215, 236],
        [177, 215, 236],
        [177, 215, 236]],

       [[187, 219, 242],
        [185, 220, 242],
        [185, 220, 242],
        ...,
        [178, 216, 237],
        [178, 216, 237],
        [178, 216, 237]],

       [[187, 219, 242],
        [185, 220, 242],
        [185, 220, 242],
        ...,
        [178, 216, 237],
        [178, 216, 237],
        [178, 216, 237]],

       ...,

       [[102, 121, 153],
        [102, 121, 153],
        [100, 121, 152],
        ...,
        [ 89, 109, 142],
        [ 89, 109, 142],
        [ 89, 109, 142]],

       [[102, 121, 153],
        [101, 120, 152],
        [100, 121, 152],
        ...,
        [ 88, 108, 141],
        [ 88, 108, 141],
        [ 88, 108, 141]],

       [[101, 120, 152],
        [101, 120, 152],
        [100, 121, 152],
        ...,
        [ 88, 108, 141],
        [ 88, 108, 141],
        [ 88, 108, 141]]], dtype=uint8)
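
If you want to confirm the array’s dimensions, its shape and dtype attributes report the height, width, number of colour channels, and data type (the exact values depend on your image):

print(image_array.shape)  # e.g. (1000, 1500, 3) - height, width, RGB channels
print(image_array.dtype)  # uint8 - 8-bit colour values from 0 to 255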

Identify the face locations using face detection

The face_recognition package includes a number of different models that can be used for face detection and facial recognition. The face_recognition.face_locations() method takes the NumPy array of the image from load_image_file() and runs it through a model which uses the Histogram of Oriented Gradients (HOG) approach. This returns a list containing the coordinates of the faces it detects within the image. The numbers map to the top, right, bottom, and left edges of each face’s bounding box.

face_locations = face_recognition.face_locations(image_array)
face_locations
[(277, 847, 598, 526)]
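
To make the coordinate mapping concrete, you can unpack the first tuple and compute the size of the detected face (a quick illustrative check using the output above):

top, right, bottom, left = face_locations[0]
print(f"Top: {top}, right: {right}, bottom: {bottom}, left: {left}")
print(f"The face is {right - left} pixels wide and {bottom - top} pixels tall")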

Extract the faces from the image

Now that we have the coordinates of each face the model detected, we can use Pillow to crop the faces out of the main image. First we unpack the top, right, bottom, and left coordinates from each face_location tuple, then we slice the NumPy array using those values, pass the cropped array to Image.fromarray(), and use the display() function to show each face found.

for face_location in face_locations:
    top, right, bottom, left = face_location

    coordinates = image_array[top:bottom, left:right]
    face = Image.fromarray(coordinates)
    display(face)
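
If you want to keep the cropped faces rather than just display them, Pillow’s save() method writes each one to disk (the filename pattern here is just illustrative):

for i, face_location in enumerate(face_locations):
    top, right, bottom, left = face_location
    face = Image.fromarray(image_array[top:bottom, left:right])
    face.save(f"face-{i}.jpg")  # face-0.jpg, face-1.jpg, and so on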


Draw boxes around faces

Rather than cropping the faces out of the image, it might be preferable to draw a box on the image showing any faces that the model has detected. Let’s load up a new image and see if the model can find all of the faces shown.

women = Image.open("three-women.jpg")
display(women)


To detect the faces we’ll need to load up the ImageDraw package from Pillow, then use load_image_file() to load a new image containing three women and extract the coordinates of the faces using face_locations(), just as we did in the previous step. If you print the face_locations variable, you’ll see a Python list containing the coordinates of three faces.

import face_recognition
from PIL import Image, ImageDraw

image_array = face_recognition.load_image_file("three-women.jpg")
face_locations = face_recognition.face_locations(image_array)

face_locations
[(340, 755, 469, 626), (343, 521, 450, 414), (343, 892, 450, 784)]

As we want to draw on top of the image, we need to change it from its current NumPy array format back into an image, so we pass the image_array into the Image.fromarray() function, then use ImageDraw.Draw() to create a drawing context for the resulting image object.

image = Image.fromarray(image_array)
draw = ImageDraw.Draw(image)

Finally, we can pass the coordinates to draw.rectangle() along with an outline colour (the RGB value (50, 181, 201) is a teal blue) and a width of 5 pixels, then display the image with the faces appearing inside the boxes.

for face_location in face_locations:
    top, right, bottom, left = face_location
    draw.rectangle(
        ((left, top), (right, bottom)), 
        outline=(50, 181, 201),
        width=5,
    )

display(image)


Identifying specific people with facial recognition

Next, we’ll use face_recognition not just to identify the positions of faces but also the people shown in the image. For this, we need to give the model the encodings of specific, labelled faces. Once we’ve done this, we can draw boxes around the faces and add a label to any face the model recognises. To do this, you’ll need to find some images that contain the people you want the model to recognise. You can take a look at the training images below.

from PIL import Image

face1 = Image.open("woman-1.jpg")
face1.thumbnail((500,500))
face1


face2 = Image.open("woman-2.jpg")
face2.thumbnail((500,500))
face2


face3 = Image.open("woman-3.jpg")
face3.thumbnail((500,500))
face3


Use load_image_file() to load each image, then pass the resulting NumPy array to the face_recognition.face_encodings() function and extract the first element, at index [0]. Create a Python list in which to store all of the known face encodings, then create another list in which to store the names of the people shown.

import numpy as np
import face_recognition
from PIL import Image, ImageDraw, ImageFont

face1 = face_recognition.load_image_file("woman-1.jpg")
face1_encoding = face_recognition.face_encodings(face1)[0]

face2 = face_recognition.load_image_file("woman-2.jpg")
face2_encoding = face_recognition.face_encodings(face2)[0]

face3 = face_recognition.load_image_file("woman-3.jpg")
face3_encoding = face_recognition.face_encodings(face3)[0]

# Store the known encodings separately so they aren't overwritten later
known_face_encodings = [face1_encoding, face2_encoding, face3_encoding]
face_names = ['Rebecca', 'Natasha', 'Sarah']
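
One thing to watch: face_encodings() returns an empty list when it can’t find a face, so indexing [0] on a poor quality training image will raise an IndexError. A small guard (just a sketch) avoids that:

# Sketch: skip a training image if no face was detected in it
encodings = face_recognition.face_encodings(face1)
if encodings:
    face1_encoding = encodings[0]
else:
    print("No face found in woman-1.jpg")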

Now that you’ve told the model the names of the people and provided the encodings to help recognise their faces, you can load your image containing the people. Just as before, we’ll use face_locations() to identify the positions of the faces, then we’ll pass the face_locations list of coordinates into face_encodings() along with the NumPy array of the picture containing the three women. Then we’ll use fromarray() to turn the NumPy array back into an image and create a drawing context with Pillow’s ImageDraw.Draw() function.

image_array = face_recognition.load_image_file("three-women-cropped.jpg")
face_locations = face_recognition.face_locations(image_array)
face_encodings = face_recognition.face_encodings(image_array, face_locations)

image = Image.fromarray(image_array)
draw = ImageDraw.Draw(image)

We can now loop through the lists of face locations and face encodings and use the compare_faces() function to see if we get any matches back using matches = face_recognition.compare_faces(known_face_encodings, face_encoding). compare_faces() compares the encodings of the known faces to the encoding you’re checking and returns a Python list of True or False values, one per known face, for each face found in the image. As you can see below, it gets a match on the first woman, then the second, then the third. Obviously, if you have provided a massive number of known faces, this might take some time, but it’s very quick on small datasets.

for (top, right, bottom, left), face_encoding in zip(face_locations, face_encodings):

    matches = face_recognition.compare_faces(known_face_encodings, face_encoding)

    print(matches)
[True, False, False]
[False, True, False]
[False, False, True]
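
compare_faces() also accepts an optional tolerance argument, which defaults to 0.6; lowering it makes matching stricter, at the cost of more false negatives:

# Stricter matching: only accept known faces within a distance of 0.5
matches = face_recognition.compare_faces(known_face_encodings, face_encoding, tolerance=0.5)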

Next we can use the face_distance() function to calculate the Euclidean distance between each known face encoding and the one found in the new image. Running face_distances = face_recognition.face_distance(known_face_encodings, face_encoding) returns a set of Euclidean distances which provide a mathematical measure of how similar the faces are (the smaller the distance, the more similar the faces). We can then use NumPy’s argmin() function to get the index of the smallest distance, i.e. the closest matching known face, then loop through the matches, name the best match, and draw a labelled box. Here’s the block of code in full.

Note that I’ve also used the ImageFont class from Pillow to add a clearer label. You don’t need to do this, but Pillow defaults to a bitmap font which doesn’t scale, and a TrueType font gives better clarity. You may need to adjust the font path to point to a font that exists on your system.

import numpy as np
import face_recognition
from PIL import Image, ImageDraw, ImageFont

face1 = face_recognition.load_image_file("woman-1.jpg")
face1_encoding = face_recognition.face_encodings(face1)[0]

face2 = face_recognition.load_image_file("woman-2.jpg")
face2_encoding = face_recognition.face_encodings(face2)[0]

face3 = face_recognition.load_image_file("woman-3.jpg")
face3_encoding = face_recognition.face_encodings(face3)[0]

# Store the known encodings separately so they aren't overwritten below
known_face_encodings = [face1_encoding, face2_encoding, face3_encoding]
face_names = ['Rebecca', 'Natasha', 'Sarah']

image_array = face_recognition.load_image_file("three-women-cropped.jpg")
face_locations = face_recognition.face_locations(image_array)
face_encodings = face_recognition.face_encodings(image_array, face_locations)

image = Image.fromarray(image_array)
draw = ImageDraw.Draw(image)

for (top, right, bottom, left), face_encoding in zip(face_locations, face_encodings):

    matches = face_recognition.compare_faces(known_face_encodings, face_encoding)
    name = "Unknown person"

    face_distances = face_recognition.face_distance(known_face_encodings, face_encoding)
    best_match_index = np.argmin(face_distances)

    if matches[best_match_index]:
        name = face_names[best_match_index]

    # Draw a box around the face
    draw.rectangle(
        ((left, top), (right, bottom)), 
        outline=(50, 181, 201),
        width=5,
    )

    # Add a label to the box
    # Load the font first so the label background is measured at the size it's drawn
    font = ImageFont.truetype('/usr/share/fonts/truetype/Lato-Semibold.ttf', 30)
    bbox = draw.textbbox((left, bottom), name, font=font)
    text_height = bbox[3] - bbox[1]

    draw.rectangle(
        ((left, bottom - text_height - 30), 
        (right, bottom)), 
        fill=(50, 181, 201), 
        outline=(50, 181, 201)
    )

    draw.text(
        (left + 6, bottom - text_height - 25), 
        name, 
        font=font,
        fill=(255, 255, 255, 255)
    )


display(image)


Facial feature recognition

The other neat thing you can do with Dlib and Face Recognition is identify specific facial features, such as the exact positions of the eyes, chin, mouth, nose, eyebrows, and lips. To identify facial features, all you need to do is load the image into a NumPy array again using load_image_file() and then pass the array to face_landmarks(). This will return a Python list containing a dictionary of facial features and their coordinates for each face found.

from PIL import Image, ImageDraw
import face_recognition

image_array = face_recognition.load_image_file("man-in-shirt.jpg")
face_landmarks_list = face_recognition.face_landmarks(image_array)
for face_landmarks in face_landmarks_list:
    for facial_feature in face_landmarks.keys():
        print("{}: {}".format(facial_feature, face_landmarks[facial_feature]))
chin: [(782, 311), (778, 343), (780, 377), (787, 409), (802, 440), (823, 464), (848, 486), (876, 502), (907, 506), (936, 503), (962, 488), (985, 467), (1004, 443), (1017, 413), (1025, 382), (1028, 350), (1027, 320)]
left_eyebrow: [(810, 273), (823, 254), (846, 248), (870, 252), (892, 263)]
right_eyebrow: [(937, 265), (959, 256), (982, 254), (1004, 264), (1015, 284)]
nose_bridge: [(913, 292), (912, 314), (911, 334), (911, 357)]
nose_tip: [(882, 371), (895, 375), (909, 379), (924, 375), (937, 372)]
left_eye: [(833, 298), (846, 292), (862, 293), (879, 302), (862, 304), (846, 303)]
right_eye: [(943, 304), (961, 295), (977, 296), (989, 303), (977, 307), (961, 306)]
top_lip: [(855, 409), (878, 404), (897, 402), (910, 405), (925, 403), (943, 406), (964, 412), (955, 413), (925, 413), (910, 414), (896, 412), (864, 411)]
bottom_lip: [(964, 412), (944, 425), (925, 430), (910, 431), (895, 428), (878, 423), (855, 409), (864, 411), (896, 411), (910, 413), (925, 413), (955, 413)]
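
Since each feature is just a list of (x, y) tuples, you can, for example, take the bounding box of a feature’s points and crop it out with Pillow (a rough sketch; the padding value is arbitrary):

# Sketch: crop the area around the left eye using its landmark points
points = face_landmarks_list[0]['left_eye']
xs = [x for x, y in points]
ys = [y for x, y in points]
pad = 20  # arbitrary padding in pixels
eye = Image.fromarray(image_array[min(ys) - pad:max(ys) + pad, min(xs) - pad:max(xs) + pad])
display(eye)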

As we saw earlier, having access to the coordinates means you can use Pillow to crop certain features out of the image or draw on the face in specific locations. If you loop over the face_landmarks_list again and extract the keys() from each landmark dictionary, you can use Pillow’s line() function to draw each feature in.

image = Image.fromarray(image_array)
draw = ImageDraw.Draw(image)

for face_landmarks in face_landmarks_list:
    for facial_feature in face_landmarks.keys():
        draw.line(face_landmarks[facial_feature], width=5, fill=(50, 181, 201))

display(image)


Adding facial features

You can probably see the power of this now. Knowing where facial features are located also means you can superimpose items onto them. For example, maybe you’re building an application for an optician’s website and you want to show customers what glasses will look like on their face. By identifying the positions of the eyes and the level of skew in the image, you can position the glasses perfectly to give them a preview. Or maybe you just want to put a dog nose and ears on someone, as Snapchat’s filters do. Instead, let’s draw some funny eyebrows on a baby.

from PIL import Image, ImageDraw
import face_recognition

image_array = face_recognition.load_image_file("baby.jpg")
face_landmarks_list = face_recognition.face_landmarks(image_array)

image = Image.fromarray(image_array)
draw = ImageDraw.Draw(image)

colour = (92, 64, 51)

# Draw each face's eyebrows once; no need to loop over every feature key
for face_landmarks in face_landmarks_list:
    draw.polygon(face_landmarks['left_eyebrow'], fill=colour)
    draw.polygon(face_landmarks['right_eyebrow'], fill=colour)
    draw.line(face_landmarks['left_eyebrow'], fill=colour, width=15)
    draw.line(face_landmarks['right_eyebrow'], fill=colour, width=15)

display(image)


Improving the performance and accuracy of facial recognition

The standard HOG-based model is pretty good, but it doesn’t always detect faces, especially if they’re partly concealed, blurred, or at a jaunty angle. If you check out the excellent documentation, the Module contents section lists some useful arguments you can pass to the functions we’ve used above to help improve performance.
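
For example, face_locations() takes a number_of_times_to_upsample argument and face_encodings() takes num_jitters; increasing either can help with smaller or trickier faces, at the cost of speed:

# Upsample the image twice to help detect smaller faces (slower)
face_locations = face_recognition.face_locations(image_array, number_of_times_to_upsample=2)

# Re-sample each face 10 times when encoding, for higher accuracy (roughly 10x slower)
face_encodings = face_recognition.face_encodings(image_array, num_jitters=10)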

In addition, the face_recognition package does include a more accurate Convolutional Neural Network (CNN) based detector, which you can select via the model argument of face_locations(). To run the CNN model at a sensible speed, you’ll really need a CUDA-enabled GPU with the NVIDIA cuDNN libraries correctly configured, and CUDA support enabled when Dlib is compiled with CMake. The HOG model is quickest on the CPU, but not as accurate as the CNN.

Enabling the CNN model is as easy as passing in an additional argument to face_locations(). For example, face_locations = face_recognition.face_locations(image_array, model="cnn").
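
If you’re processing lots of images, or video frames, on a GPU, the package also provides a batch_face_locations() function which runs the CNN detector over batches of frames for better throughput. A minimal sketch (the repeated frame list is purely illustrative):

# Illustrative: a list of same-sized frames, such as frames from a video
frames = [face_recognition.load_image_file("three-women.jpg")] * 4

# Requires Dlib compiled with CUDA support; uses the CNN model internally
batched = face_recognition.batch_face_locations(frames, number_of_times_to_upsample=0, batch_size=4)

for frame_locations in batched:
    print(frame_locations)  # one list of (top, right, bottom, left) tuples per frame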

Matt Clarke, Tuesday, March 02, 2021

Matt Clarke is an Ecommerce and Marketing Director who uses data science to help in his work. Matt has a Master's degree in Internet Retailing (plus two other Master's degrees in different fields) and specialises in the technical side of ecommerce and marketing.