How to build the 'Hotdog, not Hotdog' image classifier

Convolutional Neural Networks, or CNNs, are among the most widely used AI techniques for detecting complex features in data. They’re particularly good for image recognition, and are used in...

How to create a neural network for sentiment analysis

Sentiment analysis, or opinion mining, is a form of emotion AI that uses natural language processing and computational linguistics to analyse text and infer its sentiment. Sentiment analysis has loads...

How to use GAPandas to view your Google Analytics data

Over the past decade I’ve written more Google Analytics API queries than I can remember. Initially, I favoured PHP for these (and still do for permanent web-based applications utilising GA...

How to use your GPU to accelerate XGBoost models

If you’re not fortunate enough to have a really powerful data science workstation for your work, one of the problems you’ll likely face is that your models can take quite...

How to use scikit-learn datasets in data science projects

The scikit-learn package comes with a range of small built-in toy datasets that are ideal for use in test projects and applications. As they’re part of the scikit-learn package, you...

How to use Python regular expressions to extract information

Regular expressions are used for pattern matching in programming, allowing you to identify or extract very specific pieces of text from a string or document. They’re very powerful and extremely...
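
As a small illustration of the pattern matching described above, here is a minimal sketch using Python's built-in `re` module. The sample text, the email addresses, and the order number are all made up for the example, and the email pattern is deliberately simple rather than a full validator.

```python
import re

# Hypothetical sample text; the addresses and order number are invented.
text = "Contact sales@example.com or support@example.org about order #10482."

# A deliberately simple email pattern; real address validation is more involved.
email_pattern = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
emails = email_pattern.findall(text)

# A capture group extracts only the digits that follow the '#'.
order_match = re.search(r"#(\d+)", text)
order_id = order_match.group(1) if order_match else None

print(emails)    # ['sales@example.com', 'support@example.org']
print(order_id)  # 10482
```

`findall()` returns every non-overlapping match, while `search()` returns only the first match and gives access to its capture groups.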

How to use mean encoding in your machine learning models

When you’re building a machine learning model, the feature engineering step is often the most important. From your initial small batch of features, the clever use of maths and stats...

How to interpret the confusion matrix

As a practical demonstration of how the confusion matrix works, let’s load up the Wisconsin Breast Cancer dataset, create a classification model and examine the confusion matrix to see how...
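
To show what the four cells of a binary confusion matrix contain, here is a rough sketch in plain Python. It does not use the Wisconsin Breast Cancer dataset or a trained model; the `y_true` and `y_pred` lists are hypothetical labels chosen only to make the counting easy to follow.

```python
from collections import Counter

# Hypothetical labels for illustration only (1 = positive, 0 = negative).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

# Count each (actual, predicted) pair.
counts = Counter(zip(y_true, y_pred))
tp = counts[(1, 1)]  # true positives
tn = counts[(0, 0)]  # true negatives
fp = counts[(0, 1)]  # false positives
fn = counts[(1, 0)]  # false negatives

# Rows are actual classes, columns are predicted classes.
matrix = [[tn, fp],
          [fn, tp]]

accuracy = (tp + tn) / len(y_true)
precision = tp / (tp + fp)
recall = tp / (tp + fn)

print(matrix)                       # [[3, 1], [1, 3]]
print(accuracy, precision, recall)  # 0.75 0.75 0.75
```

The row and column order here follows the layout scikit-learn's `confusion_matrix()` uses for sorted labels, so the numbers can be compared directly with its output.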

How to impute missing numeric values in your dataset

As models require numeric data and don’t like NaN, null, or inf values, if you find these within your dataset you’ll need to deal with them before passing the data...
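
One common way to handle such values is mean imputation. The sketch below uses only the standard library and a hypothetical list of values; it illustrates the idea and is not necessarily the approach the full article takes.

```python
import math
from statistics import mean

# Hypothetical column containing NaN, inf and null (None) values.
values = [4.0, float("nan"), 7.0, float("inf"), 1.0, None]

def is_missing(v):
    """Treat None, NaN and +/-inf as missing."""
    return v is None or math.isnan(v) or math.isinf(v)

# Compute the fill value from the observed entries only.
observed = [v for v in values if not is_missing(v)]
fill_value = mean(observed)

# Replace every missing entry with the mean of the observed ones.
imputed = [fill_value if is_missing(v) else v for v in values]

print(imputed)  # [4.0, 4.0, 7.0, 4.0, 1.0, 4.0]
```

The mean is sensitive to outliers, so the median is often a safer fill value for skewed columns; swapping `mean` for `statistics.median` is the only change needed.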