Search

Search the Practical Data Science blog for hundreds of free tutorials on data science, machine learning, and data engineering.

Results

How to create a Google rank checker tool using Python

Learn how to create a Google rank checker tool using the Python EcommerceTools package so you can...

How to use the Feefo API for ecommerce competitor analysis

Learn how to use the Feefo API for ecommerce competitor analysis and understand what products com...

How to compare time periods using the Google Search Console API

Learn to use EcommerceTools to query the Google Search Console API with Python and compare two ti...

How to create a Google Service Account client secrets JSON key

Learn how to create a client secrets JSON key file and set up a Google Service Account so you can...

A quick guide to the RFM model for data scientists

The RFM model measures Recency, Frequency, and Monetary value and is used to predict future custo...

How to run time-based SEO tests using Python

Learn how to run SEO tests in Python using EcommerceTools to fetch your Google Search Console dat...

How to create content recommendations using TF IDF

Learn how to use the Term-Frequency Inverse Document Frequency (TF IDF) and cosine similarity to ...

How to create a contractual churn model in scikit-learn

Whether you sell magazine subscriptions, mobile phone contracts, broadband, or car insurance, con...

How to avoid model overfitting with early stopping rounds

Overfitting reduces model performance. Here's how you can avoid it using the XGBoost early stoppi...

A quick guide to customer segmentation for B2B e-commerce

The approaches used for B2B customer segmentation are slightly different from those used in B2C e...

How to detect Google Search Console anomalies

Learn how to use Python to export data from the Google Search Console API and constr...

How to classify customer support tickets using Naive Bayes

Improve the efficiency of your customer service team by creating a Naive Bayes model to classify ...

How to use pipelines in your machine learning models

Using pipelines keeps machine learning code cleaner, easier to maintain, easier to move to produc...

How to infer the effects of marketing using the Causal Impact model

The Causal Impact model lets you examine ecommerce and marketing time series data to understand w...

How to identify SEO keyword opportunities with Python

Learn how to use Python to scrape and parse an XML sitemap, crawl and scrape a site, connect the ...

How to add days and subtract days from dates in Pandas

Learn how to add days and subtract days from dates in Pandas using the Python timedelta function ...

How to analyse Google Analytics demographics and interests with GAPandas

Google Analytics demographics and interests data are a useful way to quickly understand the custo...

How to identify striking distance keywords with Python

Learn how to find striking distance keywords in Google Search Console API data with Python and im...

A quick guide to lead scoring for B2B e-commerce sites

Lead scoring is a commonly used CRM technique in most B2B e-commerce sites. Here's how the variou...

How to trigger marketing automations using the Mailchimp API in Python

By adding tags to or removing tags from subscribers using the Mailchimp marketing API you can create pow...

How to create monthly Google Search Console API reports with EcommerceTools

Learn how to use EcommerceTools to create monthly Google Search Console API reports that let you ...

How to engineer new features using Decision Tree models

Learn how to use Decision Trees to engineer or derive new features from your existing data and im...

How to use the Mailchimp Marketing Python API with Pandas

Learn how to use the Mailchimp API in Python by creating email marketing reports using Pandas and...

How to use the eBay Finding API with Python

The eBay SDK allows developers to search and retrieve eBay listings using a Python API. Here's ho...

How to export Zendesk tickets into Pandas using Zenpy

Zenpy is an unofficial Zendesk API for Python that allows you to export and update tickets. Here'...

How to query the Google Search Console API with EcommerceTools

EcommerceTools makes it quick and easy to query the Google Search Console API and display the dat...

How to read Google Sheets data in Pandas with GSpread

The GSpread package makes it quick and easy to read Google Sheets spreadsheets from Google Drive ...

How to calculate the Lin Rodnitzky Ratio using GAPandas

The Lin Rodnitzky Ratio is designed to assess paid search account management quality. Here's how ...

How to analyse product replenishment

By identifying products that are regularly replenished and contacting customers when they are run...

Data science courses for budding data scientists and data engineers

If you are considering learning data science there are now hundreds of online data science course...

A quick guide to customer segmentation for data scientists

Customer segmentation is the process of grouping customers based on shared characteristics in ord...

How to create an ecommerce purchase intention model in Python

Ecommerce purchase intention models predict the probability of each customer making a purchase, s...

How to create a classification model using XGBoost in Python

Learn how to create a classification model using XGBoost and scikit-learn in Python by classifyin...

How to predict employee churn using CatBoost

Use CatBoost to create an employee churn model that will predict which of your staff is going to ...

How to read an RSS feed in Python

Learn how to create a Python RSS reader using Requests-HTML to read an RSS feed, parse the fe...

How to auto-generate meta descriptions with EcommerceTools

Learn how to use EcommerceTools to create automated meta descriptions via deep learning and the B...

19 Python SEO projects that will improve your site

Using Python for SEO is really catching on. There are loads of ways you can use Python SEO projec...

How to identify internal and external links using Python

Learn how to identify internal and external links through web scraping in Python and help identif...

How to create a basic Marketing Mix Model in scikit-learn

Marketing Mix Models (MMMs) let you test what marketing results you'll get from changing the amou...

How to scrape Google results in three lines of Python code

Here's a quick and easy way to scrape Google search engine results into a Pandas dataframe in jus...

How to make time series forecasts with Neural Prophet

The Neural Prophet time series forecasting model was developed by Facebook and is a powerful tool...

How to create a simple product recommender system in Pandas

Learn how to create a product recommender or product recommendation system in Python using Pandas...

15 ways you can use data science to boost ecommerce performance

There are dozens of use cases for ecommerce data science, covering everything from segmentation t...

How to create PDF reports in Python using Pandas and Gilfoyle

To save time, I created a Python package for generating PDF reports and presentations. Here's how...

How to create monthly Google Analytics reports in Pandas

Here's how you can use GAPandas to create monthly analytics reports on marketing and ecommerce da...

How to segment your customers using EcommerceTools

EcommerceTools makes it quick and easy to segment your customers using a range of powerful techni...

How to use EcommerceTools for technical SEO

The EcommerceTools package lets you check SERPs, examine robots.txt files, analyse Core Web Vital...

How to use the Isolation Forest model for outlier detection

Learn how to use the Isolation Forest or iForest algorithm in sklearn for automated outlier detec...

How to use k means clustering for customer segmentation

K means clustering is one of the most widely used machine learning algorithms and is well-suited to custo...

How to segment customers using RFM and ABC

Creating a value-based segmentation using RFM and ABC is a great way to tell the good customers f...

How to perform a customer cohort analysis in Pandas

Customer cohort analysis examines differences between customers over time and is a powerful tool ...

How to machine translate product descriptions

Machine translation systems, such as Google Translate, make it quick and easy to bulk translate p...

How to identify the causes of customer churn

Discover how to identify the causes of customer churn using Cox's Proportional Hazards model, so ...

How to identify near duplicate content using LMS

Learn how to detect near duplicate content using the Longest Matching Subsequence (LMS) technique...

How to create ecommerce anomaly detection models

Learn how to use the Anomaly Detection Toolkit (ADTK) to identify anomalies in ecommerce data ext...

How to create a product and price metadata scraper

Learn how to keep tabs on your competitors' pricing by building an ecommerce price scraper that u...

How to create a non-contractual churn model for ecommerce

Learn how to create a non-contractual churn model to let you predict churn and identify which cus...

How to classify customer service emails with Bart MNLI

Make your ecommerce customer service team more efficient by classifying their support emails auto...

How to calculate CLV using BG/NBD and Gamma-Gamma

Calculating Customer Lifetime Value is hard to do right. Learn how to calculate CLV using the BG/...

How to auto-generate product summaries using deep learning

Learn how to use Transformer models to automatically generate summaries from ecommerce product de...

How to assign RFM scores with quantile-based discretization

Use quantile-based discretization and K-means clustering to assign RFM scores to your customer...

How to assess product copy using EQA models

Learn how to use Extractive Question Answering or EQA models to assess the quality of your ecomme...

How to analyse product consumption and repurchase rates

Learn how to shape your product, content, and pricing strategy by analysing product consumption a...

How to use Spintax to create content and ad copy in Python

Although Spintax was mainly used for the production of low quality articles, and emails from Nige...

How to use bagging, boosting, and stacking in ensembles

Learn how to use ensemble models that utilise bagging, boosting, and stacking to generate better ...

How to scrape schema.org metadata using Python

Learn to scrape more efficiently by extracting Schema.org metadata in JSON-LD, Microdata, and Ope...

How to scrape People Also Ask data using Python

Google’s People Also Ask or PAA boxes are increasingly common for popular search terms and are wo...

How to scrape Google search results using Python

Learn to scrape Google search results using Python and save loads of time and collect data that a...

How to perform time series decomposition

Time series decomposition lets you separate the trend and seasonality in your data so you can see...

How to join Google Analytics and Google Search Console data

Learn how to use Python to connect your Google Search Console API data to your Google Analytics R...

How to identify SEO keywords using Google Autocomplete

Learn how to use Python to identify the most popular SEO keywords linked to your search term by s...

How to find spelling and grammar issues on product pages

Spelling and grammar issues on product detail pages can make your site look unprofessional. Here’...

How to engineer customer purchase latency features

Learn how to engineer customer purchase latency features based on the time between each customer'...

How to create targeted B2B company sector datasets

Learn how to create targeted B2B company datasets for free using Python, Pandas, and Companies Ho...

How to create a UK data science jobs dataset

Want to analyse the data science and data engineering job market? Here's a quick guide to buildin...

How to create a product matching model using XGBoost

Product matching algorithms find identical products on ecommerce sites so users can compare produ...

How to create a Naive Bayes product classification model

Learn how to use NLP techniques to create a Multinomial Naive Bayes sklearn product classificatio...

How to create a dataset containing all UK companies

B2B ecommerce retailers spend large amounts on acquiring the addresses of potential customers to ...

How to count indexed pages using Python

Learn how to use Python to count the number of indexed pages a website has to help you monitor it...

How to calculate safety stock and reorder point

The safety stock calculation and reorder point calculation can greatly reduce the likelihood of c...

How to calculate operations management metrics in Python

Understand the most important metrics for operations managers and learn how to calculate them in ...

How to calculate marketing metrics in Python

Learn how to calculate marketing metrics such as CPM, CPC, conversion rate, ROMI, ROI, ROAS, CPO,...

How to calculate customer experience metrics in Python

Customer experience metrics and customer satisfaction metrics drive customer retention, so it's v...

How to calculate category management metrics in Python

Category management metrics can let you understand product sales and be more strategic in your pr...

How to access the Google Knowledge Graph Search API

The Google Knowledge Graph powers the Knowledge Panels and infobox elements of Google’s search re...

A quick guide to catalogue marketing data science

Catalogues may be living on borrowed time, but catalogue marketing data science techniques have b...

How to use knee point detection in k means clustering

Use the Kneedle algorithm to detect the knee or elbow point when k means clustering so you define...

How to use Extruct to identify Schema.org metadata usage

Extruct allows you to reveal a site's Schema.org metadata implementation, so you can build a more...

How to unzip files with Python

If you're downloading large zipped datasets via automated Python scripts, you may need to unzip o...

How to unserialize serialized PHP arrays using Python

PHP serialized arrays and objects are common in ecommerce database schemas. This is how you unser...

How to send data to Google Analytics in Python with PyGAMP

PyGAMP allows you to insert data into Google Analytics using the Measurement Protocol API in Pyth...

How to scrape Open Graph protocol data using Python

Learn how to use web scraping technologies, including urllib and Beautiful Soup, to scrape a webs...

How to scrape and parse a robots.txt file using Python

The robots.txt file includes potentially useful information for crawlers and spiders, and is easy t...

How to scrape a site's page titles and meta descriptions

Learn how to apply web scraping tools to scrape a site's content and parse the page titles and me...

How to scan a site for 404 errors and 301 redirect chains

404 errors and 301 redirect chains can be damaging to the performance of a website and impact the...

How to resize and compress images with TinyPNG

Learn how to use the TinyPNG API in Python to bulk resize and compress images to improve site per...

How to preprocess text for NLP in four easy steps

Learn how to apply tokenization, stopword removal, Porter stemming, and re-joining to preprocess ...

How to parse XML sitemaps using Python

XML sitemaps are a great way to gain insight on your competitors’ websites and identify pages to ...

How to parse URL structures using Python

When analysing web data, it’s common to need to parse URLs and extract the domain, directories, q...

How to identify keyword cannibalisation using Python

Learn how to use Python to identify keyword cannibalisation which occurs when multiple pages comp...

How to download files with Python

The Python urllib package allows you to download files from remote servers to use in your project...

How to detect sarcasm using machine learning

Can you tell when someone is taking the piss, when they haven't used a winking smiley? In this pr...

How to detect fake news with machine learning

Learn the Natural Language Processing techniques you need to use to identify fake news from real ...

How to calculate Economic Order Quantity in Python

Learn how to calculate the Economic Order Quantity or EOQ for a product to minimise holding costs...

How to build a web scraper using Requests-HTML

Requests-HTML wraps up the best bits from Requests and Beautiful Soup packages to create a web sc...

How to audit a site's Core Web Vitals using Python

Core Web Vitals are performance metrics that measure the quality of the user experience and are n...

How to analyse Pandas dataframes using SQL with PandaSQL

Learn how to use PandaSQL and query the data in your Pandas dataframes using SQL queries instead ...

How to analyse non-ranking pages and search index bloat

Learn how you can use Python to identify how many non-ranking pages your site has and check wheth...

How to access the Google Search Console API using Python

By accessing Google Search Console API data using Python you'll have access to whatever data you ...

A quick guide to search intent classification for SEO

Search intent classification aims to categorise search queries by user intent. But how do you do ...

How to use Screaming Frog from the command line

The Screaming Frog SEO Spider is widely used in digital marketing and ecommerce and has a powerfu...

How to send a Slack message in Python using webhooks

It’s really easy to send Slack messages using Python. In this project, we’ll create a really basi...

How to geocode and map addresses using GeoPy

Learn how to use GeoPy, Nominatim, and Folium to geocode and plot Pizza Express branches in the v...

How to create paid search keywords using Pandas

Pandas is a powerful tool for marketers, especially those involved in paid search advertising. He...

How to create a Python web scraper using Beautiful Soup

Beautiful Soup is one of the most powerful libraries for performing web scraping in Python. Here'...

The difference between data scientists and data engineers

Despite the growing demand, many people still don’t understand the difference between a data scie...

How to write better code using DRY and Do One Thing

Learn how to use the Don’t Repeat Yourself and Do One Thing techniques to help you create Python ...

How to visualise data with quirky hand-drawn plots

Want to dumb-down your plots and charts for your target audience? CuteCharts allows you to create...

How to visualise conversion funnels with Plotly

Funnels are one of the most useful and intuitive data visualisations used in ecommerce and market...

How to use style guidelines to improve your Python code

Learn how and why following Python style guidelines can make your code easier to understand, revi...

How to use SQLite in Python

The SQLite relational database management system is fast, lightweight, and easy to use. Here's ho...

How to use operators in Python

Python operators are one of the most important components of the language to grasp. Here’s a basi...

How to use lists in Python

Lists are one of the most widely used data storage objects or data types within Python and are us...

How to use Git for your data science projects

Learn how to use Git for your data science projects so you can keep your code backed-up and share...

How to use docstrings to improve your Python code

Using docstrings in Python makes it easier to see what functions do, what arguments they accept, ...

How to use the Pandas value_counts() function

The Pandas value_counts() function is great for calculating the number of occurrences of a value ...

How to query MySQL and other databases using Pandas

Querying MySQL and other databases using Pandas in Jupyter notebooks will change the way you work...

How to open, read, and write to files in Python

Python makes it very straightforward to open, read, and write data to files. Here's a quick guide...

The four Python data science libraries you need to learn

If you're learning data science, there are four Python data science libraries you absolutely need...

How to visualise text data using word clouds in Python

Word clouds, tag clouds, or wordles are an intuitive way to present text data to non-technical pe...

How to visualise statistical distributions with Seaborn

Understanding the statistical distribution of data is a crucial step in machine learning. Here’s ...

How to visualise data using Venn diagrams in Matplotlib

The Venn diagram is one of the most intuitive data visualisations for showing the overlap between...

How to visualise data using line charts in Seaborn

Line charts or line plots are among the most commonly used graphs in data science. Here’s how you...

How to visualise data using barplots in Seaborn

Learn how to create barplots or bar charts for comparing and visualising categorical data in Pyth...

How to visualise correlations using Pandas and Seaborn

Machine learning models make predictions from correlations between features and the target, so fi...

How to visualise categorical data in Seaborn

There’s more to visualising categorical data than bar charts. Here’s a selection of the other cha...

How to install the NVIDIA Data Science Stack on Ubuntu 20.04

The NVIDIA Data Science Stack is the quickest way to set up the drivers and packages needed for GP...

How to create desktop data science apps using Nativefier

Nativefier makes it easy to create Ubuntu desktop applications from websites using Electron. Here...

How to create an Ubuntu desktop entry to run Jupyter

Here's how you can create a Gnome desktop entry shortcut launcher icon to start up Docker and ope...

How to create a dataset for product matching models

Datasets for the product matching models required to verify price comparisons are hard to find. H...

How to build a data science workstation

Building your own data science workstation or deep learning workstation isn’t that difficult and ...

How to visualise analytics data using heatmaps in Seaborn

Heatmaps make visualising temporal data much easier. Here’s how you can create custom web analyti...

How to visualise RFM data using treemaps

Learn how to assign simple labels to your RFM data and visualise them using treemaps to help make...

How to visualise data using scatterplots in Seaborn

Scatterplots are a great way to visualise the distribution of data and the relationship between t...

How to visualise data using histograms in Pandas

Pandas histograms are one of the best ways to visualise the statistical distributions of data dur...

How to visualise data using boxplots in Seaborn

The Seaborn boxplot, or box-and-whisker diagram, is a great way to visualise the statistical dist...

How to use SMOTE for imbalanced classification

SMOTE, the Synthetic Minority Oversampling Technique, is one of the best ways to handle imbalance...

How to use Recursive Feature Elimination in your models using RFECV

Matt Clarke explains how you can use Recursive Feature Elimination with Cross Validation or RFECV...

How to use model selection and hyperparameter tuning

Model selection and hyperparameter tuning can greatly improve model performance. Learn how to use...

How to transform categorical variables using encoders

Learn how to use Category Encoders to transform and convert categorical variables to numeric data...

How to select, filter, and subset data in Pandas dataframes

Learn a range of useful techniques to select, filter, and subset data stored in Pandas dataframes...

How to save and load machine learning models using Pickle

Machine learning models can take days to train. Pickle save and Pickle load allow you to save th...

How to resample time series data in Pandas

The Pandas resample function lets you group time series data by day, week, month, or year so it c...

How to reformat dates in Pandas

Learn how to use Python and Pandas to reformat dates and datetimes so you can display them in you...

How to import data into Pandas dataframes

Learn how to import data into Pandas from a wide range of different data sources, from CSV and Ex...

How to group and aggregate transactional data using Pandas

Learn how to group and aggregate transactional data using Pandas to create new datasets allowing ...

How to create ecommerce sales forecasts using Prophet

Creating accurate ecommerce time series forecasts using models such as ARIMA can be tricky. The P...

How to create a response model to improve outbound sales

Learn how to improve outbound sales using a machine learning response model that maximises your s...

How to create a linear regression model using Scikit-Learn

Want to get started with sklearn linear regression? Learn to use Python, Pandas, and scikit-learn...

How to analyse search traffic using the Google Trends API

Google Trends data is now being used in a range of models. Here’s how you can access the data usi...

How to identify visually similar images using hashing

Learn how to use image hashing or image fingerprinting to find visually similar images or duplica...

How to create an ABC XYZ inventory classification model

The ABC XYZ inventory classification model is built on top of ABC inventory analysis and helps yo...

How to use Google Secret Manager to improve data security

Learn how to use Google Secret Manager to create secure environmental variables to hold your sens...

How to speed up the NLP text annotation process

Text annotation techniques like sequence labeling are vital in NLP, but are tedious, time-consum...

How to import data into Google Data Studio using Python

Google Data Studio doesn’t include native support for Python, but you can still import data from ...

How to import data into BigQuery using Pandas and MySQL

Learn how to import data into the Google BigQuery serverless data warehouse platform using Python...

How to create synthetic data sets for machine learning

Learn some simple techniques you can apply using Pandas and Numpy to create dummy, synthetic, or ...

How to create image datasets for machine learning models

Learn how to create image datasets for machine learning image classification models using Python ...

How to create an ABC inventory classification model

Learn how to create an ABC inventory classification model in Python so your procurement manager t...

How to connect to MySQL via an SSH tunnel in Python

MySQL databases are usually configured to only allow secure connections via SSH. Here’s how to cr...

How to calculate relative dates for Google Analytics queries

To automate Google Analytics API reports for Google Data Studio you’ll need to know how to calcul...

How to bin or bucket customer data using Pandas

Data binning or bucketing is a very useful technique for both preprocessing and understanding or ...

How to annotate training data for NLP models using Doccano

Doccano is a text annotation platform for NLP that makes it much quicker and easier to label and ...

Ecommerce and marketing data sets for machine learning

Here’s a selection of some of the most useful datasets I’ve found for building machine learning m...

How to use the BG/NBD model to predict customer purchases

The Beta-Geometric Negative Binomial Distribution or BG/NBD model lets you predict which customer...

How to use NLP to identify what drives customer satisfaction

Learn how to use web scraping and NLP to shape your ecommerce strategy by identifying what influe...

How to create a BI platform using Apache Superset

Learn how to create a powerful and extensible business intelligence (BI) platform for your ecomme...

How to use Apache Druid for real-time analytics data storage

Apache Druid is a real-time high performance analytics data store for big data that makes running...

How to set up a Docker container for your MySQL server

Learn how to create a Docker container for your MySQL or MariaDB database server so you can extra...

How to use Category Encoders to encode categorical variables

Category Encoders make it much easier to encode categorical variables during the machine learning...

How to create ecommerce data pipelines in Apache Airflow

Learn how to create an Apache Airflow data pipeline and see why it is one of the most widely used...

How to create an ecommerce trading calendar using Pandas

Learn how to use Pandas to create a dynamic ecommerce trading calendar of special trading events,...

Dell Precision 7750 mobile data science workstation review

The Dell Precision 7750 mobile workstation is aimed at data scientists who want a laptop for GPU-...

A quick guide to Product Attribute Extraction models

Learn why ecommerce retailers and marketplaces are creating Product Attribute Extraction (PAE) mo...

A quick guide to Next-Product-To-Buy models

Next-Product-To-Buy or NPTB models can predict not only what a customer will buy, but also when t...

A quick guide to machine learning

Machine learning (ML) is a branch of artificial intelligence (AI) and allows models to make predi...

A quick guide to machine learning uplift models

Unlike response or propensity models, uplift models let you identify customers who will only buy ...

A quick guide to Learning to Rank models

Learning to Rank or LTR models improve the performance of on-site search results on ecommerce web...

How to use the Pandas melt function to reshape wide format data

Learn how to use the Pandas melt function to reshape wide format data so you can use it in your m...

How to use the Apriori algorithm for Market Basket Analysis

Learn how to use the mlxtend Apriori algorithm to run a Market Basket Analysis on Google Analytic...

How to use Natural Language Understanding models

Learn how to use Natural Language Understanding models (NLU) via PyTorch and Hugging Face Transfo...

How to use Docker for your data science projects

Learning to use Docker for data science projects will make configuring, deploying, and sharing mo...

How to tune model hyper-parameters with grid search

Every scikit-learn model has hyper-parameters you can tune to obtain improvements. Here’s how to ...

How to test your Keras, CUDA, CuDNN, and TensorFlow install

Setting up TensorFlow, Keras, CUDA, and CuDNN can be a painful experience on Ubuntu 20.04. Here i...

How to scrape JSON-LD competitor reviews using Extruct

Here's how you can use Python, Selenium, and Extruct to create a headless web browser and scrape ...

How to scrape competitor technology data in Python

Learn how to automate the collection of website technology data from your competitors using Built...

How to perform facial recognition in Python

Facial recognition is now very effective and has become part of everyday life. Here's how to use ...

How to separate audio source data using Spleeter

Learn how to use Deezer's TensorFlow powered Spleeter model to separate music into vocals and acc...

How to create a Pandas dataframe

Pandas lets you create dataframes from almost any type of data, including lists, dictionaries, tu...

How to create a collaborative filtering recommender system

Learn how to use item-based and user-based collaborative filtering to create a powerful recommend...

How to build the 'Hotdog, Not Hotdog' image classifier

Learn how to classify images using Keras and TensorFlow by building the 'Hotdog, Not Hotdog' Conv...

How to create a neural network for sentiment analysis

Learn how to use a recurrent neural network and the Long Short-Term Memory model to analyse senti...

How to use GAPandas to view your Google Analytics data

Learn how to use GAPandas to query the Google Analytics API and view, analyse, and visualise your...

How to use your GPU to accelerate XGBoost models

Do your XGBoost machine learning models take an age to run? You could make them several times fas...

How to use scikit-learn datasets in data science projects

To learn data science techniques you’ll need the right kind of datasets. Thankfully, many are eas...

How to use Python regular expressions to extract information

Regular expressions, or regexes, are widely used in data science for matching specific patterns i...

How to use mean encoding in your machine learning models

Learn how to use the mean encoding technique to generate powerful new features from your data to ...

How to interpret the confusion matrix

The confusion matrix can tell you more about your model than the accuracy score. We build a model...

How to impute missing numeric values in your dataset

Cleverly filling in the gaps when numeric data is missing from your dataset can often boost the p...

How to engineer date features using Pandas

In time series datasets dates often hold the key to improving performance, but they need to be tr...

How to create a Python virtual environment for Jupyter

Learn how to create a Python virtual environment for your Jupyter notebook using venv and virtual...