How to use SMOTE for imbalanced classification

Imbalanced classification problems, such as the detection of fraudulent card payments, represent a significant challenge for machine learning models. When the target class, such as fraudulent transactions, makes up such...

How to use Recursive Feature Elimination in your models using RFECV

Something that often confuses non-data scientists is that too many features can be a bad thing for a model. It does sound logical that including more features and data...

How to use model selection and hyperparameter tuning

There are many techniques you can apply to improve the performance of your machine learning models, but two of the most powerful are model selection and hyperparameter tuning. As models...

How to transform categorical variables using encoders

There are loads of different ways to convert categorical variables into numeric features so they can be used within machine learning models. While you can perform this process manually on...

How to select, filter, and subset data in Pandas dataframes

Selecting, filtering and subsetting data is probably the most common task you’ll undertake if you work with data. It allows you to extract subsets of data where row or column...
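
As a rough illustration of the kind of selection and filtering described here, the sketch below uses a small made-up dataframe. The column names and values are assumptions for the example only and do not come from the article.

```python
import pandas as pd

# Hypothetical data, invented purely for illustration.
df = pd.DataFrame(
    {
        "product": ["pen", "lamp", "desk", "mug"],
        "price": [1.50, 24.00, 180.00, 6.00],
        "quantity": [100, 12, 3, 40],
    }
)

# Select a subset of columns by name.
prices = df[["product", "price"]]

# Filter rows with a boolean condition.
expensive = df[df["price"] > 10]

# Combine a row filter and a column selection in one step with .loc.
expensive_names = df.loc[df["price"] > 10, "product"].tolist()
```

Using `.loc` with a boolean mask and a column label keeps the row and column selection in a single, explicit indexing call.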

How to save and load machine learning models using Pickle

Machine learning models often take hours or days to train, especially on large datasets with many features. If your machine shuts down, you’ll lose your model and you’ll need to...
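
A minimal sketch of saving and reloading an object with Pickle follows. The `model` dictionary is only a stand-in for a fitted model, and the file name `model.pkl` is an assumption chosen for the example.

```python
import pickle
import tempfile
from pathlib import Path

# Stand-in for a fitted model (hypothetical): any picklable object
# is saved and loaded the same way.
model = {"name": "example_model", "coefficients": [0.4, -1.2, 3.1]}

# Write to a temporary directory so the example leaves no files behind
# in the working directory.
model_path = Path(tempfile.mkdtemp()) / "model.pkl"

# Save the object to disk in binary mode.
with open(model_path, "wb") as f:
    pickle.dump(model, f)

# Load it back, for example in a later session.
with open(model_path, "rb") as f:
    restored = pickle.load(f)
```

Only unpickle files from sources you trust, since loading a pickle can execute arbitrary code.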

How to resample time series data in Pandas

When working with time series data, such as web analytics data or ecommerce sales, the time series format in your dataset might not be ideal for the analysis you’re performing...
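
To give a flavour of resampling, here is a small sketch that rolls up daily figures into weekly totals. The sales numbers and the series name are invented for illustration.

```python
import pandas as pd

# Hypothetical daily sales for two weeks, starting on a Monday.
index = pd.date_range("2024-01-01", periods=14, freq="D")
daily_sales = pd.Series(range(1, 15), index=index, name="sales")

# Downsample from daily to weekly frequency and sum each week.
# With the "W" rule, each bin is labelled by the Sunday that ends the week.
weekly_sales = daily_sales.resample("W").sum()
```

Swapping `.sum()` for `.mean()` or another aggregation changes how the values in each period are combined.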

How to reformat dates in Pandas

If you regularly work with time series data in Pandas it’s probable that you’ll sometimes need to convert dates or datetimes and extract additional features from them.
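
A short sketch of what this can look like, assuming a hypothetical `order_date` column stored as text:

```python
import pandas as pd

# Hypothetical data: dates stored as strings.
df = pd.DataFrame({"order_date": ["2024-01-05", "2024-02-17"]})

# Parse the strings into proper datetime values.
df["order_date"] = pd.to_datetime(df["order_date"], format="%Y-%m-%d")

# Reformat the dates as day/month/year strings.
df["order_date_uk"] = df["order_date"].dt.strftime("%d/%m/%Y")

# Extract additional features through the .dt accessor.
df["month"] = df["order_date"].dt.month
df["day_name"] = df["order_date"].dt.day_name()
```

Keeping the parsed datetime column alongside any formatted string versions means later date arithmetic does not need to re-parse text.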

How to import data into Pandas dataframes

Pandas allows you to import data from a wide range of data sources directly into a dataframe. These can be static files, such as CSV, TSV, fixed width files, Microsoft...