Over the past year or so, the Optuna package has quickly become a favourite among data scientists for hyperparameter tuning on machine learning models, and for good reason. It’s lightweight, easy to use, very efficient at finding good hyperparameters, and it’s much faster than exhaustive tools like GridSearchCV.
Unlike GridSearchCV, Optuna doesn’t require you to specify a grid of hyperparameter values to search over. Instead, you specify a range for each hyperparameter, and Optuna samples values within those ranges, using the results of earlier trials to decide what to try next. This makes it much more efficient than GridSearchCV, which can take a very long time to run if you have a large number of hyperparameters to tune.
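To make that concrete before we get to XGBoost, here’s the kind of toy study Optuna’s own documentation uses (you’ll need Optuna installed, which we’ll cover next). The function being minimised and the variable name x are purely illustrative.

import optuna

def toy_objective(trial):
    # Optuna samples x from a continuous range on each trial,
    # guided by previous results, rather than stepping through a fixed grid
    x = trial.suggest_float('x', -10.0, 10.0)
    return (x - 2) ** 2

study = optuna.create_study(direction='minimize')
study.optimize(toy_objective, n_trials=20)
print(study.best_params)  # should land reasonably close to {'x': 2.0}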
In this article, we’ll look at how to use Optuna for XGBoost hyperparameter tuning by tuning the parameters of an XGBClassifier model.
To get started, open a Jupyter notebook and use Pip to install the Optuna and XGBoost packages, if you don’t have them installed already.
!pip3 install optuna xgboost
Optuna currently throws some warnings about forthcoming deprecations (including for the suggest_loguniform() method we’ll use later), so we’ll hide them for now, as they don’t affect the functionality of the package.
import warnings
warnings.filterwarnings('ignore')
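Suppressing every warning like this is a little heavy-handed. If you prefer, a more targeted alternative is to silence just the deprecation-related categories:

# Silence only deprecation-related warnings, rather than everything
warnings.filterwarnings('ignore', category=FutureWarning)
warnings.filterwarnings('ignore', category=DeprecationWarning)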
Next, import the packages. We’ll be using the XGBoost classifier and the Optuna package, plus some scikit-learn packages for model evaluation. You can use any dataset you like, but for simplicity I’m using the wine classification dataset from scikit-learn, as it will allow us to skip the data preprocessing step.
from xgboost import XGBClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, classification_report
from sklearn.datasets import load_wine
import optuna
Load the data, and split it into X and y variables using the return_X_y parameter. Set as_frame=True to return the data as a Pandas dataframe, which will make it easier to work with.
X, y = load_wine(return_X_y=True, as_frame=True)
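If you want to sanity-check what you’ve loaded: the wine dataset is small, with 178 rows, 13 numeric feature columns, and three target classes.

print(X.shape)           # (178, 13)
print(y.value_counts())  # row counts per target class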
Now split the data into training and test sets using the train_test_split function from scikit-learn. We’ll set the random_state parameter to 1, so that we get the same results every time we run the code, and we’ll allocate 30% of the data to the test set using test_size=0.3.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=1)
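One optional tweak: the three wine classes aren’t perfectly balanced, so if you want the train and test sets to preserve the class proportions, you can pass stratify=y. Note that doing so would change the exact scores shown below.

# Stratified variant of the split, preserving class proportions
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1, stratify=y)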
To see what sort of scores we can achieve with the XGBClassifier model from XGBoost, we’ll first fit a simple base model. We’ll set the use_label_encoder parameter to False, and we’ll set the eval_metric to mlogloss, the multi-class log loss.
model = XGBClassifier(use_label_encoder=False,
                      eval_metric='mlogloss')
model.fit(X_train, y_train)
Now that the base model has been trained, we can make predictions on the test set and evaluate the model.
y_pred = model.predict(X_test)
We’ll use the accuracy_score function from scikit-learn to get the accuracy score, and the classification_report function to get the precision, recall, and F1 scores. We get back an accuracy score of 96.30%, which is pretty good.
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy: %.2f%%" % (accuracy * 100.0))
Accuracy: 96.30%
print(classification_report(y_test, y_pred))
precision recall f1-score support
0 0.92 1.00 0.96 23
1 1.00 0.89 0.94 19
2 1.00 1.00 1.00 12
accuracy 0.96 54
macro avg 0.97 0.96 0.97 54
weighted avg 0.97 0.96 0.96 54
Next, we’ll use Optuna to tune the hyperparameters of the XGBoost model. We’ll start by creating an objective function, which will be passed to the study.optimize function. The objective function takes a trial parameter, an instance of Optuna’s Trial class, and returns the accuracy score.
def objective(trial):
    """Define the objective function"""
    params = {
        'max_depth': trial.suggest_int('max_depth', 1, 9),
        'learning_rate': trial.suggest_loguniform('learning_rate', 0.01, 1.0),
        'n_estimators': trial.suggest_int('n_estimators', 50, 500),
        'min_child_weight': trial.suggest_int('min_child_weight', 1, 10),
        'gamma': trial.suggest_loguniform('gamma', 1e-8, 1.0),
        'subsample': trial.suggest_loguniform('subsample', 0.01, 1.0),
        'colsample_bytree': trial.suggest_loguniform('colsample_bytree', 0.01, 1.0),
        'reg_alpha': trial.suggest_loguniform('reg_alpha', 1e-8, 1.0),
        'reg_lambda': trial.suggest_loguniform('reg_lambda', 1e-8, 1.0),
        'eval_metric': 'mlogloss',
        'use_label_encoder': False
    }

    # Fit the model
    optuna_model = XGBClassifier(**params)
    optuna_model.fit(X_train, y_train)

    # Make predictions
    y_pred = optuna_model.predict(X_test)

    # Evaluate predictions
    accuracy = accuracy_score(y_test, y_pred)
    return accuracy
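One caveat worth flagging: this objective scores each trial on the test set, so Optuna is effectively tuning to the test set and the final evaluation will look optimistic. A more robust variant scores each trial with cross-validation on the training data only, keeping the test set for the final check. Here’s a sketch using a trimmed-down version of the same search space (objective_cv is just an illustrative name):

from sklearn.model_selection import cross_val_score

def objective_cv(trial):
    """Score each trial with 5-fold cross-validation on the training set"""
    params = {
        'max_depth': trial.suggest_int('max_depth', 1, 9),
        'learning_rate': trial.suggest_loguniform('learning_rate', 0.01, 1.0),
        'n_estimators': trial.suggest_int('n_estimators', 50, 500),
        'eval_metric': 'mlogloss',
        'use_label_encoder': False
    }
    model = XGBClassifier(**params)
    # Mean accuracy across five folds; the test set stays untouched
    scores = cross_val_score(model, X_train, y_train, cv=5, scoring='accuracy')
    return scores.mean()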
Next, we need to define an Optuna study. We’ll set the direction parameter to maximize, as we want to maximise the accuracy score.
study = optuna.create_study(direction='maximize')
[I 2022-09-28 19:54:56,765] A new study created in memory with name: no-name-b8402a5f-29e1-44a1-9fd8-183717d58b3f
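By default, the study uses Optuna’s TPE sampler, which is stochastic, so each run of the search will explore slightly different values. If you want reproducible trials, you can optionally seed the sampler when creating the study:

# Optional: seed the default TPE sampler for reproducible searches
study = optuna.create_study(direction='maximize',
                            sampler=optuna.samplers.TPESampler(seed=1))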
Finally, we can run the objective function using the study.optimize function. We’ll set the n_trials parameter to 100, which means Optuna will run the objective function 100 times as it tries to find the best hyperparameters.
If you’ve ever used GridSearchCV for hyperparameter tuning, you’ll know that it can take a long time to run, especially if you have a large number of hyperparameters to tune. Optuna is much faster, as it uses Bayesian optimization to find the best hyperparameters.
study.optimize(objective, n_trials=100)
Depending on the speed of your data science workstation, the hyperparameter tuning should be complete in a minute or so. It will take much longer on a larger dataset, or if you define more hyperparameters to tune.
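Optuna logs one line per trial, so expect 100 lines of output from this run. If that’s too noisy, you can turn the logging down before calling study.optimize:

# Optional: only show warnings and errors, not per-trial log lines
optuna.logging.set_verbosity(optuna.logging.WARNING)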
Now we can print the best parameters, and the best accuracy score achieved during the study trials.
print('Number of finished trials: {}'.format(len(study.trials)))
print('Best trial:')
trial = study.best_trial
print(' Value: {}'.format(trial.value))
print(' Params: ')
for key, value in trial.params.items():
    print('    {}: {}'.format(key, value))
Number of finished trials: 100
Best trial:
Value: 1.0
Params:
max_depth: 6
learning_rate: 0.2236810727625855
n_estimators: 50
min_child_weight: 5
gamma: 0.10770487614463455
subsample: 0.29658342065443705
colsample_bytree: 0.08778804479025275
reg_alpha: 0.05958436598353962
reg_lambda: 0.1439741099392137
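Optuna also exposes the same information directly on the study object, which is handy when you don’t need the formatted printout:

print(study.best_value)   # the best accuracy achieved
print(study.best_params)  # a dict of the winning hyperparameters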
Now we’ll save the best parameters to a dictionary, and we’ll use the XGBClassifier class to create a new model with the best parameters.
params = trial.params
model = XGBClassifier(**params)
model.fit(X_train, y_train)
Once the model has been retrained with the best hyperparameters, we can make predictions on the test set and evaluate the model.
y_pred = model.predict(X_test)
We’ll use the same evaluation techniques as before, assessing performance with the accuracy_score function and the classification_report function. The results show that we now get an accuracy score of 100%, which is perfect. Bear in mind, though, that because the objective function scored each trial on this same test set, a perfect score here is likely a little optimistic; the cross-validation variant sketched earlier gives a fairer picture.
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy after tuning: %.2f%%" % (accuracy * 100.0))
Accuracy after tuning: 100.00%
print(classification_report(y_test, y_pred))
precision recall f1-score support
0 1.00 1.00 1.00 23
1 1.00 1.00 1.00 19
2 1.00 1.00 1.00 12
accuracy 1.00 54
macro avg 1.00 1.00 1.00 54
weighted avg 1.00 1.00 1.00 54
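If you want to dig into how the study behaved, Optuna also ships a visualization module (it requires the plotly package) that can plot, among other things, the optimization history and the relative importance of each hyperparameter:

from optuna.visualization import plot_optimization_history, plot_param_importances

# How the best score improved over the 100 trials
plot_optimization_history(study).show()

# Which hyperparameters mattered most during the search
plot_param_importances(study).show()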
Matt Clarke, Tuesday, September 27, 2022