Optuna Your Model Hyperparameters

We explore the popular open-source package Optuna to demonstrate how you can optimize your model hyperparameters and build the best synthetic model possible.

In machine learning, hyperparameter tuning is the process of finding the optimal set of hyperparameter values for your model before the learning process begins. Optuna is an awesome open-source framework for automating this task. There are many other popular open-source hyperparameter optimization frameworks out there, such as Hyperopt, Scikit-Optimize and Ray Tune. What I love most about Optuna, though, is its overall ease of use and, specifically, how easy it makes parallelization. In this article, we walk you through how to tune multiple models and run multiple Optuna trials all in parallel and from the same Jupyter notebook. You can follow along with Gretel.ai’s notebook located here.

Optuna Functionality

Optuna uses the notion of studies and trials. A study contains all the information surrounding the tuning of a specific model. A trial is a single model training run that tries out a specific set of hyperparameter values. Each trial calls the same objective function, which specifies the hyperparameters and ranges you’d like to tune and returns a value (such as model accuracy) that you’d like to optimize.
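To make the pattern concrete, here is a minimal, self-contained Optuna example, unrelated to the synthetic-model tuning below, that searches for the value of x minimizing a toy objective:

import optuna

def toy_objective(trial: optuna.Trial) -> float:
    # Each trial suggests a candidate x; Optuna searches for the x that
    # minimizes (x - 2) ** 2
    x = trial.suggest_float("x", -10.0, 10.0)
    return (x - 2) ** 2

study = optuna.create_study(direction="minimize")
study.optimize(toy_objective, n_trials=50)
print(study.best_params)   # best value of x found across the 50 trials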

In our example notebook, we tuned multiple Gretel.ai synthetic models and optimized each using our Synthetic Quality Score (SQS). We then told Optuna to use an SQLite database (pre-packaged with most operating systems) so that multiple Optuna trials can run in parallel. Next, we placed the code that runs an Optuna trial in a Python module, which lets the Jupyter notebook execute that module in parallel as many times as we’d like.

Below are the contents of our objective function:

import time

import optuna

from gretel_client import projects
from gretel_client import create_project
from gretel_client.config import RunnerMode
from gretel_client.helpers import poll

def objective(trial: optuna.Trial):
 
    # Set which hyperparameters you want to tune

    config['models'][0]['synthetics']['params']['rnn_units'] = trial.suggest_int(name="rnn_units", low=64, high=1024, step=64)
    config['models'][0]['synthetics']['params']['dropout_rate'] = trial.suggest_float("dropout_rate", .1, .75)
    config['models'][0]['synthetics']['params']['gen_temp'] = trial.suggest_float("gen_temp", .8, 1.2)
    config['models'][0]['synthetics']['params']['learning_rate'] = trial.suggest_float("learning_rate",  .0005, 0.01, step=.0005)
    config['models'][0]['synthetics']['params']['reset_states'] = trial.suggest_categorical(
        "reset_states", choices=[True, False])
    config['models'][0]['synthetics']['params']['vocab_size'] = trial.suggest_categorical(
        "vocab_size", choices=[0, 5000, 10000, 15000, 20000, 25000, 30000, 35000, 40000, 45000, 50000])

    # Create a new Gretel project
    
    seconds = int(time.time())
    project_name = "Tuning Experiment" + str(seconds)
    project = create_project(display_name=project_name)
    
    # Create a Gretel synthetic model 
    
    model = project.create_model_obj(model_config=config)
    model.data_source = dataset
    model.submit(upload_data_source=True)
        
    # Watch for model completion
                
    status = "active"
    sqs = 0

    while status in ("active", "pending"):
        # Sleep a bit between polls
        time.sleep(60)
        model._poll_job_endpoint()
        status = model.__dict__['_data']['model']['status']
        if status == "completed":
            report = model.peek_report()
            if report:
                sqs = report['synthetic_data_quality_score']['score']
            else:
                sqs = 0
        elif status == "error":
            sqs = 0
           
    project.delete()
    
    # Return the model Synthetic Quality Score
    
    return sqs

Note how, at the beginning of the objective function, we use the trial.suggest_* functions to specify which parameters we’d like to tune and over what values. The trial.suggest_int and trial.suggest_float functions let you specify a range and how you’d like to explore that range. The trial.suggest_categorical function lets you specify a list of categorical values for Optuna to try. You can read more about these and other Optuna options here. After specifying which hyperparameters to tune, our objective function trains a synthetic model, polls for completion and then returns our SQS score. This function lives in our external Python module, so we can run it multiple times.

In our main Jupyter notebook, we start by reading in the default synthetic configuration file, which contains all the default hyperparameter values needed to train a model:

from smart_open import open
import yaml

with open("https://raw.githubusercontent.com/gretelai/gretel-blueprints/main/config_templates/gretel/synthetics/default.yml", 'r') as stream:
    config = yaml.safe_load(stream)
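As a quick sanity check (this snippet isn’t from the notebook itself), you can print the default values that the objective function will later overwrite; the keys below mirror the ones used in the objective function:

# Inspect the default synthetics hyperparameters loaded from the template
params = config['models'][0]['synthetics']['params']
for name in ["rnn_units", "dropout_rate", "gen_temp", "learning_rate", "reset_states", "vocab_size"]:
    print(name, "=", params.get(name))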

We then create a function to instantiate an Optuna study:

import subprocess

def create_study(study_name, dataset, trial_job_cnt, trials_per_job, api_key, storage, sampler):
    
    study = optuna.create_study(study_name=study_name, storage=storage, sampler=sampler, direction="maximize")
    
    # Tell Optuna to start with our default config settings

    study.enqueue_trial(
        {
        "vocab_size": config['models'][0]['synthetics']['params']['vocab_size'],
        "reset_states": config['models'][0]['synthetics']['params']['reset_states'],
        "rnn_units": config['models'][0]['synthetics']['params']['rnn_units'],
        "learning_rate": config['models'][0]['synthetics']['params']['learning_rate'],
        "gen_temp": config['models'][0]['synthetics']['params']['gen_temp'],
        "dropout_rate": config['models'][0]['synthetics']['params']['dropout_rate'],
        }
    )
    
    # Launch "trial_job_cnt" worker processes, each running "trials_per_job" trials
    
    trials_per_job_arg = str(trials_per_job)
    for i in range(trial_job_cnt):
        subprocess.Popen(["python", "Optuna_Trials.py", study_name, trials_per_job_arg, dataset, api_key, storage])
    
    return study

The very first thing we do in the above function is instantiate a new Optuna study using “optuna.create_study”. This function takes as input the study name you’d like to use, the storage URL of your SQLite database and whether you’d like to maximize or minimize the output of your objective function. You don’t need to worry about creating the SQLite database; if it’s not there, Optuna will create it for you. The function also optionally takes a specification of which Optuna sampler you’d like to use. If you’re just starting out, a good rule of thumb is to use the Random sampler if you have a lot of computing resources; otherwise, the default TPE sampler is a good option. You can read more about the samplers available here.
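For reference, the “storage” and “sampler” arguments passed into “create_study” might be set up like this (the database filename is just an example):

import optuna

# SQLite storage URL; Optuna creates the database file if it doesn't exist
storage = "sqlite:///optuna_tuning.db"

# Default TPE sampler; swap in the random sampler if you have plenty of compute
sampler = optuna.samplers.TPESampler(seed=42)
# sampler = optuna.samplers.RandomSampler(seed=42)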

Next, we use Optuna’s “study.enqueue_trial” method to tell Optuna to try the hyperparameters in our default configuration first. Oftentimes I’ll also queue up other configurations I’d specifically like Optuna to try. You can view Gretel’s recommended configurations for different dataset characteristics here. We then use “subprocess.Popen” to execute the Python module that runs the Optuna trials. We repeat this once for each process we want to run in parallel, each time telling the process how many trials it should execute.
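We don’t show the contents of “Optuna_Trials.py” in this article, but as a rough sketch, assuming the module defines “config”, “dataset” and the objective function shown earlier, and reads its arguments in the order they are passed above, its entry point could look something like this:

# Optuna_Trials.py (sketch only; the real module also defines `config`,
# `dataset` and the `objective` function shown earlier in this post)
import sys

import optuna

# Arguments arrive in the order passed by subprocess.Popen above
study_name, trial_cnt, dataset, api_key, storage = sys.argv[1:6]

# ... configure the Gretel client here using api_key ...

# Attach to the shared study in the SQLite database and run this
# process's share of the trials
study = optuna.load_study(study_name=study_name, storage=storage)
study.optimize(objective, n_trials=int(trial_cnt))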

The main processing flow in our Jupyter notebook is then to read in the filenames of the datasets we’d like to tune models on, then call our above “create_study” function for each dataset:

import pandas as pd

datasets = pd.read_csv(dataset_list)

studies = []

for i in range(len(datasets)):
    dataset = datasets.loc[i]["filename"]
    study_name = study_base_name + str(i)
    studies.append(create_study(study_name, dataset, trial_job_cnt, trials_per_job, api_key, storage, sampler))

We can then watch for study completion using this snippet of code:

for i in range(len(studies)):
    study = studies[i]
    print("Study " + str(i))
    for j in range(len(study.trials)):
        state = study.trials[j].state.name
        if state == "COMPLETE":
            sqs = study.trials[j].values[0]
            print("\tTrial " + str(j) + " has state " + state + " and SQS " + str(sqs))
        else:
            print("\tTrial " + str(j) + " has state " + state)

The Jupyter notebook then contains some handy code for gathering all the Optuna results into one dataframe if you’d like to study the tuning history in-depth. You can refer to our notebook to see how this works.
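If you’d rather roll your own aggregation, Optuna’s built-in “trials_dataframe” helper gets you most of the way there; this sketch assumes the “studies” list from above:

import pandas as pd

# Collect every study's trial history into a single dataframe
frames = []
for study in studies:
    df = study.trials_dataframe()   # one row per trial: params, value, state, timings
    df["study_name"] = study.study_name
    frames.append(df)

all_results = pd.concat(frames, ignore_index=True)
print(all_results[["study_name", "number", "value", "state"]].head())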

Optuna Visualization

Optuna also contains a plethora of visualization routines both to monitor the progression of your tuning and to assess the results upon completion. I like to use the “plot_optimization_history(study)” command to watch the optimization history of a study while waiting for it to complete.

Optimization History Plot

Upon completion, I usually first use the “plot_param_importances(study)” command to study which hyperparameters were most influential in the tuning.

Hyperparameter Importances

Another insightful graph is the parallel coordinates one using “plot_parallel_coordinate(study)”.

Parallel Coordinate Plot

You can also use the “plot_slice(study)” command to view how each specific hyperparameter homed in on an optimal value, and the “plot_contour(study)” command to look at the relationship between hyperparameter values.
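The plotting functions mentioned above live in “optuna.visualization” (there is also a matplotlib variant under “optuna.visualization.matplotlib”); a typical set of calls for one of the studies created earlier might look like this:

from optuna.visualization import (
    plot_optimization_history,
    plot_param_importances,
    plot_parallel_coordinate,
    plot_slice,
    plot_contour,
)

study = studies[0]                       # pick one of the studies created earlier

plot_optimization_history(study).show()  # SQS over the course of the trials
plot_param_importances(study).show()     # which hyperparameters mattered most
plot_parallel_coordinate(study).show()   # hyperparameter combinations vs. SQS
plot_slice(study).show()                 # how each hyperparameter homed in on a value
plot_contour(study, params=["rnn_units", "learning_rate"]).show()  # pairwise relationship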

Handy Optuna Commands

There are a variety of handy Optuna commands I like to use either while a study is progressing or once it’s complete. Here are a few of my favorites:

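For example (these are all standard Optuna attributes and helpers; “study” is one of the studies created earlier, and the study name and database filename below are placeholders):

import optuna

# The best trial found so far, its hyperparameters and its score
best = study.best_trial
print(study.best_params)     # dict of the winning hyperparameter values
print(study.best_value)      # the highest SQS seen so far

# Full trial history as a dataframe for ad-hoc analysis
df = study.trials_dataframe()

# Re-attach to an existing study from another notebook or process
study = optuna.load_study(study_name="my_study", storage="sqlite:///optuna_tuning.db")

# Remove a study (and its trials) from storage when you're done with it
optuna.delete_study(study_name="my_study", storage="sqlite:///optuna_tuning.db")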

Conclusion

Tuning a model’s hyperparameters can be a challenging and time-consuming process, yet it’s often necessary if you want to build the most accurate model possible. Optuna is an easy-to-use and very effective framework for expediting this process. Here at Gretel, we have a series of preset hyperparameter configurations tuned for specific dataset characteristics. If you have an unusual dataset that doesn’t quite fit one of these configurations, or you’d like to make sure you’re building the best synthetic model possible, go ahead and give Optuna a try. It’s simple, fast, and very effective.

Please feel free to reach out to me at Gretel.ai with any questions. Thank you for reading!