Introducing Gretel Benchmark

Benchmark is your toolkit to evaluate any synthetic data algorithm on any production dataset

What is Gretel Benchmark?

Today we’re announcing the release of Gretel Benchmark, a Python library for comparing any models that generate synthetic data. It comes with a set of standardized tests that evaluate those algorithms on synthetic data quality, runtime, and other machine learning use cases.

Getting started 

Want to jump right in? You can use Gretel Benchmark by installing the `gretel-trainer` package.

Get started with the quickstart Benchmark notebook and the Benchmark documentation.

Keep reading to learn about important features and the detailed Benchmark evaluation report.

Models

Your custom models

It’s easy to define custom models, so you can compare any synthetic data generation algorithm in Benchmark, not just Gretel models.

To provide your own model implementation, define a Python class that meets this interface:

import pandas as pd

class MyCustomModel:
    def train(self, source: str, **kwargs) -> None:
        # your training code here
        ...

    def generate(self, **kwargs) -> pd.DataFrame:
        # your generation code here
        ...

Learn more here about creating your custom model. Make sure to install any third-party libraries you use as dependencies wherever you are running Benchmark.
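As a minimal sketch of the interface above, the class below "trains" by loading a CSV and "generates" by resampling its rows with replacement. It is a toy baseline rather than a real synthetic data algorithm. The class name `BootstrapModel`, the `num_records` keyword, and the assumption that `source` is a path to a CSV file are all illustrative and not taken from the Benchmark documentation.

```python
import pandas as pd


class BootstrapModel:
    """Toy baseline: resamples training rows with replacement."""

    def train(self, source: str, **kwargs) -> None:
        # Assumption: `source` is a path to a CSV file.
        self._train_df = pd.read_csv(source)

    def generate(self, **kwargs) -> pd.DataFrame:
        # `num_records` is a hypothetical keyword, not part of the interface;
        # by default, return as many rows as the training data had.
        num_records = kwargs.get("num_records", len(self._train_df))
        sampled = self._train_df.sample(
            n=num_records, replace=True, random_state=0
        )
        return sampled.reset_index(drop=True)
```

A baseline like this can be handy as a sanity check: a real model should preserve the statistics of the source data without simply copying its rows.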

Gretel models

We’ve also made it easy for you to use Gretel models in Benchmark. Here’s a nifty summary of all the available Gretel models with default configurations:

GretelAuto

  • This model will automatically pick the best solution between GretelLSTM and GretelCTGAN for your dataset (see more below on the two models). This can be helpful if you want the Gretel engine to select the best model based on the characteristics of your dataset.

GretelLSTM 

  • This model handles a variety of synthetic data tasks and works with time-series, tabular, and text data. GretelLSTM is generally most useful for datasets with a few thousand records and upward. Datasets can include a mix of categorical, continuous, and numerical values. 

GretelCTGAN

  • This model works well for high-dimensional, largely numeric data. You can use GretelCTGAN for datasets with more than 20 columns and/or 50,000 rows.
  • Data requirements: Not ideal if the dataset contains free text fields.

GretelGPT 

  • This model is useful for natural language or plain text datasets such as reviews, tweets, and conversations. 
  • Data requirements: Dataset must be single-column.

GretelAmplify

  • This model is great for generating lots of data quickly. 
  • Note: GretelAmplify is not a neural network model but instead uses statistical means to generate large amounts of data from an input dataset. The Synthetic Data Quality Score (SQS) for data generated using GretelAmplify may be lower.
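The rules of thumb in the list above can be written down as a small helper. This is only a sketch that encodes the guidance in this post; it is not the selection logic GretelAuto uses, and the function name `suggest_model` and its parameters are hypothetical.

```python
def suggest_model(
    rows: int,
    cols: int,
    has_free_text: bool = False,
    single_text_column: bool = False,
) -> str:
    """Encode this post's rules of thumb; not Gretel's own selection logic."""
    if single_text_column:
        # Natural language datasets must be single-column for GretelGPT.
        return "GretelGPT"
    if (cols > 20 or rows > 50_000) and not has_free_text:
        # High-dimensional or long, largely numeric data.
        return "GretelCTGAN"
    # General-purpose default for tabular, time-series, and text data.
    return "GretelLSTM"


print(suggest_model(rows=41188, cols=21))
print(suggest_model(rows=1728, cols=7))
```

If generation speed matters more than quality, GretelAmplify is the alternative to consider regardless of shape.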

You can also easily modify the Gretel model configs with: 

class CustomizedLSTM(GretelModel):
    config = {...} # define configuration here

Find out more about how to use Gretel model classes in the Benchmark documentation. 

Data

Benchmark allows you to compare the synthetic data quality and runtime of multiple models (whether custom or Gretel models) on multiple datasets. 

To use your own data in Benchmark, you can follow the instructions for `make_dataset` in the docs or check out the Benchmark notebook.

If you need test data, we also provide a list of publicly available datasets that are popular for synthetic data use cases like those in finance, e-commerce, healthcare, and more. You can view and select datasets in Benchmark using these functions:

def list_gretel_datasets(datatype: Optional[Union[Datatype, str]] = None, tags: Optional[List[str]] = None) -> List[Dataset]:
    """Returns a list of Gretel-curated datasets matching the specified datatype and tags.

    Uses “and” semantics, i.e. only returns datasets that match all supplied values.
    `datatype` (optional): Datatype to filter on.
    `tags` (optional): Tags to filter on. Various tags are applied to Gretel-curated datasets, see below.
    """

def get_gretel_dataset(name: str) -> Dataset:
    """Fetches a Gretel-curated dataset from Gretel’s S3 bucket.

    `name` (required): The name of the dataset.
    This function will raise an exception if no dataset exists with the supplied name.
    """
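To illustrate what "and" semantics means for the datatype and tag filters, here is a small stand-alone sketch over an in-memory catalog. It does not call the Benchmark library: `CatalogEntry`, `filter_catalog`, and the tag values are hypothetical, and only the dataset names and datatypes are borrowed from the report later in this post.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class CatalogEntry:
    # Hypothetical stand-in for a curated dataset record.
    name: str
    datatype: str
    tags: List[str] = field(default_factory=list)


def filter_catalog(
    catalog: List[CatalogEntry],
    datatype: Optional[str] = None,
    tags: Optional[List[str]] = None,
) -> List[CatalogEntry]:
    """Keep entries that match the datatype AND carry every requested tag."""
    wanted = set(tags or [])
    return [
        entry
        for entry in catalog
        if (datatype is None or entry.datatype == datatype)
        and wanted.issubset(entry.tags)
    ]


# Tag values here are made up for the example.
catalog = [
    CatalogEntry("adult", "tabular_mixed", ["government"]),
    CatalogEntry("bank_marketing_small", "tabular_mixed", ["finance", "small"]),
    CatalogEntry("dow_jones_index", "time_series", ["finance", "small"]),
]

matches = filter_catalog(catalog, datatype="tabular_mixed", tags=["finance"])
print([entry.name for entry in matches])
```

Omitting a filter leaves it unconstrained, so calling the helper with no arguments beyond the catalog returns every entry.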

Evaluations

We created a set of standard tests for synthetic data generation algorithms; together they make up the Gretel Benchmark evaluations. The Benchmark report shows: 

  • Data type 
  • Data shape 
  • Synthetic Data Quality Score (SQS): an evaluation, developed by Gretel, of the quality of synthetic data. Learn more about SQS here
  • Train time 
  • Generate time 
  • Total runtime 
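The three timing metrics relate in a simple way: in the report in this post, total runtime is the sum of train time and generate time. The sketch below shows one way such timings could be collected for any object with `train` and `generate` methods. How Benchmark measures them internally is not covered here, and `SleepyModel` and `time_model` are hypothetical names used only for illustration.

```python
import time


class SleepyModel:
    """Hypothetical stand-in model whose steps just sleep briefly."""

    def train(self, source: str, **kwargs) -> None:
        time.sleep(0.02)

    def generate(self, **kwargs):
        time.sleep(0.01)
        return []


def time_model(model, source: str) -> dict:
    """Time the train and generate steps separately, in seconds."""
    start = time.perf_counter()
    model.train(source)
    train_sec = time.perf_counter() - start

    start = time.perf_counter()
    model.generate()
    generate_sec = time.perf_counter() - start

    return {
        "train_sec": train_sec,
        "generate_sec": generate_sec,
        "total_sec": train_sec + generate_sec,
    }


result = time_model(SleepyModel(), "unused.csv")
print({key: round(value, 2) for key, value in result.items()})
```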

The Benchmark report

Want to evaluate Gretel models on your industry use case? For a quick and easy look into how different Gretel models perform on popular machine learning datasets, check out our Benchmark report below. When you run Benchmark, you’ll also see an evaluation report like this one.

You can use a Benchmark report like the one shown here to evaluate which Gretel model is best for your synthetic data goals. For example, GretelLSTM consistently generates synthetic data with a high Synthetic Data Quality Score (SQS) on multiple types of tabular or complex data. As seen in the results below, GretelCTGAN is great for particularly long or wide datasets and often runs faster than GretelLSTM. If you’re looking to quickly generate lots of data, GretelAmplify can produce results in as little as 1/10 of the time the other models take (check out the fast train and generate times!). GretelGPT (listed as GretelGPTX in the report) generates high-quality synthetic data for natural language datasets. Depending on your specific goals or constraints, you may find particular Gretel models best suited for your use case. You can reference the Benchmark report below to guide how you evaluate Gretel models, or of course, try Benchmark yourself!

| Industry | Input data | Model | Data type | Rows | Cols | SQS | Train time (sec) | Generate time (sec) | Total time (sec) |
|---|---|---|---|---|---|---|---|---|---|
| Ads, Finance, Marketing | bank_marketing_large/data.csv | GretelAmplify | tabular_mixed | 41188 | 21 | 73 | 36.07 | 29.75 | 65.82 |
| | bank_marketing_large/data.csv | GretelCTGAN | tabular_mixed | 41188 | 21 | 85 | 1300.75 | 33.24 | 1333.99 |
| | bank_marketing_large/data.csv | GretelLSTM | tabular_mixed | 41188 | 21 | 84 | 317.79 | 401.04 | 718.83 |
| | bank_marketing_small/data.csv | GretelAmplify | tabular_mixed | 4521 | 17 | 80 | 24.41 | 23.81 | 48.22 |
| | bank_marketing_small/data.csv | GretelCTGAN | tabular_mixed | 4521 | 17 | 84 | 169.32 | 175.63 | 344.95 |
| | bank_marketing_small/data.csv | GretelLSTM | tabular_mixed | 4521 | 17 | 84 | 326.26 | 96.73 | 422.99 |
| | dow_jones_index/data.csv | GretelAmplify | time_series | 750 | 16 | 76 | 81.5 | 23.32 | 104.82 |
| | dow_jones_index/data.csv | GretelCTGAN | time_series | 750 | 16 | 70 | 221.58 | 129.15 | 350.73 |
| | dow_jones_index/data.csv | GretelLSTM | time_series | 750 | 16 | 83 | 424.2 | 64.66 | 488.86 |
| | banking77/data.csv | GretelAmplify | natural_language | 10016 | 1 | 100 | 84.42 | 58.36 | 142.78 |
| | banking77/data.csv | GretelLSTM | natural_language | 10016 | 1 | 100 | 318.19 | 96.17 | 414.36 |
| | banking77/data.csv | GretelGPTX | natural_language | 10016 | 1 | 100 | 487.563675 | | 487.56 |
| E-commerce | bike_sales/data.csv | GretelAmplify | tabular_numeric | 16519 | 24 | 79 | 119.94 | 30 | 149.94 |
| | bike_sales/data.csv | GretelLSTM | tabular_numeric | 16519 | 24 | 88 | 911.59 | 249.68 | 1161.27 |
| | car_evaluation/data.csv | GretelAmplify | tabular_numeric | 1728 | 7 | 85 | 24.06 | 23.7 | 47.76 |
| | car_evaluation/data.csv | GretelCTGAN | tabular_numeric | 1728 | 7 | 77 | 201.19 | 44.5 | 245.69 |
| | car_evaluation/data.csv | GretelLSTM | tabular_numeric | 1728 | 7 | 87 | 357.66 | 54.02 | 411.68 |
| | credit_card_payments/data.csv | GretelAmplify | tabular_mixed | 30000 | 25 | 74 | 107.13 | 29.97 | 137.1 |
| | credit_card_payments/data.csv | GretelCTGAN | tabular_mixed | 30000 | 25 | 83 | 1229 | 33.4 | 1262.4 |
| | credit_card_payments/data.csv | GretelLSTM | tabular_mixed | 30000 | 25 | 81 | 1468.11 | 579.92 | 2048.03 |
| | olist_order_payments/data.csv | GretelAmplify | tabular_numeric | 103886 | 5 | 69 | 529.06 | 40.41 | 569.47 |
| | olist_order_payments/data.csv | GretelLSTM | tabular_numeric | 103886 | 5 | 93 | 4201.89 | 897.22 | 5099.11 |
| Employment | data_science_job_candidates/data.csv | GretelAmplify | tabular_mixed | 19158 | 14 | 88 | 107.66 | 23.26 | 130.92 |
| | data_science_job_candidates/data.csv | GretelCTGAN | tabular_mixed | 19158 | 14 | 90 | 609.02 | 128.11 | 737.13 |
| | data_science_job_candidates/data.csv | GretelLSTM | tabular_mixed | 19158 | 14 | 93 | 358.21 | 276.29 | 634.5 |
| | ibm_employee_attrition/data.csv | GretelAmplify | tabular_mixed | 1470 | 37 | 88 | 24.09 | 20.5 | 44.59 |
| | ibm_employee_attrition/data.csv | GretelCTGAN | tabular_mixed | 1470 | 37 | 80 | 368.13 | 33.93 | 402.06 |
| | ibm_employee_attrition/data.csv | GretelLSTM | tabular_mixed | 1470 | 37 | 93 | 365.18 | 127.79 | 492.97 |
| Energy, Telecom | energydata_complete/data.csv | GretelAmplify | time_series | 19735 | 29 | 74 | 103.68 | 33.09 | 136.77 |
| | energydata_complete/data.csv | GretelLSTM | time_series | 19735 | 29 | 93 | 1531.88 | 400.29 | 1932.17 |
| | telco_customer_churn/data.csv | GretelAmplify | tabular_mixed | 7043 | 33 | 82 | 40.84 | 30.17 | 71.01 |
| | telco_customer_churn/data.csv | GretelCTGAN | tabular_mixed | 7043 | 33 | 79 | 5911.34 | 55.36 | 5966.7 |
| | telco_customer_churn/data.csv | GretelLSTM | tabular_mixed | 7043 | 33 | 76 | 787.7 | 155.55 | 943.25 |
| Environment, Food | air_quality_uci/data.csv | GretelAmplify | time_series | 9357 | 15 | 65 | 96.15 | 52.47 | 148.62 |
| | air_quality_uci/data.csv | GretelCTGAN | time_series | 9357 | 15 | 62 | 6656.12 | 54.49 | 6710.61 |
| | air_quality_uci/data.csv | GretelLSTM | time_series | 9357 | 15 | 89 | 398.42 | 211.66 | 610.08 |
| | winequality_red/data.csv | GretelAmplify | tabular_numeric | 1599 | 12 | 82 | 81.94 | 23.68 | 105.62 |
| | winequality_red/data.csv | GretelCTGAN | tabular_numeric | 1599 | 12 | 61 | 76.14 | 43.87 | 120.01 |
| | winequality_red/data.csv | GretelLSTM | tabular_numeric | 1599 | 12 | 89 | 221.92 | 54.69 | 276.61 |
| | winequality_white/data.csv | GretelAmplify | tabular_numeric | 4898 | 12 | 88 | 24.65 | 23.27 | 47.92 |
| | winequality_white/data.csv | GretelCTGAN | tabular_numeric | 4898 | 12 | 81 | 139.03 | 33.15 | 172.18 |
| | winequality_white/data.csv | GretelLSTM | tabular_numeric | 4898 | 12 | 91 | 287.84 | 76.4 | 364.24 |
| Government | portuguese_election_data/data.csv | GretelAmplify | tabular_numeric | 21643 | 28 | 52 | 31.33 | 107.04 | 138.37 |
| | portuguese_election_data/data.csv | GretelCTGAN | tabular_numeric | 21643 | 28 | 72 | 928.15 | 128.73 | 1056.88 |
| | portuguese_election_data/data.csv | GretelLSTM | tabular_numeric | 21643 | 28 | 81 | 455.56 | 327.19 | 782.75 |
| | adult/data.csv | GretelAmplify | tabular_mixed | 32561 | 15 | 85 | 213.54 | 58.46 | 272 |
| | adult/data.csv | GretelCTGAN | tabular_mixed | 32561 | 15 | 87 | 965.31 | 128.03 | 1093.34 |
| | adult/data.csv | GretelLSTM | tabular_mixed | 32561 | 15 | 94 | 667.21 | 615.08 | 1282.29 |
| Healthcare | processed_cleveland_heart_disease_uci/data.csv | GretelAmplify | tabular_numeric | 303 | 14 | 83 | 35.87 | 23.01 | 58.88 |
| | processed_cleveland_heart_disease_uci/data.csv | GretelCTGAN | tabular_numeric | 303 | 14 | 70 | 66.97 | 33.48 | 100.45 |
| | processed_cleveland_heart_disease_uci/data.csv | GretelLSTM | tabular_numeric | 303 | 14 | 91 | 221.56 | 54.26 | 275.82 |
| | breast_cancer_wisconsin/data.csv | GretelAmplify | tabular_numeric | 699 | 11 | 55 | 23.89 | 23.55 | 47.44 |
| | breast_cancer_wisconsin/data.csv | GretelCTGAN | tabular_numeric | 699 | 11 | 56 | 67.4 | 203.11 | 270.51 |
| | breast_cancer_wisconsin/data.csv | GretelLSTM | tabular_numeric | 699 | 11 | 83 | 206.88 | 64.73 | 271.61 |
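One way to read a report like this is to compare quality and runtime for a single dataset. The sketch below uses the three credit_card_payments/data.csv rows from the report (model, SQS, total time in seconds) and picks out the highest-SQS and the fastest model. The variable names are illustrative; this is plain Python over copied values, not a Benchmark API.

```python
# Rows copied from the report for credit_card_payments/data.csv:
# (model, SQS, total time in seconds).
rows = [
    ("GretelAmplify", 74, 137.1),
    ("GretelCTGAN", 83, 1262.4),
    ("GretelLSTM", 81, 2048.03),
]

best_quality = max(rows, key=lambda row: row[1])
fastest = min(rows, key=lambda row: row[2])
speedup = best_quality[2] / fastest[2]

print(f"Highest SQS: {best_quality[0]} ({best_quality[1]})")
print(f"Fastest: {fastest[0]} ({fastest[2]} sec)")
print(f"{fastest[0]} is about {speedup:.1f}x faster than {best_quality[0]}")
```

On this dataset the trade-off is explicit: the fastest model gives up 9 SQS points in exchange for roughly a ninefold reduction in total runtime.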

Learn more

You can find out more in the Benchmark documentation. Questions or comments? We’re always available in our Discord community - send us a note! Happy synthesizing!