Synthetics

Gretel Copyright 2025

Building Synthetic Datasets with Reasoning Traces Using Gretel Navigator

Incorporating reasoning traces into synthetic datasets enhances AI transparency and trustworthiness.

2025: The Year Synthetic Data Goes Mainstream

How synthetic data is transforming enterprise AI in 2025 by addressing privacy, fine-tuning, and scaling challenges.

Building Datasets to Enable Safer AI Responses

Gretel releases an open synthetic dataset to align language models for safety. Get insights into safety datasets.

Gretel and Google Cloud partner on synthetic data for safer generative AI adoption

Gretel partners with Google Cloud to harness the power of synthetic data and accelerate safer generative AI adoption in the enterprise.

How to Generate Synthetic Data: Tools and Techniques to Create Interchangeable Datasets

Synthetic data is algorithmically generated data that mirrors the statistical properties of the dataset it’s based on. Learn how to make high-quality synthetic data.

Sample-to-Dataset: Generate Rich Datasets from Limited Samples Using Data Designer

Seed to succeed: use the sample-to-dataset workflow to create diverse, large-scale synthetic datasets tailored to your needs with nothing but a few samples.

Compare Synthetic and Real Data on ML Models with the new Gretel Synthetic Data Utility Report

Use Gretel Evaluate's classification and regression tasks to validate synthetic data utility.

Synthesizing Private Patient Data with Gretel: A Step-by-Step Guide

Create privacy-safe synthetic patient data with Gretel, ensuring compliance, secure sharing, and actionable insights for AI and machine learning in healthcare.

Addressing Concerns of Model Collapse from Synthetic Data in AI

How thoughtful, high-quality synthetic data generation, rather than 'indiscriminate' use, can prevent model collapse.

Conditional Text Generation by Fine-Tuning Gretel GPT

Augment machine learning datasets with synthetically generated text and labels using an open-source implementation of GPT-3.

Fine-tune an MPT-7B LLM with Gretel GPT

Learn how to fine-tune and prompt mpt-7b to generate responses matching popular Twitter personalities with Gretel GPT.

Fine-Tuning CodeLlama on Gretel's Synthetic Text-to-SQL Dataset using Amazon SageMaker JumpStart

Fine-tune CodeLlama with Gretel's Synthetic Text-to-SQL on BIRDBench, achieving a 36% relative improvement in EX and 38% in VES.

Accelerating FinTech Innovation with Natural Language to Code

Train financial LLMs with Gretel's Synthetic Text-to-Python dataset to transform natural language into precise, domain-specific Python code for FinTech.

Generate Differentially Private Synthetic Text with Gretel GPT

Safely leverage sensitive or proprietary text data for advanced language model training and fine-tuning

Introducing Gretel Tabular DP: A fast, graph-based synthetic data model with strong differential privacy guarantees

Gretel Tabular DP is a fast and powerful new model for generating high-quality tabular synthetic data with mathematical guarantees of privacy.

How to Create Synthetic Data at High Quality for Fine-Tuning LLMs

Gretel Navigator’s synthetic data generation outperformed OpenAI's GPT-4 by 25.6%, surpassed Llama3-70b by 48.1%, and exceeded human expert-curated data by 73.6%.

Generate Synthetic Databases with Gretel Relational

Introducing Gretel Relational, enabling organizations to generate high-quality synthetic databases while preserving cross-table relationships.

Introducing Model Suites for Synthetic Data Generation

A new standard for ensuring regulatory compliance and managing the complexities of compound AI systems.

Teaching AI to control computers with Gretel Navigator on Amazon Bedrock

Use Gretel Navigator on Amazon Bedrock to create safe, scalable synthetic data for training AI to understand and execute tool commands.

Teaching AI to Think: A New Approach with Synthetic Data and Reflection

Gretel's synthetic GSM8k dataset shows an 84% improvement for AI Reasoning tasks vs synthetic data generated without the Reflection technique.

Synthetic Data and the Data-centric Machine Learning Life Cycle

Gretel's synthetic data platform overcomes challenges across the data-centric machine learning life cycle to enable AI and ML solutions.

Red Teaming Synthetic Data Models

How we implemented a practical attack on a synthetic data model to validate its ability to protect sensitive information under different parameter settings.

Fine-tuning Models for Healthcare via Differentially-Private Synthetic Text

How to safely fine-tune LLMs on sensitive medical text for healthcare AI applications using Gretel and Amazon Bedrock

An Awesome Synthetic Multilingual Prompts Dataset

Gretel's latest open synthetic dataset aims to enhance LLM interactions and contributes to the popular 'awesome-chatGPT-prompts' GitHub repository.

Introducing the world's largest synthetic open-source Text-to-SQL dataset

Gretel releases the largest open-source Text-to-SQL dataset to accelerate AI model training.

GSM-Symbolic: Analyzing LLM Limitations in Mathematical Reasoning and Potential Solutions

What The Recent Paper on LLM Reasoning Got Right—And What It Missed.

Machine Learning Accuracy Using Synthetic Data

Can synthetic data really be used in machine learning? We explore the utility of synthetic data created from popular datasets and tested on popular ML algorithms.

Test Data Generation: Uses, Benefits, and Tips

Test data generation is the process of creating new data that replicates an original dataset. Here’s how developers and data engineers use it.

Transforms and Synthetics on Relational Databases

A walkthrough of our new multi-table transform and multi-table synthetics notebooks, which can be used independently or simultaneously.

What Is Data Simulation?

Data simulation is the process of using large quantities of data to predict events and validate models. Get the full data simulation definition.

Gretel announces partnership with Databricks to improve Enterprise AI performance

Gretel partners with Databricks to seamlessly integrate synthetic data workflows and improve model performance for Enterprise AI.

Introducing Gretel MLOps

Use Gretel's synthetic data platform to replace, augment, or balance training datasets within MLOps pipelines like Vertex AI, Azure ML, and Amazon SageMaker.

Gretel partners with Google Cloud to develop native synthetic data integration, achieves BigQuery designation

Gretel today announced that it has successfully achieved the Google Cloud Ready - BigQuery designation.

How to Generate Best-in-Class Synthetic Time Series Data

Use Gretel DGAN and Gretel Tuner to generate time series data that accurately mirrors complex business rules and sequences

Introducing Gretel's Transform v2

Leverage Gretel’s New Ultra-Fast and Fully Flexible De-Identification and Rule-Based Transformation Solution for HIPAA Compliance.

Filling in sparse tables with Gretel Navigator

How to automatically generate missing tabular data that maintains contextual relevance.

Gretel Demo Day: Exploring the Future of Synthetic Data

Celebrating Gretel's latest innovations by diving into the future of multimodal synthetic data, and our Model Playground and Tabular LLM.

Optimize the Llama-2 Model with Gretel’s Text SQS

How Gretel's data quality analysis tools for evaluating generated text can help you optimize the performance of LLMs, like the Llama-2 model.

Prompting Llama-2 at Scale with Gretel

Discover how to efficiently use Gretel's platform for prompting Llama-2 on large datasets, whether you're completing answers, generating synthetic text, or labeling.

Automate Synthetic Data Pipelines with Gretel Workflows

Gretel Workflows orchestrate synthetic data generation, ensuring users have accurate, up-to-date data for software development, analytics, and ML/AI.

Gretel GPT Sentiment Swap

Let’s fine-tune and prompt a large language model to swap the sentiment of product reviews!

Gretel is live on Google Cloud Marketplace 🎉

Gretel’s suite of privacy-enhancing tools and generative AI models is now available on Google Cloud Marketplace.

Gretel is now available in the AWS Marketplace

Announcing the availability of Gretel’s high-quality synthetic data generation tools in the AWS Marketplace.

Unlocking Adapted LLMs on Enterprise Data

Gretel GPT supports new, state-of-the-art LLMs, and makes it easier for you to trust the privacy and accuracy of LLMs for enterprise use-cases.

Scale Synthetic Data to Millions of Rows with ACTGAN

Discover how Gretel ACTGAN can help businesses generate synthetic data at scale with improved accuracy, faster training, and reduced memory requirements.

Bringing AI-generated images to enterprise use cases

Gretel's new image synthetics enable you to generate high-quality images at scale. Get started today with our free public preview and let us know what you think!

Anonymize tabular data to meet GDPR privacy requirements

Learn how to anonymize tabular data to meet GDPR standards using Gretel's synthetic data APIs.

Synthetic Image Models for Smart Agriculture

Learn how synthetic image models can address data drift and improve a computer vision model's accuracy in unexpected conditions.

Generate synthetic Taylor Swift-like lyrics using Gretel GPT

Celebrate Taylor Swift's The Eras Tour with lyrics generated with Gretel GPT, our synthetic language model.

Introducing Gretel Benchmark

Benchmark is your toolkit to evaluate any synthetic data algorithm on any production dataset

Conditional data generation in 4 lines of code

Augment or balance your ML datasets in minutes with state-of-the-art generative models.

Introducing Gretel Amplify

Generate large volumes of tabular synthetic data at high speed.

Generate synthetic data in 3 lines of code

Learn the simplest way to generate synthetic data without setting up your own infrastructure and GPUs.

Practical Privacy with Synthetic Data

Implementing a practical attack to measure unintended memorization in synthetic data models.

ML Models: Understanding the Fundamentals

Machine learning models can be trained to recognize patterns in datasets. By utilizing algorithms, they can learn to make decisions based on these patterns.

Create Synthetic Time-series Data with DoppelGANger and PyTorch

Generate synthetic time series data with Gretel.ai’s open-source PyTorch implementation of DoppelGANger.

Diffusion models for document synthesis

Explore state-of-the-art image synthetics for business documents using diffusion models.

How to safely work with another company's data

Data sharing is central to modern business but entails risks. Synthetic data can enable data sharing while reducing the risk of privacy-compromising linkage attacks.

How accurate is my synthetic data?

Gretel’s new synthetic report is here, featuring a high-level score and metrics to help you assess the quality of your synthetic data.

Create artificial data with Gretel Synthetics and Google Colaboratory

Use Gretel Synthetics and Colaboratory’s free GPUs to train a model to automatically generate fake, anonymized data with differential privacy guarantees.

Q&A Series: Solving Privacy Problems with Synthetic Data

Answers to some questions about synthetic data that audience members submitted during Gretel's talk at The Rise of Privacy Tech’s Data Privacy Week 2022 conference.

Creating synthetic time series data

A step-by-step guide to creating high-quality synthetic time-series datasets with Python.

Gretel's New Synthetic Performance Report

Gretel's Premium SDK now includes detailed reporting that shows you how accurate your synthetic data's statistical distributions and correlations are.

Improving massively imbalanced datasets in machine learning with synthetic data

Use synthetic data to improve model accuracy for fraud, cyber security, or any classification task with an extremely limited minority class.

Introducing Gretel's Privacy Filters

Create synthetic data that’s safer than ever. Our simple configuration file settings enable you to secure both your data and model from adversarial attacks.

Optuna Your Model Hyperparameters

We explore the popular open-source package Optuna to demonstrate how you can optimize your model hyperparameters and build the best synthetic model possible.

Why Nonprofits Should Care About Synthetic Data

How synthetic data can help nonprofits improve their business operations and their impact on the people they serve.

The Evolution of Gretel's Developer Stack for Synthetic Data

Some of our newest product and technology initiatives that will ensure the Gretel platform continues to grow and evolve with the needs of modern data consumers.

Synthetic Data Configuration Templates

Our new configuration templates will help you pick some of the right parameters needed to train your synthetic data models.

How To Create Differentially Private Synthetic Data

A practical guide to creating differentially private, synthetic data with Python and TensorFlow.

Gretel Synthetics: Introducing v0.10.0

Explore how to create a batch interface with the latest version of Gretel Synthetics on Google Colaboratory.

Automatically Reducing AI Bias With Synthetic Data

Create a fair, balanced, privacy-preserving version of the 1994 US Census dataset using gretel-synthetics.

Build a synthetic data pipeline using Gretel and Apache Airflow

In this blog post, we build an ETL pipeline that generates synthetic data from a PostgreSQL database using Gretel’s Synthetic Data APIs and Apache Airflow.

Evaluating Data Sampling Methods with a Synthetic Quality Score

An evaluation of the effect of sampling procedures on the quality of synthetic tabular data using Gretel.ai's Synthetic Quality Score (SQS).

Measure the Quality of any Synthetic Dataset with Gretel Evaluate

Assessing the efficacy and quality of synthetic data with Gretel Evaluate API.

Deep dive on generating synthetic data for Healthcare

Take a deep dive on training Gretel’s open-source, synthetic data library to generate electronic health records that protect individual privacy (PII).

How to use Weights & Biases with Gretel.ai

How to use Weights & Biases’ ML hyperparameter sweeps tool to optimize the accuracy of your synthetic data.

Walkthrough: Create Synthetic Data from any DataFrame or CSV

Train an AI model to create an anonymized version of your dataset using Python, Pandas, and gretel-synthetics.

Advanced Data Privacy: Gretel Privacy Filters and ML Accuracy

A look at how using Gretel’s Privacy Filters to immunize synthetic datasets against adversarial attacks can impact machine learning accuracy.

Synthetic Time Series Data Creation for Finance

How we generated high-quality synthetic time-series data for one of the largest financial institutions in the world.