Synthetics

Building Synthetic Datasets with Reasoning Traces Using Gretel Navigator
Incorporating reasoning traces into synthetic datasets enhances AI transparency and trustworthiness.
Read more...
2025: The Year Synthetic Data Goes Mainstream
How synthetic data is transforming enterprise AI in 2025 by addressing privacy, fine-tuning, and scaling challenges.
Read more...
Building Datasets to Enable Safer AI Responses
Gretel releases an open synthetic dataset to align language models for safety. Get insights into safety datasets.
Read more...
Gretel and Google Cloud partner on synthetic data for safer generative AI adoption
Gretel partners with Google Cloud to harness the power of synthetic data and accelerate safer generative AI adoption in the enterprise.
Read more...
How to Generate Synthetic Data: Tools and Techniques to Create Interchangeable Datasets
Synthetic data is algorithmically generated data that mirrors the statistical properties of the dataset it’s based on. Learn how to make high-quality synthetic data.
Read more...
Sample-to-Dataset: Generate Rich Datasets from Limited Samples Using Data Designer
Seed to succeed: use the sample-to-dataset workflow to create diverse, large-scale synthetic datasets tailored to your needs with nothing but a few samples.
Read more...
Generate time-series data with Gretel’s new DGAN model
Announcing the open beta release of our DGAN model type.
Read more...
Compare Synthetic and Real Data on ML Models with the new Gretel Synthetic Data Utility Report
Use Gretel Evaluate's classification and regression tasks to validate synthetic data utility.
Read more...
Synthesizing Private Patient Data with Gretel: A Step-by-Step Guide
Create privacy-safe synthetic patient data with Gretel, ensuring compliance, secure sharing, and actionable insights for AI and machine learning in healthcare.
Read more...
Addressing Concerns of Model Collapse from Synthetic Data in AI
How thoughtful, high-quality synthetic data generation, rather than 'indiscriminate' use, can prevent model collapse.
Read more...
Conditional Text Generation by Fine-Tuning Gretel GPT
Augment machine learning datasets with synthetically generated text and labels using an open-source implementation of GPT-3.
Read more...
Fine-tune an MPT-7B LLM with Gretel GPT
Learn how to fine-tune and prompt mpt-7b to generate responses matching popular Twitter personalities with Gretel GPT.
Read more...
Fine-Tuning CodeLlama on Gretel's Synthetic Text-to-SQL Dataset using Amazon SageMaker JumpStart
Fine-tune CodeLlama with Gretel's Synthetic Text-to-SQL on BIRDBench, achieving a 36% relative improvement in EX and 38% in VES.
Read more...
Accelerating FinTech Innovation with Natural Language to Code
Train financial LLMs with Gretel's Synthetic Text-to-Python dataset to transform natural language into precise, domain-specific Python code for FinTech.
Read more...
Generate Differentially Private Synthetic Text with Gretel GPT
Safely leverage sensitive or proprietary text data for advanced language model training and fine-tuning
Read more...
Introducing Gretel Tabular DP: A fast, graph-based synthetic data model with strong differential privacy guarantees
Gretel Tabular DP is a fast and powerful new model to generate high quality tabular synthetic data with mathematical guarantees of privacy
Read more...
How to Create Synthetic Data at High Quality for Fine-Tuning LLMs
Gretel Navigator’s synthetic data generation outperformed OpenAI's GPT-4 by 25.6%, surpassed Llama3-70b by 48.1%, and exceeded human expert-curated data by 73.6%.
Read more...
Generate Synthetic Databases with Gretel Relational
Introducing Gretel Relational, enabling organizations to generate high-quality synthetic databases while preserving cross-table relationships.
Read more...
Introducing Model Suites for Synthetic Data Generation
A new standard for ensuring regulatory compliance and managing the complexities of compound AI systems.
Read more...
Teaching AI to control computers with Gretel Navigator on Amazon Bedrock
Use Gretel Navigator on Amazon Bedrock to create safe, scalable synthetic data for training AI to understand and execute tool commands.
Read more...
Teaching AI to Think: A New Approach with Synthetic Data and Reflection
Gretel's synthetic GSM8k dataset shows an 84% improvement for AI Reasoning tasks vs synthetic data generated without the Reflection technique.
Read more...
Synthetic Data and the Data-centric Machine Learning Life Cycle
Gretel's synthetic data platform overcomes challenges across the data-centric machine learning life cycle to enable AI and ML solutions.
Read more...
Red Teaming Synthetic Data Models
How we implemented a practical attack on a synthetic data model to validate its ability to protect sensitive information under different parameter settings.
Read more...
Fine-tuning Models for Healthcare via Differentially-Private Synthetic Text
How to safely fine-tune LLMs on sensitive medical text for healthcare AI applications using Gretel and Amazon Bedrock
Read more...
An Awesome Synthetic Multilingual Prompts Dataset
Gretel's latest open synthetic dataset aims to enhance LLM interactions and contributes to the popular 'awesome-chatGPT-prompts' GitHub repository.
Read more...
Introducing the world's largest synthetic open-source Text-to-SQL dataset
Gretel releases the largest open-source Text-to-SQL dataset to accelerate AI model training.
Read more...
GSM-Symbolic: Analyzing LLM Limitations in Mathematical Reasoning and Potential Solutions
What The Recent Paper on LLM Reasoning Got Right—And What It Missed.
Read more...
Machine Learning Accuracy Using Synthetic Data
Can synthetic data really be used in machine learning? We explore the utility of synthetic data created from popular datasets and tested on popular ML algorithms.
Read more...
Test Data Generation: Uses, Benefits, and Tips
Test data generation is the process of creating new data that replicates an original dataset. Here’s how developers and data engineers use it.
Read more...
Transforms and Synthetics on Relational Databases
A walkthrough of our new multi-table transform and multi-table synthetics notebooks, which can be used independently or simultaneously.
Read more...
What Is Data Simulation?
Data simulation is the process of using large quantities of data to predict events and validate models. Get the full data simulation definition.
Read more...
Gretel announces partnership with Databricks to improve Enterprise AI performance
Gretel partners with Databricks to seamlessly integrate synthetic data workflows and improve model performance for Enterprise AI.
Read more...
Introducing Gretel MLOps
Use Gretel's synthetic data platform to replace, augment, or balance training datasets within MLOps pipelines like Vertex AI, Azure ML, and Amazon SageMaker.
Read more...
Gretel partners with Google Cloud to develop native synthetic data integration, achieves BigQuery designation
Gretel today announced that it has successfully achieved Google Cloud Ready - BigQuery designation.
Read more...
How to Generate Best-in-Class Synthetic Time Series Data
Use Gretel DGAN and Gretel Tuner to generate time series data that accurately mirrors complex business rules and sequences
Read more...
Introducing Gretel's Transform v2
Leverage Gretel’s New Ultra-Fast and Fully Flexible De-Identification and Rule-Based Transformation Solution for HIPAA Compliance.
Read more...
Filling in sparse tables with Gretel Navigator
How to automatically generate missing tabular data that maintains contextual relevance.
Read more...
Gretel Demo Day: Exploring the Future of Synthetic Data
Celebrating Gretel's latest innovations by diving into the future of multimodal synthetic data, and our Model Playground and Tabular LLM.
Read more...
Optimize the Llama-2 Model with Gretel’s Text SQS
How Gretel's data quality analysis tools for evaluating generated text can help you optimize the performance of LLMs like the Llama-2 model.
Read more...
Prompting Llama-2 at Scale with Gretel
Discover how to efficiently use Gretel's platform for prompting Llama-2 on large datasets, whether you're completing answers, generating synthetic text, or labeling.
Read more...
Automate Synthetic Data Pipelines with Gretel Workflows
Gretel Workflows orchestrate synthetic data generation, ensuring users have accurate, up-to-date data for software development, analytics, and ML/AI.
Read more...
Gretel GPT Sentiment Swap
Let’s fine tune and prompt a large language model to swap the sentiment of product reviews!
Read more...
Gretel is live on Google Cloud Marketplace 🎉
Gretel’s suite of privacy-enhancing tools and generative AI models are now available on Google Cloud Marketplace.
Read more...
Gretel is now available in the AWS Marketplace
Announcing the availability of Gretel’s high quality synthetic data generation tools in the AWS Marketplace.
Read more...
Unlocking Adapted LLMs on Enterprise Data
Gretel GPT supports new, state-of-the-art LLMs, and makes it easier for you to trust the privacy and accuracy of LLMs for enterprise use-cases.
Read more...
Scale Synthetic Data to Millions of Rows with ACTGAN
Discover how Gretel ACTGAN can help businesses generate synthetic data at scale with improved accuracy, faster training, and reduced memory requirements.
Read more...
Bringing AI-generated images to enterprise use cases
Gretel's new image synthetics enable you to generate high-quality images at scale. Get started today with our free public preview and let us know what you think!
Read more...
Anonymize tabular data to meet GDPR privacy requirements
Learn how to anonymize tabular data to meet GDPR standards using Gretel's synthetic data APIs.
Read more...
Synthetic Image Models for Smart Agriculture
Learn how synthetic image models can address data drift and improve a computer vision model's accuracy in unexpected conditions.
Read more...
Generate synthetic Taylor Swift-like lyrics using Gretel GPT
Celebrate Taylor Swift's The Eras Tour with lyrics generated with Gretel GPT, our synthetic language model.
Read more...
Introducing Gretel Benchmark
Benchmark is your toolkit to evaluate any synthetic data algorithm on any production dataset
Read more...
Conditional data generation in 4 lines of code
Augment or balance your ML datasets in minutes with state-of-the-art generative models.
Read more...
Introducing Gretel Amplify
Generate large volumes of tabular synthetic data at high speed.
Read more...
Generate synthetic data in 3 lines of code
Learn the simplest way to generate synthetic data without setting up your own infrastructure and GPUs.
Read more...
Practical Privacy with Synthetic Data
Implementing a practical attack to measure unintended memorization in synthetic data models.
Read more...
ML Models: Understanding the Fundamentals
Machine learning models can be trained to recognize patterns in datasets. By utilizing algorithms, they can learn to make decisions based on these patterns.
Read more...
Create Synthetic Time-series Data with DoppelGANger and PyTorch
Generate synthetic time series data with Gretel.ai’s open-source PyTorch implementation of DoppelGANger.
Read more...
Diffusion models for document synthesis
Explore state-of-the-art image synthetics for business documents using diffusion models.
Read more...
How to safely work with another company's data
Data sharing is central to modern business but entails risks. Synthetic data can enable data sharing while reducing the risk of privacy-compromising linkage attacks.
Read more...
How accurate is my synthetic data?
Gretel’s new synthetic report is here, featuring a high-level score and metrics to help you assess the quality of your synthetic data.
Read more...
Create artificial data with Gretel Synthetics and Google Colaboratory
Use Gretel Synthetics and Colaboratory’s free GPUs to train a model to automatically generate fake, anonymized data with differential privacy guarantees.
Read more...
Q&A Series: Solving Privacy Problems with Synthetic Data
Answers to some questions about synthetic data that audience members submitted during Gretel's talk at The Rise of Privacy Tech’s Data Privacy Week 2022 conference.
Read more...
Creating synthetic time series data
A step-by-step guide to creating high quality synthetic time-series datasets with Python.
Read more...
Gretel's New Synthetic Performance Report
Gretel's Premium SDK now includes detailed reporting that shows you how accurate your synthetic data's statistical distributions and correlations are.
Read more...
Improving massively imbalanced datasets in machine learning with synthetic data
Use synthetic data to improve model accuracy for fraud, cyber security, or any classification task with an extremely limited minority class.
Read more...
Introducing Gretel's Privacy Filters
Create synthetic data that’s safer than ever. Our simple configuration file settings enable you to secure both your data and model from adversarial attacks.
Read more...
Optuna Your Model Hyperparameters
We explore the popular open-source package Optuna to demonstrate how you can optimize your model hyperparameters and build the best synthetic model possible.
Read more...
Why Nonprofits Should Care About Synthetic Data
How synthetic data can help nonprofits improve their business operations and their impact on the people they serve.
Read more...
The Evolution of Gretel's Developer Stack for Synthetic Data
Some of our newest product and technology initiatives that will ensure the Gretel platform continues to grow and evolve with the needs of modern data consumers.
Read more...
Synthetic Data Configuration Templates
Our new configuration templates will help you pick some of the right parameters needed to train your synthetic data models.
Read more...
How To Create Differentially Private Synthetic Data
A practical guide to creating differentially private, synthetic data with Python and TensorFlow.
Read more...
Gretel Synthetics: Introducing v0.10.0
Explore how to create a batch interface with the latest version of Gretel Synthetics on Google Colaboratory.
Read more...
Automatically Reducing AI Bias With Synthetic Data
Create a fair, balanced, privacy preserving version of the 1994 US Census dataset using gretel-synthetics.
Read more...
Build a synthetic data pipeline using Gretel and Apache Airflow
In this blog post, we build an ETL pipeline that generates synthetic data from a PostgreSQL database using Gretel’s Synthetic Data APIs and Apache Airflow.
Read more...
Evaluating Data Sampling Methods with a Synthetic Quality Score
An evaluation of the effect of sampling procedures on the quality of synthetic tabular data using Gretel.ai's Synthetic Quality Score (SQS).
Read more...
Measure the Quality of any Synthetic Dataset with Gretel Evaluate
Assessing the efficacy and quality of synthetic data with Gretel Evaluate API.
Read more...
Deep dive on generating synthetic data for Healthcare
Take a deep dive on training Gretel’s open-source, synthetic data library to generate electronic health records that protect individual privacy (PII).
Read more...
How to use Weights & Biases with Gretel.ai
How to use Weights & Biases’ ML hyperparameter sweeps tool to optimize the accuracy of your synthetic data.
Read more...
Walkthrough: Create Synthetic Data from any DataFrame or CSV
Train an AI model to create an anonymized version of your dataset using Python, Pandas, and gretel-synthetics.
Read more...
Advanced Data Privacy: Gretel Privacy Filters and ML Accuracy
A look at how using Gretel’s Privacy Filters to immunize synthetic datasets against adversarial attacks can impact machine learning accuracy.
Read more...
Synthetic Time Series Data Creation for Finance
How we generated high-quality synthetic time-series data for one of the largest financial institutions in the world.
Read more...