Machine Learning

March 18, 2025

How to Generate Synthetic Data: Tools and Techniques to Create Interchangeable Datasets

Synthetic data is algorithmically generated data that mirrors the statistical properties of the dataset it’s based on. Learn how to make high-quality synthetic data.

February 7, 2025

Compare Synthetic and Real Data on ML Models with the new Gretel Synthetic Data Utility Report

Use Gretel Evaluate classification and regression tasks to validate synthetic data utility.

January 30, 2025

Conditional Text Generation by Fine-Tuning Gretel GPT

Augment machine learning datasets with synthetically generated text and labels using an open-source implementation of GPT-3.

November 14, 2024

Synthetic Data and the Data-centric Machine Learning Life Cycle

Gretel's synthetic data platform overcomes challenges across the data-centric machine learning life cycle to enable AI and ML solutions.

November 7, 2024

Red Teaming Synthetic Data Models

How we implemented a practical attack on a synthetic data model to validate its ability to protect sensitive information under different parameter settings.

October 10, 2024

Machine Learning Accuracy Using Synthetic Data

Can synthetic data really be used in machine learning? We explore the utility of synthetic data created from popular datasets and tested on popular ML algorithms.

October 7, 2024

Test Data Generation: Uses, Benefits, and Tips

Test data generation is the process of creating new data that replicates an original dataset. Here’s how developers and data engineers use it.

October 7, 2024

Transforms and Synthetics on Relational Databases

A walkthrough of our new multi-table transform and multi-table synthetics notebooks, which can be used independently or simultaneously.

September 12, 2024

What Is Data Simulation?

Data simulation is the process of using large quantities of data to predict events and validate models. Get the full data simulation definition.

August 30, 2024

Teaching large language models to zip their lips with RLPF

Gretel introduces Reinforcement Learning from Privacy Feedback (RLPF), a novel approach to reduce the likelihood of a language model leaking private information.

June 20, 2024

AWS + Gretel Synthetic Data Accelerator Program for Generative AI

How our new Synthetic Data Accelerator Program with AWS will help enterprises scale responsible AI systems fast.

Privacy

Machine Learning

Maarten Van Segbroeck

June 20, 2024

Introducing Gretel MLOps

Use Gretel's synthetic data platform to replace, augment, or balance training datasets within MLOps pipelines like Vertex AI, Azure ML, and Amazon SageMaker.

June 7, 2024

Gretel announces partnership with Microsoft Azure and joins Microsoft for Startups Pegasus Program

Gretel’s privacy-first generative AI is now available to all Azure users as well as select enterprises through the Microsoft for Startups Pegasus Program.

June 7, 2024

Optimize the Llama-2 Model with Gretel’s Text SQS

How Gretel's data quality analysis tools for evaluating generated text can help you optimize the performance of LLMs, like the Llama-2 model.

June 7, 2024

Prompting Llama-2 at Scale with Gretel

Discover how to efficiently use Gretel's platform for prompting Llama-2 on large datasets, whether you're completing answers, generating synthetic text, or labeling.

June 7, 2024

How to Safely Query Enterprise Data with Langchain Agents + SQL + OpenAI + Gretel

How combining agent-based methods, LLMs, and synthetic data enables natural language queries for databases and data warehouses, sans SQL.

June 7, 2024

Gretel GPT Sentiment Swap

Let’s fine-tune and prompt a large language model to swap the sentiment of product reviews!

June 7, 2024

Comprehensive Data Cleaning for AI and ML

Learn to prepare tabular data for AI and ML with an end-to-end data cleaning workflow.

June 7, 2024

Predicting Patient Stay Durations in the ER with Safe Synthetic Data

Here's how a hospital uses Gretel to help forecast staffing and resource needs for their emergency care unit, and to identify emerging trends in outbreaks.

June 7, 2024

Unlocking Adapted LLMs on Enterprise Data

Gretel GPT supports new, state-of-the-art LLMs and makes it easier for you to trust the privacy and accuracy of LLMs for enterprise use cases.

June 7, 2024

Scale Synthetic Data to Millions of Rows with ACTGAN

Discover how Gretel ACTGAN can help businesses generate synthetic data at scale with improved accuracy, faster training, and reduced memory requirements.

June 7, 2024

Augmenting ML Datasets with Gretel and Vertex AI

How to utilize Gretel to create high-quality synthetic tabular data that you can use as training data for a classification model in Vertex AI.

June 7, 2024

Synthetic Image Models for Smart Agriculture

Learn how synthetic image models can address data drift and improve a computer vision model's accuracy in unexpected conditions.

June 7, 2024

Downstream ML classification with Gretel ACTGAN and PyCaret

Learn about downstream machine learning tasks and synthetic data with Gretel’s new ACTGAN model and the PyCaret library.

Machine Learning

Grace King

June 7, 2024

Generate synthetic Taylor Swift-like lyrics using Gretel GPT

Celebrate Taylor Swift's The Eras Tour with lyrics generated with Gretel GPT, our synthetic language model.

June 7, 2024

Community Insights: Overcoming Medical Class Imbalance with Synthetic Data

An interview with one of Gretel's users on why medical practitioners turn to synthetic data when overcoming challenges with clinical data.

June 7, 2024

Generate synthetic data in 3 lines of code

Learn the simplest way to generate synthetic data without setting up your own infrastructure and GPUs.

June 7, 2024

Using generative, differentially-private models to build privacy-enhancing, synthetic datasets from real data.

We’re going to train and build our synthetic dataset from a real-time public feed of e-bike ride-share data published in the GBFS (General Bikeshare Feed Specification) format.

June 7, 2024

ML Models: Understanding the Fundamentals

Machine learning models can be trained to recognize patterns in datasets. By utilizing algorithms, they can learn to make decisions based on these patterns.

June 7, 2024

Common misconceptions about differential privacy

This article clarifies some common misconceptions about differential privacy and what it guarantees.

June 7, 2024

Create Synthetic Time-series Data with DoppelGANger and PyTorch

Generate synthetic time series data with Gretel.ai’s open-source PyTorch implementation of DoppelGANger.

June 7, 2024

Diffusion models for document synthesis

Explore state-of-the-art image synthetics for business documents using diffusion models.

June 7, 2024

How to safely work with another company's data

Data sharing is central to modern business but entails risks. Synthetic data can enable data sharing while reducing the risk of privacy-compromising linkage attacks.

June 7, 2024

Exploring NLP Part 1: Why Should a Privacy Engineering Company Care About NLP?

There is a lot of hype around NLP. In this post, we explore some of the criticisms and how you can use this technology responsibly.

June 7, 2024

Exploring NLP Part 2: A New Way to Measure the Quality of Synthetic Text

By merging breakthrough research on text metrics with new types of embeddings, we produce a reliable metric that is highly correlated with human ratings.

June 7, 2024

Create artificial data with Gretel Synthetics and Google Colaboratory

Use Gretel Synthetics and Colaboratory’s free GPUs to train a model to automatically generate fake, anonymized data with differential privacy guarantees.

June 7, 2024

How to train a FastCUT GAN on public location data from a few cities to predict realistic e-bike locations across the world.

June 7, 2024

Create high quality synthetic data in your cloud with Gretel.ai and Python

Create differentially private, synthetic versions of datasets and meet compliance requirements to keep sensitive data within your approved environment.

June 7, 2024