
Machine Learning

Copyright © 2022 Gretel.ai

How to Generate Synthetic Data: Tools and Techniques to Create Interchangeable Datasets

Synthetic data is algorithmically generated data that mirrors the statistical properties of the dataset it’s based on. Learn how to make high-quality synthetic data.

Compare Synthetic and Real Data on ML Models with the new Gretel Synthetic Data Utility Report

Use Gretel Evaluate classification and regression tasks to validate synthetic data utility

Conditional Text Generation by Fine-Tuning Gretel GPT

Augment machine learning datasets with synthetically generated text and labels using an open-source implementation of GPT-3.

Synthetic Data and the Data-centric Machine Learning Life Cycle

Gretel's synthetic data platform overcomes challenges across the data-centric machine learning life cycle to enable AI and ML solutions.

Red Teaming Synthetic Data Models

How we implemented a practical attack on a synthetic data model to validate its ability to protect sensitive information under different parameter settings.

Machine Learning Accuracy Using Synthetic Data

Can synthetic data really be used in machine learning? We explore the utility of synthetic data created from popular datasets and tested on popular ML algorithms.

Test Data Generation: Uses, Benefits, and Tips

Test data generation is the process of creating new data that replicates an original dataset. Here’s how developers and data engineers use it.

Transforms and Synthetics on Relational Databases

A walkthrough of our new multi-table transform and multi-table synthetics notebooks, which can be used independently or simultaneously.

What Is Data Simulation?

Data simulation is the process of using large quantities of data to predict events and validate models. Get the full data simulation definition.

Teaching large language models to zip their lips with RLPF

Gretel introduces Reinforcement Learning from Privacy Feedback (RLPF), a novel approach to reduce the likelihood of a language model leaking private information.

AWS + Gretel Synthetic Data Accelerator Program for Generative AI

How our new Synthetic Data Accelerator Program with AWS will help enterprises scale responsible AI systems fast.

Introducing Gretel MLOps

Use Gretel's synthetic data platform to replace, augment, or balance training datasets within MLOps pipelines like Vertex AI, Azure ML, and Amazon SageMaker.

Gretel announces partnership with Microsoft Azure and joins Microsoft for Startups Pegasus Program

Gretel’s privacy-first generative AI is now available to all Azure users as well as select enterprises through the Microsoft for Startups Pegasus Program.

Optimize the Llama-2 Model with Gretel’s Text SQS

How Gretel's data quality analysis tools for evaluating generated text can help you optimize the performance of LLMs, like the Llama-2 model.

Prompting Llama-2 at Scale with Gretel

Discover how to efficiently use Gretel's platform for prompting Llama-2 on large datasets, whether you're completing answers, generating synthetic text, or labeling.

How to Safely Query Enterprise Data with Langchain Agents + SQL + OpenAI + Gretel

How combining agent-based methods, LLMs, and synthetic data enables natural language queries for databases and data warehouses, sans SQL.
Copyright (©) 2023 Gretel

Gretel GPT Sentiment Swap

Let’s fine-tune and prompt a large language model to swap the sentiment of product reviews!

Comprehensive Data Cleaning for AI and ML

Learn to prepare tabular data for AI and ML with an end-to-end data cleaning workflow.

Predicting Patient Stay Durations in the ER with Safe Synthetic Data

Here's how a hospital uses Gretel to help forecast staffing and resource needs for their emergency care unit, and to identify emerging trends in outbreaks.

Unlocking Adapted LLMs on Enterprise Data

Gretel GPT supports new, state-of-the-art LLMs and makes it easier for you to trust the privacy and accuracy of LLMs for enterprise use cases.

Scale Synthetic Data to Millions of Rows with ACTGAN

Discover how Gretel ACTGAN can help businesses generate synthetic data at scale with improved accuracy, faster training, and reduced memory requirements.
Copyright © 2023 Gretel.ai

Augmenting ML Datasets with Gretel and Vertex AI

How to utilize Gretel to create high-quality synthetic tabular data that you can use as training data for a classification model in Vertex AI.

Synthetic Image Models for Smart Agriculture

Learn how synthetic image models can address data drift and improve a computer vision model's accuracy in unexpected conditions.

Downstream ML classification with Gretel ACTGAN and PyCaret

Learn about downstream machine learning tasks and synthetic data with Gretel’s new ACTGAN model and the PyCaret library.
Copyright (c) 2022 Gretel.ai

Generate synthetic Taylor Swift-like lyrics using Gretel GPT

Celebrate Taylor Swift's The Eras Tour with lyrics generated with Gretel GPT, our synthetic language model.

Community Insights: Overcoming Medical Class Imbalance with Synthetic Data

An interview with one of Gretel's users on why medical practitioners turn to synthetic data when overcoming challenges with clinical data.

Generate synthetic data in 3 lines of code

Learn the simplest way to generate synthetic data without setting up your own infrastructure and GPUs.

Using generative, differentially-private models to build privacy-enhancing, synthetic datasets from real data.

We’re going to train and build our synthetic dataset from a real-time public feed of e-bike ride-share data published in the GBFS (General Bikeshare Feed Specification) format.

ML Models: Understanding the Fundamentals

Machine learning models can be trained to recognize patterns in datasets. By utilizing algorithms, they can learn to make decisions based on these patterns.

Common misconceptions about differential privacy

This article clarifies some common misconceptions about differential privacy and what it guarantees.

Create Synthetic Time-series Data with DoppelGANger and PyTorch

Generate synthetic time series data with Gretel.ai’s open-source PyTorch implementation of DoppelGANger.

Diffusion models for document synthesis

Explore state-of-the-art image synthetics for business documents using diffusion models.

How to safely work with another company's data

Data sharing is central to modern business but entails risks. Synthetic data can enable data sharing while reducing the risk of privacy-compromising linkage attacks.
Copyright (c) 2021 Gretel

Exploring NLP Part 1: Why Should a Privacy Engineering Company Care About NLP?

There is a lot of hype around NLP. In this post, we explore some of the criticisms and how you can use this technology responsibly.

Exploring NLP Part 2: A New Way to Measure the Quality of Synthetic Text

By merging breakthrough research on text metrics with new types of embeddings, we produce a reliable metric that is highly correlated with human ratings.

Create artificial data with Gretel Synthetics and Google Colaboratory

Use Gretel Synthetics and Colaboratory’s free GPUs to train a model to automatically generate fake, anonymized data with differential privacy guarantees.

Q&A Series: Solving Privacy Problems with Synthetic Data

Answers to some questions about synthetic data that audience members submitted during Gretel's talk at The Rise of Privacy Tech’s Data Privacy Week 2022 conference.

How we accidentally discovered personal data in a popular Kaggle dataset

Learn about new features in Gretel, and how those features enabled us to discover personally identifiable information (PII) in a popular Kaggle dataset.

README.V2

We founded Gretel on the belief that data shouldn’t be scary.
Credit: sylv1rob1 via ShutterStock

Create a Location Generator GAN

How to train a FastCUT GAN on public location data from a few cities to predict realistic e-bike locations across the world.

Create high quality synthetic data in your cloud with Gretel.ai and Python

Create differentially private, synthetic versions of datasets and meet compliance requirements to keep sensitive data within your approved environment.

Install TensorFlow and PyTorch with CUDA, cuDNN, and GPU Support in 3 Easy Steps

Set up a cutting-edge environment for deep learning with TensorFlow 2.10, PyTorch, Docker, and GPU support.

Improving massively imbalanced datasets in machine learning with synthetic data

Use synthetic data to improve model accuracy for fraud, cyber security, or any classification task with an extremely limited minority class.

What is Model Soup?

A brief exploration of model soup, the new ensembling technique that takes the average weights of multiple models to improve overall performance.

Optuna Your Model Hyperparameters

We explore the popular open-source package Optuna to demonstrate how you can optimize your model hyperparameters and build the best synthetic model possible.
Source: Kubkoo, via iStockPhoto

Reducing AI bias with Synthetic data

Generate artificial records to balance biased datasets and improve overall model accuracy.

Gretel Synthetics: Introducing v0.10.0

Explore how to create a batch interface with the latest version of Gretel Synthetics on Google Colaboratory.
Source: enjoynz, via iStockPhoto

Innovating With FastText and Table Headers

Look at how FastText word embeddings can help to quickly understand new datasets, and build more consistent labels for your own data.
Copyright © 2022 Gretel Labs. All rights reserved.

What We’re Reading: Trends & Takeaways from the NeurIPS 2021 Conference

The Gretel research team's favorite trends and takeaways from NeurIPS 2021, the 35th Annual Conference on Neural Information Processing Systems.

Evaluating Data Sampling Methods with a Synthetic Quality Score

An evaluation of the effect of sampling procedures on the quality of synthetic tabular data using Gretel.ai's Synthetic Quality Score (SQS).

How to use Weights & Biases with Gretel.ai

How to use Weights & Biases’ ML hyperparameter sweeps tool to optimize the accuracy of your synthetic data.
Copyright (c) 2021 Gretel.ai

Advanced Data Privacy: Gretel Privacy Filters and ML Accuracy

A look at how using Gretel’s Privacy Filters to immunize synthetic datasets against adversarial attacks can impact machine learning accuracy.
Copyright © 2021 Gretel Labs. All rights reserved.

Synthetic Time Series Data Creation for Finance

How we generated high-quality synthetic time-series data for one of the largest financial institutions in the world.