The Gretel Blog
Learn more about Synthetic Data from Gretel experts – engineers, data scientists and our AI research team.
Search results for "privacy"
Generate Synthetic Databases with Gretel Relational
Introducing Gretel Relational, enabling organizations to generate high-quality synthetic databases while preserving cross-table relationships.
Augmenting ML Datasets with Gretel and Vertex AI
How to utilize Gretel to create high-quality synthetic tabular data that you can use as training data for a classification model in Vertex AI.
Teaching large language models to zip their lips
Gretel introduces Reinforcement Learning from Privacy Feedback (RLPF), a novel approach to reduce the likelihood of a language model leaking private information.
Gretel and Google Cloud partner on synthetic data
Gretel and Google Cloud harness the power of synthetic data to accelerate safer adoption of generative AI in the enterprise.
Install TensorFlow and PyTorch with CUDA, cuDNN, and GPU Support in 3 Easy Steps
Set up a cutting-edge environment for deep learning with TensorFlow 2.10, PyTorch, Docker, and GPU support.
Bringing AI-generated images to enterprise use cases
Gretel's new image synthetics enable you to generate high-quality images at scale. Get started today with our free public preview and let us know what you think!
Anonymize tabular data to meet GDPR privacy requirements
Learn how to anonymize tabular data to meet GDPR standards using Gretel's synthetic data APIs.
Synthetic Image Models for Smart Agriculture
Learn how synthetic image models can address data drift and improve a computer vision model's accuracy in unexpected conditions.
Downstream ML classification with Gretel ACTGAN and PyCaret
Learn about downstream machine learning tasks and synthetic data with Gretel’s new ACTGAN model and the PyCaret library.
Generate synthetic Taylor Swift-like lyrics using Gretel GPT
Celebrate Taylor Swift's The Eras Tour with lyrics generated with Gretel GPT, our synthetic language model.
Synthetic Data and the Data-centric Machine Learning Life Cycle
Gretel's synthetic data platform overcomes challenges across the data-centric machine learning life cycle to enable ML and AI solutions.
Introducing Gretel Benchmark
Benchmark is your toolkit to evaluate any synthetic data algorithm on any production dataset.
Conditional data generation in 4 lines of code
Augment or balance your ML datasets in minutes with state-of-the-art generative models.
Announcing the Synthetic Data Community Discord
Gretel is proud to announce the launch of the Synthetic Data Community Discord server.
Generate time-series data with Gretel’s new DGAN model
Announcing the open beta release of our DGAN model type.
Community Insights: Overcoming Medical Class Imbalance with Synthetic Data
An interview with one of Gretel's users on why medical practitioners turn to synthetic data when overcoming challenges with clinical data.
An update to Gretel’s license to support continuous community growth and innovation
Gretel's Source Available License supports long-term growth and innovation in the synthetic data community.
Generate synthetic data in 3 lines of code
Learn the simplest way to generate synthetic data without setting up your own infrastructure and GPUs.
How to safely work with another company's data
Data sharing is central to modern business but entails risks. Synthetic data can enable data sharing while reducing the risk of privacy-compromising linkage attacks.
Progress and Innovation - Women in AI
Get to know some of Gretel’s Applied Science team, their experience building state-of-the-art generative AI models, and advice for aspiring data scientists.
Gretel Smart-Seeding is auto-complete for your data
Smart-seeding lets you train a synthetic data model to auto-complete partial records and text.
The Evolution of Gretel's Developer Stack for Synthetic Data
Some of our newest product and technology initiatives that will ensure the Gretel platform continues to grow and evolve with the needs of modern data consumers.
Measure the Quality of any Synthetic Dataset with Gretel Evaluate
Assessing the efficacy and quality of synthetic data with Gretel Evaluate API.
Evaluating Data Sampling Methods with a Synthetic Quality Score
An evaluation of the effect of sampling procedures on the quality of synthetic tabular data using Gretel.ai's Synthetic Quality Score (SQS).
Data Simulation: Tools, Benefits, and Use Cases
Data simulation is the process of using large quantities of data to predict events and validate models.
Test Data Generation: Uses, Benefits, and Tips
Test data generation is the process of creating new data that replicates an original dataset. Here’s how developers and data engineers use it.
Create Synthetic Time-series Data with DoppelGANger and PyTorch
Generate synthetic time series data with Gretel.ai’s open-source PyTorch implementation of DoppelGANger.
Red Teaming Synthetic Data Models
How we implemented a practical attack on a synthetic data model to validate its ability to protect sensitive information under different parameter settings.
Conditional Text Generation by Fine Tuning Gretel GPT
Augment machine learning datasets with synthetically generated text and labels using an open-source implementation of GPT-3.
Diffusion models for document synthesis
Explore state-of-the-art image synthetics for business documents using diffusion models.
What is Model Soup?
A brief exploration of model soup, the new ensembling technique that takes the average weights of multiple models to improve overall performance.
Transforms and Synthetics on Relational Databases
A walkthrough of our new multi-table transform and multi-table synthetics notebooks, which can be used independently or simultaneously.
What is Synthetic Data?
Synthetic data is artificially annotated information that is generated by computer algorithms or simulations, commonly used as an alternative to real-world data.
ML Models: Understanding the Fundamentals
Machine learning models can be trained to recognize patterns in datasets. By utilizing algorithms, they can learn to make decisions based on these patterns.
Transforms and Multi-Table Relational Databases
How to de-identify a relational database for demo or pre-production testing environments while keeping the referential integrity of primary and foreign keys intact.
What is Data Anonymization?
Everything you need to know about anonymizing data and the techniques for mitigating privacy risks.
Simplifying Our APIs
Five new features that will make synthesizing data easier for busy developers and data scientists.
Gretel.ai + Illumina - Using AI to create safe, synthetic datasets for genomics
Promising evidence that state-of-the-art synthetic data models can produce artificial versions of even highly dimensional and complex genomic and phenotypic data.
Improving massively imbalanced datasets in machine learning with synthetic data
Use synthetic data to improve model accuracy for fraud, cyber security, or any classification task with an extremely limited minority class.
How to Generate Synthetic Data: Tools and Techniques to Create Interchangeable Datasets
Synthetic data is algorithmically generated data that mirrors the statistical properties of the dataset it’s based on. Learn how to make high-quality synthetic data.
Workshop: Generating Synthetic Data for Healthcare & Life Sciences
How to enable faster access to data for medical research with statistically accurate, equitable and private synthetic datasets.
Q&A Series: Solving Privacy Problems with Synthetic Data
Answers to some questions about synthetic data that audience members submitted during Gretel's talk at The Rise of Privacy Tech’s Data Privacy Week 2022 conference.
How to use Weights & Biases with Gretel.ai
How to use Weights & Biases’ ML hyperparameter sweeps tool to optimize the accuracy of your synthetic data.
Create a Location Generator GAN
How to train a FastCUT GAN on public location data from a few cities to predict realistic e-bike locations across the world.
Data Is More Valuable When It Can Be Shared
Today, we are thrilled to announce the general availability of Gretel's privacy engineering APIs and services.
Gretel Synthetics Frequently Asked Questions (FAQs)
Build differentially private synthetic datasets in Python.
Creating Synthetic Time Series Data for Global Financial Institutions – a POC Deep Dive
How we generated high-quality synthetic time-series data for one of the largest financial institutions in the world.
What We’re Reading: Trends & Takeaways from the NeurIPS 2021 Conference
The Gretel research team's favorite trends and takeaways from the NeurIPS 35th Annual Conference on Neural Information Processing Systems.
Advanced Data Privacy: Gretel Privacy Filters and ML Accuracy
A look at how using Gretel’s Privacy Filters to immunize synthetic datasets against adversarial attacks can impact machine learning accuracy.
Why Nonprofits Should Care About Synthetic Data
How synthetic data can help nonprofits improve their business operations and their impact on the people they serve.
Optuna Your Model Hyperparameters
We explore the popular open-source package Optuna to demonstrate how you can optimize your model hyperparameters and build the best synthetic model possible.
Common misconceptions about differential privacy
This article clarifies some common misconceptions about differential privacy and what it guarantees.
Veterans Day Reflections: Open source software and evacuation operations, a remarkable combination.
Quickly and safely aggregate geolocation data for location density analysis using a hexagonal grid system.
Automate Detecting Sensitive Personally Identifiable Information (PII)
Use Gretel.ai's APIs to continuously detect and protect sensitive data including credit cards, credentials, names, and addresses.
Got text? Use Named Entity Recognition (NER) to label PII in your data
Use Gretel’s NLP setting to label PII, including people's names and geographic locations, in free text.
Why privacy by design matters more than ever
Today we announced that Gretel raised $50 million in funding to help us advance our mission to bring “privacy by design” to all developers.
Exploring NLP Part 2: A New Way to Measure the Quality of Synthetic Text
By merging breakthrough research on text metrics with new types of embeddings, we produce a reliable metric that is highly correlated with human ratings.
Exploring NLP Part 1: Why Should a Privacy Engineering Company Care About NLP?
There is a lot of hype around NLP. In this post, we explore some of the criticisms and how you can use this technology responsibly.
Introducing Gretel's Privacy Filters
Create synthetic data that’s safer than ever. Our simple configuration file settings enable you to secure both your data and model from adversarial attacks.
Instrumenting Kubernetes in AWS with Terraform and FluentBit
In this blog, we will use Fluent Bit to collect logs from AWS EKS cluster applications.
Build a synthetic data pipeline using Gretel and Apache Airflow
In this blog post, we build an ETL pipeline that generates synthetic data from a PostgreSQL database using Gretel’s Synthetic Data APIs and Apache Airflow.
Walkthrough: Create Synthetic Data from any DataFrame or CSV
Train an AI model to create an anonymized version of your dataset using Python, Pandas, and gretel-synthetics.
What's new in Beta2
Beta2 for Gretel.ai is all about delivering privacy engineering as a service through clean, simple APIs.
What is Privacy Engineering?
In this post, we will dive into what privacy engineering is, why it’s important, and some of the core use cases we are seeing that are enabled by privacy.
A guide to load (almost) anything into a DataFrame
Pandas provides many options for reading data into a DataFrame. Here's our short guide to the ones we found most useful.
Synthetic Data Configuration Templates
Our new configuration templates will help you pick some of the right parameters needed to train your synthetic data models.
Practical Privacy with Synthetic Data
Implementing a practical attack to measure unintended memorization in synthetic data models.
Introducing the Gretel Bartender
A game-changing AI that will disrupt the cocktail industry and spin the world on its head.
Anonymize Data with S3 Object Lambda
Anonymize data at access time with Gretel and Amazon S3 Object Lambda.
How accurate is my synthetic data?
Gretel’s new synthetic report is here, featuring a high-level score and metrics to help you assess the quality of your synthetic data.
Machine Learning Accuracy Using Synthetic Data
Can synthetic data really be used in machine learning? We explore the utility of synthetic data created from popular datasets and tested on popular ML algorithms.
Here's what we learned about privacy engineering from 50+ companies and hundreds of developers.
Creating synthetic time series data
A step-by-step guide to creating high-quality synthetic time-series datasets with Python.
Recognizing Data Privacy Day by Protecting Your Privacy
What if we could ensure that personal data was protected, benefiting not just the individual but also giving developers faster, worry-free access to data?
Reducing AI bias with Synthetic data
Generate artificial records to balance biased datasets and improve overall model accuracy.
Auto-anonymize production datasets for development
In this post, we walk through building a data pipeline that will automatically transform datasets so they can be safely used in development environments.
Automatically Reducing AI Bias With Synthetic Data
Create a fair, balanced, privacy-preserving version of the 1994 US Census dataset using gretel-synthetics.
How To Create Differentially Private Synthetic Data
A practical guide to creating differentially private, synthetic data with Python and TensorFlow.
Gretel.ai Raises $12 Million in Series A to Safely Share, Build with Data
We are pleased to share that Gretel raised $12M in Series A funding. We're picking up strong momentum in our mission to help developers create safe data.
Load NER data into Elasticsearch
Create a simple workflow to perform Named Entity Recognition (NER) on sample data using Gretel and load the records into Elasticsearch.
November 2020 - What’s new in Gretel
We are releasing new features that make working with data easier by helping you deep dive into records, use blueprints to auto-anonymize data, and more.
Introducing Gretel Blueprints
We are launching Gretel Blueprints, making it easy to anonymize and balance datasets with just a few clicks.
Gretel's New Synthetic Performance Report
Gretel's Premium SDK now includes detailed reporting that shows you how accurate your synthetic data's statistical distributions and correlations are.
Create high quality synthetic data in your cloud with Gretel.ai and Python
Create differentially private, synthetic versions of datasets and meet compliance requirements to keep sensitive data within your approved environment.
NEW: Integrating with Gretel SDKs just got easier!
Learn how we are improving our product by adding new features that make connecting to Gretel easier, faster and more streamlined.
How to use Gretel’s new entity stream
We recently launched our new entity stream view in Gretel Cloud. See how you can view record streams from tagged entities in your data projects.
Fast data cataloging of streaming data for fun and privacy
Learn more about how Gretel's REST APIs automatically build a metastore that makes it easy to understand what is inside of your data.
How we accidentally discovered personal data in a popular Kaggle dataset
Learn about new features in Gretel, and how those features enabled us to discover personally identifiable information (PII) in a popular Kaggle dataset.
Innovating With FastText and Table Headers
Look at how FastText word embeddings can help to quickly understand new datasets, and build more consistent labels for your own data.
Gretel Synthetics: Introducing v0.10.0
Explore how to create a batch interface with the latest version of Gretel Synthetics on Google Colaboratory.
Automated Data Exposure Detection with Gretel Outpost
Gretel Outpost is a free integration architecture that automates the steps that a security team would take in assessing the risk or exposure to data.
Contact Tracing: Deep Dive & Simulation
We decided to examine the privacy-preserving capabilities of the Contact Tracing proposal, how it would be implemented, and what privacy concerns exist.
Deep dive on generating synthetic data for Healthcare
Take a deep dive on training Gretel’s open-source, synthetic data library to generate electronic health records that protect individual privacy (PII).
Create artificial data with Gretel Synthetics and Google Colaboratory
Use Gretel Synthetics and Colaboratory’s free GPUs to train a model to automatically generate fake, anonymized data with differential privacy guarantees.
Using generative, differentially-private models to build privacy-enhancing, synthetic datasets from real data.
We’re going to train and build our synthetic dataset off of a real-time public feed of e-bike ride-share data called the GBFS (General Bike-share Feed)