Learning Pathway

Create a free account

Get a head start on this pathway by signing up for a free Gretel account.

Introduction

Welcome to Gretel! We're excited that you are interested in learning more about the benefits of synthetic data. In this learning pathway, we have gathered our best resources on fundamental concepts, which you can work through at your own pace. By the end of this course, you will have learned how to use Gretel's Toolkit to generate synthetic data that's as good as, or even better than, your original data.

Getting Started

“Synthetic data is artificially annotated information that is generated by computer algorithms or simulations, commonly used as an alternative to real-world data.”

In this section, you will learn what synthetic data is, why it's necessary, and how you can apply it to a wide range of use cases.

Total Reading time 37 minutes
What is Synthetic Data?

This blog post is an overview of the entire Synthetic Data space. If you are new to Synthetic Data, this is a must-read.

Reading time 14 minutes
How Accurate is Synthetic Data?

This blog post discusses how Gretel ensures that your Synthetic Data maintains the same statistical properties as the original dataset. It details our Synthetic Data Quality Score and how you can use it.

Reading time 5 minutes
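
To make the idea of "maintaining the same statistical properties" concrete, here is a minimal sketch that compares the mean and standard deviation of one numeric column in a real and a synthetic dataset. This is not Gretel's Synthetic Data Quality Score; the function names, the sample values, and the choice of statistics are all assumptions made for illustration.

```python
import statistics


def summarize(values):
    """Return basic summary statistics for a numeric column."""
    return {
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),
    }


def relative_gap(real, synthetic):
    """Largest relative difference between real and synthetic summary stats."""
    real_stats = summarize(real)
    synth_stats = summarize(synthetic)
    gaps = []
    for name, real_value in real_stats.items():
        # Avoid dividing by zero when a real statistic is exactly 0.
        denom = abs(real_value) or 1.0
        gaps.append(abs(real_value - synth_stats[name]) / denom)
    return max(gaps)


# Hypothetical "age" column from a real dataset and its synthetic counterpart.
real_ages = [23, 35, 41, 29, 52, 47, 38, 31]
synthetic_ages = [25, 33, 43, 30, 50, 45, 36, 34]

gap = relative_gap(real_ages, synthetic_ages)
print(f"largest relative gap: {gap:.3f}")
```

A smaller gap means the synthetic column tracks the real one more closely on these two statistics. A real quality report would look at far more than this, for example full distributions and correlations between columns.
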
Data is More Valuable When It Can Be Shared

One of the biggest bottlenecks to innovation that developers and data scientists face today is getting access to data, or creating the data they need to test an idea or build a new feature. This blog post discusses Gretel’s commitment to helping developers make their code safe to share, thereby enabling faster innovation.

Reading time 2 minutes
What is Privacy Engineering?

Privacy engineering is the systematic application of engineering concepts for protecting sensitive information. This blog post discusses the importance of privacy engineering and the value it can bring to your organization.

Reading time 3 minutes
Why Should Everyone Care About Synthetic Data?

Originally titled “Why Nonprofits Should Care About Synthetic Data”, this blog post applies to more than nonprofit organizations. It discusses the benefits of Synthetic Data for data privacy and for making the most of limited data sets.

Reading time 5 minutes
Synthetic Data FAQs

Have questions about Synthetic Data? Check out our Frequently Asked Questions.

Reading time 8 minutes

Setup

In this section you’ll create a Gretel account, walk through initial setup of your account, and learn how to use Gretel in your personal development environment.

Initial Setup

An API Key is necessary to use Gretel’s products. Use this portion of the docs to learn how to generate your API Key.

Just a few minutes

Environment Setup

Gretel’s products can be used in several ways, from a no-code solution to running on-premises. The links below detail how to get started using Gretel in your preferred environment.

Cloud Console Setup

The Gretel Cloud Console offers a no-code way to use Gretel’s products.

SDK Setup

Gretel offers an open-source Python SDK. For examples demonstrating the SDK, check out the docs/notebooks directory in Gretel Blueprints.

CLI Setup

If the command line is your home, Gretel has a CLI tool that you can download and run. Each Gretel product has examples that use the CLI to create synthetic data, classify, transform, and more.

Run Locally

By default, Gretel workloads run in Gretel’s cloud. If you would like to run a workload on your own hardware, follow the above instructions to set up Docker with a GPU to execute your workload.

The Gretel Toolkit

Gretel's Toolkit comprises three primary products: Gretel Synthetics, Gretel Classify, and Gretel Transform. In this section, you'll learn about Gretel's products, what each can do, and which product is best suited to solve the unique problems you encounter.

Choosing the right model

Don’t know which model to use? This decision tree will help you determine which product you need.

Synthetics

Learn how to create and modify a synthetic data model configuration before model training to support different data types and privacy protections.

Classify

Define a policy to discover and label sensitive data including personally identifiable information, credentials, and even custom regular expressions inside text, logs, and other structured data.
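
The idea of labeling sensitive data with built-in and custom regular expressions can be sketched in a few lines of plain Python. This is only an illustration of the concept, not Gretel Classify's policy format or API; the pattern names, the `TICKET-` identifier format, and the `label_text` helper are hypothetical.

```python
import re

# Hypothetical label -> pattern table. The first two mimic common built-in
# detectors; "ticket_id" stands in for a custom, user-defined expression.
PATTERNS = {
    "email_address": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "ipv4_address": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "ticket_id": re.compile(r"\bTICKET-\d{4}\b"),
}


def label_text(text):
    """Return every pattern match in the text, ordered by position."""
    findings = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            findings.append(
                {
                    "label": label,
                    "start": match.start(),
                    "end": match.end(),
                    "text": match.group(),
                }
            )
    return sorted(findings, key=lambda finding: finding["start"])


log_line = "user alice@example.com logged in from 10.0.0.7 (TICKET-1234)"
findings = label_text(log_line)
for finding in findings:
    print(finding)
```

Each finding records the label and the character span, which is the kind of information a later transformation step would need in order to redact or replace the value.
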

Transform

Learn how to define a policy to label and transform a dataset, with support for advanced options including custom regular expression search, date shifting, and fake entity replacements.
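
As a rough illustration of two of the options mentioned above, date shifting and fake entity replacement, the sketch below transforms a single record using only the Python standard library. It does not use Gretel Transform or its policy syntax; the field names, the list of fake names, the `secret` parameter, and the 30-day window are all assumptions chosen for the example.

```python
import hashlib
from datetime import date, timedelta

# Hypothetical pool of replacement names.
FAKE_NAMES = ["Alex Doe", "Sam Roe", "Jo Poe", "Kit Moe"]


def _stable_int(value, secret):
    """Derive a repeatable integer from a value and a secret."""
    digest = hashlib.sha256(f"{secret}:{value}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def shift_date(day, key, secret, max_days=30):
    """Shift a date by an offset in [-max_days, max_days] tied to `key`."""
    offset = _stable_int(key, secret) % (2 * max_days + 1) - max_days
    return day + timedelta(days=offset)


def fake_name(real_name, secret):
    """Replace a name with a fake one, consistently for the same input."""
    return FAKE_NAMES[_stable_int(real_name, secret) % len(FAKE_NAMES)]


def transform_record(record, secret):
    """Replace the name and shift both dates by the same per-person offset."""
    return {
        "name": fake_name(record["name"], secret),
        "admitted": shift_date(record["admitted"], record["name"], secret),
        "discharged": shift_date(record["discharged"], record["name"], secret),
    }


record = {
    "name": "Jane Smith",
    "admitted": date(2021, 3, 1),
    "discharged": date(2021, 3, 9),
}
transformed = transform_record(record, secret="demo-secret")
print(transformed)
```

Because both dates are shifted by the same offset for a given person, the length of the stay is preserved even though the actual dates change. That is the usual reason for shifting dates rather than replacing them with random ones.
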

Next Steps

Are you excited to use Gretel for your data engineering needs? We sure hope so. Now that you have completed this course, we recommend checking out these helpful links:

Now that you’ve explored Gretel, you can walk through our notebook to generate your first set of synthetic data. This notebook is a great starting point for those new to Synthetic Data.

Looking for sample code? Check out our Gretel Blueprints repository on GitHub, which contains examples covering many of the common (and uncommon) use cases of Gretel products. The repository is well maintained and frequently updated, so be sure to star it to get updates when we release new blueprints.

These resources merely scratch the surface of Gretel's capabilities. Be sure to check out the Documentation and Blog for everything else.

Gretel Community

Interested in getting the latest news about Synthetic Data? Want to chat with others who are using Synthetic Data in their workloads?

For any other questions, comments, or concerns, please join our Community Slack and let us know.

Additional Resources

Here are some of our favorite resources on Synthetic Data.

Connect with the Gretel Community.

Join our Slack community to connect with the Gretel team and engage with our community.

Join gretelgroup.slack.com