Create artificial data with Gretel Synthetics and Google Colaboratory

Use Gretel Synthetics and Colaboratory’s free GPUs to train a model to automatically generate fake, anonymized data with differential privacy guarantees.

In this post, we’ll walk through some of the new features in version 0.6.0 of gretel_synthetics, Gretel’s open-source synthetic data library, including:

  • Google SentencePiece support for unsupervised tokenization, with configurable vocabulary size and character coverage.
  • smart_open support for loading datasets from AWS, GCP, and Azure.
  • Support for launching directly into Colaboratory.
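
To give a feel for what the character coverage setting controls, here is a minimal sketch in plain Python. It is not how SentencePiece or gretel_synthetics implement it; the function name `chars_for_coverage` and the sample text are hypothetical, made up for this illustration. The idea is simply that a coverage below 1.0 lets the tokenizer ignore the rarest characters in the training text.

```python
from collections import Counter


def chars_for_coverage(text: str, coverage: float) -> list:
    """Return the most frequent characters whose combined count reaches
    at least `coverage` (a fraction in (0, 1]) of all characters in `text`.

    Illustrative helper only; not part of SentencePiece or gretel_synthetics.
    """
    if not 0.0 < coverage <= 1.0:
        raise ValueError("coverage must be in (0, 1]")
    counts = Counter(text)
    total = sum(counts.values())
    kept, covered = [], 0
    # Walk characters from most to least frequent until the target is met.
    for char, count in counts.most_common():
        if covered >= coverage * total:
            break
        kept.append(char)
        covered += count
    return kept


# Hypothetical training text: 8 x "a", 8 x "b", 3 x "c", and one rare "é".
sample = "aaaaaaaabbbbbbbbcccé"

full = chars_for_coverage(sample, 1.0)     # every character is kept
partial = chars_for_coverage(sample, 0.9)  # the rare "é" is dropped
print(full, partial)
```

With full coverage every character stays in the vocabulary, while a lower setting trades the rarest characters for a smaller, cleaner character set. Which setting works best depends on the dataset.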

Check out the walk-through screencast below, or click the Colab link to get started creating your own synthetic dataset!

Try out Gretel-Synthetics in Google Colaboratory
https://vimeo.com/400326654

For more on anonymizing precise location data, check out our previous deep dive on anonymizing scooter ride-share data, which also covers how we discovered privacy concerns in public ride-share feeds and partnered with Uber to fix them.