Create artificial data with Gretel Synthetics and Google Colaboratory
![](https://cdn.prod.website-files.com/5ec4696a9b6d337d51632638/63894bb3239ac91f58127b2c_63066e2f6a58269a25c98fb0.webp)
In this post we’ll use Gretel Synthetics and Google Colaboratory’s free GPUs to train a machine learning model to automatically generate fake, anonymized data with differential privacy guarantees.
Today we will walk through some of the new features in version 0.6.0 of gretel_synthetics, Gretel’s open-source synthetic data library, including:
- Google SentencePiece support for unsupervised tokenization, with configurable vocabulary size and character coverage.
- smart_open support for loading datasets from AWS, GCP, and Azure.
- One-click launch directly into Colaboratory.
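To make the "character coverage" setting from the list above more concrete, here is a minimal pure-Python sketch of the idea: keep the most frequent characters until they account for a target share of all characters in the training text. This is an illustration only, not SentencePiece's implementation or the gretel_synthetics API; the function name `covered_characters` and the sample string are hypothetical.

```python
from collections import Counter


def covered_characters(text: str, character_coverage: float) -> set[str]:
    """Return the most frequent characters that together account for at
    least `character_coverage` of all character occurrences in `text`.

    Hypothetical helper for illustration; not part of any library.
    """
    if not 0.0 < character_coverage <= 1.0:
        raise ValueError("character_coverage must be in (0, 1]")
    counts = Counter(text)
    total = sum(counts.values())
    kept: set[str] = set()
    running = 0
    # Walk characters from most to least frequent until the target is met.
    for char, count in counts.most_common():
        if running / total >= character_coverage:
            break
        kept.add(char)
        running += count
    return kept


sample = "aaaaaaaabc"  # 'a' makes up 80% of the characters
full = covered_characters(sample, 1.0)      # every character is kept
partial = covered_characters(sample, 0.75)  # rare characters are dropped
```

With a coverage of 1.0 every character in the text stays in the vocabulary, while a lower value drops the rarest ones. That trade-off tends to matter most for datasets with large character sets or noisy, rarely occurring symbols.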
Check out the walk-through screencast below, or click the Colab link to get started creating your own synthetic dataset!
![Image for post](https://cdn.prod.website-files.com/5ec4696a9b6d337d51632638/5f3da47c7cbcf76d831805e3_1*hONxGP_dblmjiS2VALQB4w.png)
![Image for post](https://cdn.prod.website-files.com/5ec4696a9b6d337d51632638/5f3da44af598cb2a811f450b_1*gtI9m1yiDYIdug9vfK7YOQ.gif)
For more on anonymizing precise location data, check out our previous deep dive on anonymizing scooter ride-share data, which covers how we discovered privacy concerns in public ride-share feeds and partnered with Uber to fix them.