It’s way too hard, sometimes seemingly impossible, to safely share and collaborate with sensitive data. We have a solution to this problem, one that we, like every developer, face each day. We founded Gretel on the belief that data shouldn’t be scary, and that to compete in today’s world, you need to be able to use and learn from your data.
Companies like Amazon, Google, and Apple have the resources to give their developers the best of both worlds: data privacy and streamlined access to data. We’re here to make that possible for any developer.
In February, we published our first README and started laying out our goal of enabling developers to safely share and collaborate with sensitive data, along with our vision of democratizing building with data so everyone can use it. We asked for your feedback and ideas, and promised to share research, open-source code, and examples.
In the six months since then, we have had conversations with nearly 100 developers and companies to understand the barriers to working with sensitive data and how we can apply privacy-enhancing technology to break down those barriers. Here is what we learned:
- It can take developers months to get access to sensitive data to test an idea. Often this requires PM and legal approvals, snapshots of production databases, and manual anonymization of sensitive fields.
- Privacy is an engineering problem, not a policy problem. Policies are open to interpretation, lack enforceability at different stages of a workflow, and eventually get abused.
- Fairness and ethics in AI are incredibly important. Datasets used to power AI in our lives are often limited and imbalanced, leading to bias against users and groups.
In the past year, we have built a set of open-source SDKs that enable developers to label and share access to data, composable APIs for transforming streaming data, and an open-source, AI-based synthetic data library. The library can generate artificial datasets from sensitive data with provable privacy guarantees, and it can automatically boost minority classes in datasets to reduce AI bias.
Today, we are thrilled to release Gretel’s public beta to any developer. It’s free, and you can get started with one of our guides: label and share a dataset in 2 minutes, or even generate your first synthetic dataset with differential privacy guarantees.