Synthetic Data As A Service For Simplifying Privacy Engineering With Gretel

Any time you store data about people, a number of privacy and security considerations come with it. Privacy engineering is a growing field in data management that focuses on protecting the attributes of personal data so that the datasets containing them can be shared safely. In this episode, Gretel co-founder and CTO John Myers explains how they are building tools that let data engineers and analysts incorporate privacy engineering techniques into their workflows and validate the safety of their data against re-identification attacks.

Interview

  • Introduction
  • How did you get involved in the area of data management?
  • Can you describe what Gretel is and the story behind it?
  • How do you define "privacy engineering"?
  • In an organization or data team, who is typically responsible for privacy engineering?
  • How would you characterize the current state of the art and adoption for privacy engineering?
  • Who are the target users of Gretel and how does that inform the features and design of the product?
  • What are the stages of the data lifecycle where Gretel is used?
  • Can you describe a typical workflow for integrating Gretel into data pipelines for business analytics or ML model training?
  • How is the Gretel platform implemented?
  • How have the design and goals of the system changed or evolved since you started working on it?
  • What are some of the nuances of synthetic data generation or masking that data engineers/data analysts need to be aware of as they start using Gretel?
  • What are the most interesting, innovative, or unexpected ways that you have seen Gretel used?
  • What are the most interesting, unexpected, or challenging lessons that you have learned while working on Gretel?
  • When is Gretel the wrong choice?
  • What do you have planned for the future of Gretel?