Introducing the new high-level interface for Gretel's Python SDK
Video description
A quick preview of Gretel's new Python SDK
Transcription
Speaker 1 (00:00):
Hey, it's Johnny from Gretel, and today I'm excited to share the new high-level interface for our Python SDK. This is a quick preview of what the interface looks like. When we designed this, our goal was to provide our users with a simple, intuitive, and really just enjoyable experience when building with Python and Gretel. So you can see with just a few lines of code, you can get up and running and start training a state-of-the-art model from scratch on your own data. You can then quickly assess the quality of the synthetic data generated by your model by looking at the report that Gretel creates for you automatically. Then once you're happy with your model, it's super easy to generate synthetic data on demand so you can integrate it into your existing pipelines. So let's take a look at this in action. I'm going to walk us through our Gretel 101 blueprint, which is really a great place to start if you want to learn the fundamentals of the Gretel Python SDK. The first thing you're going to need to do, if you haven't already done so, is sign up for a free Gretel account. Let's come here and that'll give you access to our platform.
(01:13)
Next, we need to install the Gretel client. So this is our Python SDK. I went ahead and installed that earlier. Next, we're going to instantiate this Gretel object, and this is really where the Python high-level interface is implemented. Let's do that. It's going to ask us for our Gretel API key, which we can retrieve in the console. So this is a link for that. Copy. There we go. Alright, we're in. Next, we need to select the dataset we would like to synthesize. So there are a few different choices for tabular datasets. Let's choose this first one, which is an adult income dataset. So set the path. Let's preview the data. Alright, so we see it's tabular. We've got a mix of numerical and categorical data, some integers and strings. So a pretty standard tabular dataset that we would like Gretel to synthesize. Alright, now the fun part, let's go ahead and train a deep generative model. So we're going to use our tabular ACTGAN model. Alright, right away you see there are some interesting logs being printed. There is a project URL, so this will take you to the project on the Gretel platform in the console. There's also a link to docs for ACTGAN, which is super useful if you want to read about the model and learn about the different options you have for parameters. And then finally, there's a console URL for the actual model training job that's going on. Let's click that one.
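The setup and training steps described above can be sketched roughly as follows. This is a minimal sketch, not the blueprint's exact code: the `Gretel` class, the `api_key="prompt"` option, `submit_train`, and the `"tabular-actgan"` model name are written from memory of the `gretel_client` package and should be checked against docs.gretel.ai. The helper names `connect` and `train_tabular_model` are hypothetical.

```python
# Sketch only: the gretel_client names below are assumptions recalled from
# the Gretel 101 blueprint; verify them against the official docs.
# The client is typically installed with: pip install -U gretel-client


def connect():
    """Create the high-level Gretel object, prompting for the API key."""
    # Imported lazily so this file can be loaded without gretel-client.
    from gretel_client import Gretel

    return Gretel(api_key="prompt")


def train_tabular_model(gretel, data_source, model="tabular-actgan"):
    """Submit a training job and return the job results object.

    `data_source` is a path or URL to a CSV file of your choosing.
    """
    return gretel.submit_train(model, data_source=data_source)
```

In a notebook this would be used as `gretel = connect()` followed by `trained = train_tabular_model(gretel, data_source)`, where `data_source` points at the dataset selected in the walkthrough.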
(02:47)
Alright, and so this is where it'll take you to the console and allow you to observe the training as it's happening. And this will take a few minutes, so I'm going to go ahead and pause and be back when it's finished. Okay, awesome. I'm back. So it looks like that took five minutes to complete. Next we would like to evaluate the quality of the synthetic data. So let's do that here. So we see we've got this report object that we can print, and we've got several different scores that get measured, and it looks like we're doing pretty good. These are out of a hundred. If you want to look at all the details of the report that was created by Gretel, we can go ahead and display that in our notebook right here. Lots of fun graphs and things to dig into if you're interested to really understand why the scores were given.
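The evaluation step above can be sketched like this, assuming `trained` is the results object returned by the training call. The `report` attribute and its `display_in_notebook()` method are recalled from the `gretel_client` package and are assumptions to confirm in the docs; `review_report` is a hypothetical helper name.

```python
def review_report(trained, in_notebook=False):
    """Print the quality scores and optionally render the full report.

    `trained` is assumed to expose a `report` attribute, as the training
    results object does in the walkthrough.
    """
    # Printing the report shows the summary scores (each out of 100).
    print(trained.report)
    if in_notebook:
        # Assumed method: renders the full report, with graphs, inline.
        trained.report.display_in_notebook()
    return trained.report
```

Keeping the notebook display optional lets the same helper run in a plain script, where only the printed scores are useful.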
Speaker 2 (03:44):
Okay,
Speaker 1 (03:45):
Next, we can actually fetch the synthetic data that was used to create the report if we're interested in looking at it and maybe making some of our own plots. Very cool. Alright, finally, let's say we love our model and we'd like to just start generating synthetic data on demand. So here I'm going to generate a thousand records using the model that we just trained. So again, we get some links to
(04:12)
the docs and to the console. So this will take us back to the model and we can see here that we've got a job that's spun up and it's going to generate synthetic data for us. This will take a couple minutes, so I'll go ahead and pause again. Alright, looks like that took less than a minute. About 43 seconds. Very cool. So that generated a thousand records, which we can quickly take a look at through this synthetic data frame. And so here again, you're free to integrate this into your pipeline, make plots if you want to look at more detail, or do whatever it is you need to do with your synthetic data. So that's a quick walkthrough of the new high-level interface and this Gretel 101 blueprint. If you want to learn more, I encourage you to head over to our docs. So this is docs.gretel.ai, where you can read more about the SDK. And then in particular, I wanted to highlight these SDK blueprints right here. And so the first one, which we just went through, is the Gretel 101 blueprint. Then we've got another one that's more advanced usage for our tabular models. And then there's another blueprint that focuses on text generation. So please, I encourage you to go make a Gretel account and give these blueprints a shot. Alright, so thanks a lot. See you later.
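For reference, the last two steps of the walkthrough (fetching the synthetic data behind the report, then generating a thousand records on demand) can be sketched as below. The calls `fetch_report_synthetic_data`, `submit_generate`, and the `synthetic_data` attribute are recalled from the `gretel_client` package and are assumptions to verify at docs.gretel.ai; the two helper names are hypothetical.

```python
def fetch_report_data(trained):
    """Return the synthetic data that was used to build the quality report."""
    # Assumed method on the training results object.
    return trained.fetch_report_synthetic_data()


def generate_records(gretel, model_id, num_records=1000):
    """Generate `num_records` synthetic records from a trained model.

    `model_id` is assumed to come from the training results object
    (for example `trained.model_id`).
    """
    generated = gretel.submit_generate(model_id, num_records=num_records)
    # Assumed attribute: the generated records as a data frame.
    return generated.synthetic_data
```

Both helpers take the client or results object as an argument, so the returned data can be passed straight into an existing pipeline or plotting code.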