Gretel Navigator - Introducing SDK Support and Tabular Edit Mode
Video description
Two new features that enable you to easily edit or augment your existing data using Gretel Navigator
Read the blog post
Transcription
Speaker 1 (00:00):
Hey, this is Alex from Gretel. Today we're going to walk through two exciting new features that we're launching as part of Tabular LLM. The first is the ability to interact with the Tabular LLM model via a notebook, SDK, or API; as you can see, we'll walk through a notebook here that interacts with the Gretel Tabular LLM service. The second use case is the ability to edit or augment your existing data. So if you want to add a new column to an existing dataset, clean values, or fill in missing values, the edit mode is going to be your friend. We'll walk through both examples. First, we install the Gretel client dependencies. Next, we configure the Gretel session. You're going to need an API key here, so you go over to the Gretel Console, which you can sign into from gretel.ai, come back, and paste your API key in here. We'll go ahead and get started. The next step is creating a model. We're going to instantiate the Gretel Tabular LLM model inside of our project. That takes just a second: it's going to fire up a cloud worker and instantiate the model, which we can then use for inference once it's set up.
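For reference, here is a minimal sketch of that setup using the Gretel Python SDK. The project name and the Tabular LLM model config identifier are assumptions for illustration; use the values from your own console and the current docs.

```python
# Minimal setup sketch (assumes: pip install -U gretel-client)
from gretel_client import configure_session
from gretel_client.helpers import poll
from gretel_client.projects import create_or_get_unique_project

# "prompt" interactively asks for the API key you copied from the Gretel Console.
configure_session(api_key="prompt", validate=True)

# Project name is hypothetical; any unique name works.
project = create_or_get_unique_project(name="tabular-llm-demo")

# The "tabular-llm" config name is a placeholder assumption; use the
# model config referenced in the notebook/docs for your account.
model = project.create_model_obj(model_config="tabular-llm")
model.submit_cloud()  # fires up a cloud worker to host the model
poll(model)           # wait until the model is ready for inference
```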
(01:07)
Done. Now we're going to go ahead and prompt the model, with the same example that you've seen in previous videos and that I think we have as one of our console examples: creating a mock dataset of users from France. Here you can see we're creating the prompt as a string. We're also passing in the LLM params that you would configure as part of the playground. You can see we're asking for 10 records and setting the LLM temperature to 0.8; you can turn that up a little bit if you want more variation or variety in your records. The model is working on it right now. It's not quite as fast as our inference APIs, because it's using batch mode on the backend, but it scales to whatever size you need. And here we can see it's returned a nice set of results, which looks exactly like what we're looking for. So that's the first example. Now we'll move into how we can augment an existing dataset. What we'll do here is take the exact dataset we just generated and give the model a prompt that says: add a new column, initials, which contains the initials of the person. You can see we're taking this DataFrame and passing it in as the reference data variable to our SDK.
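As a rough sketch of those two calls, continuing from the setup above: the param names (prompt, num_records, temperature) are assumptions based on what the video shows in the playground, not a definitive API reference.

```python
import pandas as pd
from gretel_client.helpers import poll

# 1) Generation: prompt the model for a fresh dataset.
generate = model.create_record_handler_obj(
    params={
        "prompt": "Create a mock dataset of users from France, with name, "
                  "email, city, and age columns.",
        "num_records": 10,   # how many rows to generate
        "temperature": 0.8,  # raise for more variation in the records
    }
)
generate.submit_cloud()
poll(generate)
df = pd.read_csv(generate.get_artifact_link("data"), compression="gzip")

# 2) Edit/augment: pass the table we just generated as reference data
#    and ask for a new derived column.
edit = model.create_record_handler_obj(
    data_source=df,  # the DataFrame is passed in as the reference data
    params={"prompt": "Add a new column, initials, which contains the "
                      "initials of the person."},
)
edit.submit_cloud()
poll(edit)
edited_df = pd.read_csv(edit.get_artifact_link("data"), compression="gzip")
```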
(02:16)
Give it a second and let that run.
(02:22)
And we can see the model picked it up and filled out the initials very quickly, so we got the initials right there. This will scale to whatever size datasets you need to run it against, so I would encourage scaling it up and sharing feedback as you go. Finally, getting into a more advanced use case: essentially using our model to create complex answers to questions, high-diversity responses, things like that. This is particularly useful if you are training your own LLM, essentially using our model to create high-quality synthetic examples that can be used to train your own. We'll dive into this a lot more in future videos and content that we build. But what we can see here is that we're asking a set of relatively complex questions; these came from the Grade School Math 8K (GSM8K) dataset. We're asking our model, in the prompt here, to add a new column, answer, which contains a detailed step-by-step answer to the question in each row. You can see it went through here pretty fast: it took the data we gave it and gives you a high-quality step-by-step answer to each query. So really fast labeling of data becomes possible when you can interact with the model in this kind of tabular mode.
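The labeling pattern is the same edit-mode call as before, sketched here under the same assumptions; the questions_df variable and its "question" column are hypothetical stand-ins for however you load GSM8K.

```python
import pandas as pd
from gretel_client.helpers import poll

# questions_df: a hypothetical DataFrame with a "question" column,
# e.g. loaded from the GSM8K dataset.
label = model.create_record_handler_obj(
    data_source=questions_df,
    params={
        "prompt": "Add a new column, answer, which contains a detailed "
                  "step-by-step answer to the question in each row.",
    },
)
label.submit_cloud()
poll(label)
labeled_df = pd.read_csv(label.get_artifact_link("data"), compression="gzip")
```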