Gretel Navigator is Now Generally Available

Gretel announces general availability of Gretel Navigator, the world’s first compound AI system purpose-built to design high-quality tabular datasets from scratch
Copyright (c) Gretel

Gretel is proud to announce the general availability of Gretel Navigator, the world’s first agent-based, compound AI system purpose-built to design high-quality tabular datasets. Until now, acquiring data has not been a simple task. Data acquisition efforts strain organizational resources, come at exorbitant costs, introduce significant delays to projects, and are accompanied by a myriad of privacy concerns over appropriate data use.

With Navigator, anyone can iteratively create, edit, and augment high-quality tabular datasets with a simple prompt. It makes creating quality tabular datasets in seconds a real possibility and removes the data bottleneck for powering the next generation of AI.

All this is made possible by Gretel’s compound AI architecture, which leverages multiple LLMs, including Gretel’s own models trained on datasets from more than ten different industry domains, to bring quality and speed to tabular data generation.

Since its release in beta last November, community adoption of Gretel Navigator has grown significantly. With more than 10,000 service requests, today the Gretel community uses Navigator for a diverse range of use cases, including: training foundation LLMs and industry-focused SLMs, fine-tuning language models for new tasks, creating RAG model evaluation datasets, building intelligent applications, and even powering personalized demos for solution engineering teams.

Alongside community usage, we are proud to announce enterprise launch partners, including frontier AI teams at Databricks, EY, Google, and Microsoft, as well as startups such as Athena Intelligence and Dataclay. With GA, we are also introducing new features to Navigator, including the ability to edit existing datasets, additional support for upgraded state-of-the-art models, and much more.

Gretel Navigator Capabilities

Gretel Navigator offers a straightforward way to design data iteratively through a responsive experience, ensuring the data meets your needs in real time. Navigator enables you to:

  • Easily and quickly generate tabular data from scratch using a text prompt or description
  • Generate data from a schema, whether written in SQL, JSONL, or another format
  • Generate data from a sample dataset

Now with general availability, Navigator is capable of enriching existing datasets by filling in missing fields or by adding or editing columns and data fields.

All this is available in our easy-to-use playground console interface as well as a fully featured inference SDK.

Industry Leading Public Datasets Designed With Navigator

Just last month, we drummed up excitement through our release of the world’s largest Text-to-SQL dataset. This synthetic dataset, generated by Gretel Navigator, is now powering some of the world’s leading foundation models, and has almost 5k downloads from the open-source community.

Continuing our commitment to democratizing data access, we announced the release of an industry-validated PII dataset for the financial services industry. This dataset promises significant improvements for NER models used to identify sensitive entities in large datasets and files. Historically, these models have struggled to detect industry-specific PII. With better training data, we’re excited to see the performance of these models rapidly improve.

In other exciting dataset news, we will also be announcing soon our riff on the most popular dataset on HuggingFace - “Synthetic Multilingual LLM Prompts”, as well as an application for generating question-answer truth pairs for evaluating RAG models from input documents—again, all generated by Navigator from the Gretel team.

What’s New — Highlights

Navigator is getting even better with general availability. You can now select the state-of-the-art LLM best suited to your needs, including both Gretel and partner LLMs, to serve as the primary model called for data generation. For our power users, we have also increased the maximum number of rows generated in a single job to 10k and would love to hear from those looking for even larger data generation capacities.

In addition to these improvements under the hood, our cloud console now also has a module for uploading example data. Uploading your own example datasets as a prompt to a model simplifies prompt engineering: it takes the art out of writing prompts and lets you show Navigator the exact type of table you would like it to draw inspiration from. We’re excited to see what the Gretel community achieves with these new Navigator capabilities and will continue to push out even more capabilities in the coming months.

Ready to design your first dataset?

You can access Gretel Navigator in the Gretel Console or using our SDK. If you are looking for inspiration, check out our many walkthrough blogs and on-demand workshops featuring unique applications of Gretel Navigator, including RAG model evaluation with synthetic data, speeding up LLM development, and more.

Contributors (alphabetical by last name):

Danielle Ali, Kendrick Boyd, Erica Edge, Marjan Emadi, Stefan Gavrilovic, Johnny Greco, Matt Grossman, Alexa Haushalter, Arron Hunt, Murtaza Khomusi, Nikko Kolean, Ashley Langan, Yev Meyer, Piotr Mlocek, Manjesh Mogallapalli, Ashley Murray, John Myers, Dhruv Nathawani, Drew Newberry, Anastasia Nesterenko, Nicole Pang, Lipika Ramaswamy, Alex Ray, Laura Steadman, Amy Steier, Sami Torbey, Maarten Van Segbroeck, Ivy Wang, Nathan Walston, Alex Watson, Nina Xu