
Open Source

Gretel Unlocks PII Detection with Synthetic Financial Document Dataset

Gretel releases a new synthetic financial document dataset to empower AI developers in building customized and highly performant sensitive data detection systems.

Addressing Concerns of Model Collapse from Synthetic Data in AI

How thoughtful, high-quality synthetic data generation, rather than 'indiscriminate' use, can prevent model collapse.
Copyright © 2022 Gretel.ai

Conditional Text Generation by Fine-Tuning Gretel GPT

Augment machine learning datasets with synthetically generated text and labels using an open-source implementation of GPT-3.

Fine-Tuning CodeLlama on Gretel's Synthetic Text-to-SQL Dataset using Amazon SageMaker JumpStart

Fine-tune CodeLlama with Gretel's Synthetic Text-to-SQL on BIRDBench, achieving a 36% relative improvement in EX and 38% in VES.

Accelerating FinTech Innovation with Natural Language to Code

Train financial LLMs with Gretel's Synthetic Text-to-Python dataset to transform natural language into precise, domain-specific Python code for FinTech.

GLiNER Models for PII Detection through Fine-Tuning on Gretel-Generated Synthetic Documents

Gretel's fine-tuned, synthetically enhanced GLiNER models improve PII and PHI detection; datasets included.

An Awesome Synthetic Multilingual Prompts Dataset

Gretel's latest open synthetic dataset aims to enhance LLM interactions and contributes to the popular 'awesome-chatGPT-prompts' GitHub repository.

Introducing world's largest synthetic open-source Text-to-SQL dataset

Gretel releases the largest open-source Text-to-SQL dataset to accelerate AI model training.

The explosion of small language models (SLMs) and license confusion

Rapid SLM releases highlight the need for clarity on licenses and lineage, which are crucial for enterprises navigating open-weight models and synthetic data ownership.

We just streamlined Gretel’s Python SDK

Discover the streamlined Gretel Python SDK. Start building with synthetic data in just 3 lines of code 🚀

Generate synthetic data in 3 lines of code

Learn the simplest way to generate synthetic data without setting up your own infrastructure and GPUs.

CHANGELOG: Beta2

Here's what we learned about privacy engineering from 50+ companies and hundreds of developers.

Create Synthetic Time-series Data with DoppelGANger and PyTorch

Generate synthetic time series data with Gretel.ai’s open-source PyTorch implementation of DoppelGANger.

Introducing Gretel Blueprints

We are launching Gretel Blueprints, making it easy to anonymize and balance datasets with just a few clicks.

README.V2

We founded Gretel on our belief that data shouldn’t be scary.

Create a Location Generator GAN

How to train a FastCUT GAN on public location data from a few cities to predict realistic e-bike locations across the world.
Copyright (©) 2023 Gretel.

Install TensorFlow and PyTorch with CUDA, cuDNN, and GPU Support in 3 Easy Steps

Set up a cutting-edge environment for deep learning with TensorFlow 2.10, PyTorch, Docker, and GPU support.

Veterans Day Reflections: Open source software and evacuation operations, a remarkable combination.

Quickly and safely aggregate geolocation data for location density analysis using a hexagonal grid system.
Copyright (c) 2021 Gretel

Optuna Your Model Hyperparameters

We explore the popular open-source package Optuna to demonstrate how you can optimize your model hyperparameters and build the best synthetic model possible.

Reducing AI bias with Synthetic data

Generate artificial records to balance biased datasets and improve overall model accuracy.

Synthetic Data Configuration Templates

Our new configuration templates help you choose suitable parameters for training your synthetic data models.
Copyright 2021 Gretel.

Instrumenting Kubernetes in AWS with Terraform and FluentBit

In this blog, we will use Fluent Bit to collect logs from AWS EKS cluster applications.

Innovating With FastText and Table Headers

Look at how FastText word embeddings can help to quickly understand new datasets, and build more consistent labels for your own data.

Auto-anonymize production datasets for development

In this post, we walk through building a data pipeline that will automatically transform datasets so they can be safely used in development environments.

Measure the Quality of any Synthetic Dataset with Gretel Evaluate

Assess the efficacy and quality of synthetic data with the Gretel Evaluate API.

How to use Weights & Biases with Gretel.ai

How to use Weights & Biases’ ML hyperparameter sweeps tool to optimize the accuracy of your synthetic data.