Generate synthetic Taylor Swift-like lyrics using Gretel GPT

Celebrate Taylor Swift's The Eras Tour with lyrics generated with Gretel GPT, our synthetic language model.
Copyright (c) 2022 Gretel.ai

As tickets go on sale for Taylor Swift’s The Eras Tour, we eagerly await the opportunity to see our favorite artist perform live after four long years. Unlike her previous tours, which centered around a single album, the Eras Tour will range over Swift’s entire catalog of over 200 songs. Despite Midnights being released just a few weeks ago, we simply can’t get enough. So, we decided to make some more lyrics by using Gretel GPT to generate synthetic Taylor Swift lyrics.

Gretel GPT is our language synthetics model for generating natural language text. The underlying model is a generative pre-trained transformer (GPT) built on an open-source implementation of OpenAI’s GPT-3 architecture. GPT can be used to create high-quality, coherent text … and in this case, Taylor Swift lyrics. Since large-scale language models may produce explicit content, we recommend having a human (ideally Taylor herself) curate or filter the outputs, both to censor undesirable content and to improve the quality of the results. Craving more Taylor as you wait for the Eras Tour or for her next album release? Follow along with our notebook and generate your own synthetic Taylor Swift lyrics!

Load and preview training data

After logging in with your Gretel API key, we’ll preview our training data. GPT accepts a single-column file with natural language data such as reviews, tweets, or conversations. Here, our training data is a single-column file containing the lyrics to Taylor Swift songs.
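
If you'd like to train on your own text instead, a single-column file can be put together with a few lines of standard-library Python. This is only a sketch: the helper name write_single_column_csv and the placeholder records are ours for illustration, and the 'text' header simply mirrors the column name used in this notebook.

```python
import csv
import io


def write_single_column_csv(texts, handle, column="text"):
    """Write one text record per row under a single header column.

    Hypothetical helper for illustration; returns the number of rows written.
    """
    writer = csv.writer(handle, quoting=csv.QUOTE_ALL)
    writer.writerow([column])
    count = 0
    for text in texts:
        cleaned = text.strip()
        if cleaned:  # skip records that are empty or whitespace only
            writer.writerow([cleaned])
            count += 1
    return count


# Placeholder records (not real lyrics); each record may span several lines
records = [
    "First line of song one\nSecond line of song one",
    "   ",
    "Only line of song two",
]

buffer = io.StringIO()  # stands in for a real file opened with newline=""
rows_written = write_single_column_csv(records, buffer)

# Read the data back to confirm the single-column layout
buffer.seek(0)
rows = list(csv.reader(buffer))
print(rows_written, rows[0])
```

Quoting every field keeps a multi-line record, such as a full song, inside a single row.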

import pandas as pd

DATASET_PATH = 'https://gretel-public-website.s3.us-west-2.amazonaws.com/datasets/taylor_swift_lyrics/TaylorSwiftLyrics.csv'
df = pd.read_csv(DATASET_PATH, usecols=['text'])

# Print human-friendly preview of training data
print(df['text'][0])

The first row of the training data contains the lyrics to the record-breaking hit “All Too Well (10 Minute Version) (Taylor’s Version) (From The Vault).”

I walked through the door with you, the air was cold
But somethin' 'bout it felt like home somehow
And I left my scarf there at your sister's house
And you've still got it in your drawer, even now

Oh, your sweet disposition and my wide-eyed gaze
We're singin' in the car, getting lost upstate
Autumn leaves fallin' down like pieces into place
And I can picture it after all these days

And I know it's long gone and
That magic's not here no more
And I might be okay, but I'm not fine at all
Oh, oh, oh

Before we get distracted thinking about how Jake Gyllenhaal needs to give Taylor her scarf back … let's move on to model configuration.

Model configuration

The next step in our synthetic lyric quest is creating a configuration file for our model. The configuration file is a set of instructions that tells the model how to train. Before training on your data, Gretel GPT loads the model specified by the pretrained_model parameter. The pretrained_model can be any valid GPT model in the Hugging Face model repository. In this notebook, we’ll load EleutherAI’s GPT-Neo as our pre-trained model and fine-tune it on real Taylor Swift lyrics to generate our synthetic lyrics.

config = {
  "models": [
    {
      "gpt_x": {
        "data_source": "__",
        "pretrained_model": "EleutherAI/gpt-neo-125M",
        "batch_size": 4,
        "epochs": 3,
        "weight_decay": 0.01,
        "warmup_steps": 100,
        "lr_scheduler": "linear",
        "learning_rate": 0.0002,
        "validation": 5
      }
    }
  ]
}
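
To get a feel for how learning_rate, warmup_steps, and lr_scheduler interact, here is a rough sketch of a linear schedule with warmup. We're assuming "linear" follows the common convention (ramp up from zero over the warmup steps, then decay linearly to zero), which the config above doesn't spell out. The function name and the total_steps value of 1,000 are made up for illustration; the real number of steps depends on the dataset size, batch_size, and epochs.

```python
def linear_schedule_lr(step, base_lr=0.0002, warmup_steps=100, total_steps=1000):
    """Learning rate at a given step under linear warmup then linear decay.

    Assumption: mirrors the usual linear-with-warmup convention; the exact
    behavior inside Gretel GPT is not confirmed here.
    """
    if step < warmup_steps:
        # Warmup phase: scale up from 0 toward base_lr
        return base_lr * step / max(1, warmup_steps)
    # Decay phase: scale down from base_lr toward 0 at total_steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Print the learning rate at a few points in a hypothetical 1,000-step run
for step in (0, 50, 100, 550, 1000):
    print(f"step {step:>4}: lr = {linear_schedule_lr(step):.6f}")
```

Under these assumptions the rate peaks at the configured learning_rate exactly when warmup ends, so a larger warmup_steps delays the point at which the model takes its biggest updates.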

Train the GPT model

Creating and submitting our model for training takes just a few lines of code. Our model config is the file we defined above, and our data source is the training data containing real Taylor Swift lyrics. 

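from gretel_client.helpers import poll
from gretel_client.projects import create_or_get_unique_project

# Designate project
PROJECT = 'taylor-swift-lyrics'
project = create_or_get_unique_project(name=PROJECT)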

# Create and submit model
model = project.create_model_obj(model_config=config, data_source=df)
model.name = f"{PROJECT}-gpt"
model.submit_cloud()

poll(model)

Now, our GPT model has begun training! We can use the poll(model) function to track its progress. 

Training progress is shown in real time.

Generate lyrics

Once the model finishes training, it's time to generate some Taylor Swift lyrics. Gretel GPT supports both unconditional and conditional generation modes. If you’re interested in learning more about conditional text generation, check out our blog on the topic. For this example, we will use unconditional generation to really give synthetic Taylor full artistic control.

params = {"maximum_text_length": 200, "top_p": 0.95, "num_records": 1}
record_handler = model.create_record_handler_obj(params=params)
record_handler.submit_cloud()
poll(record_handler)

Once again, we use the poll function to track the model’s progress. 

Successful record generation with Gretel Cloud.
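
Curious what the top_p value in the generation parameters does? In most GPT-style generators it controls nucleus sampling: at each step, only the most likely tokens whose probabilities add up to top_p are kept, and the next token is drawn from that reduced set. The toy sketch below illustrates the idea under that assumption. The function names and the four-word "vocabulary" are invented for the example, and this is not a description of Gretel GPT's internals.

```python
import random


def nucleus_filter(probs, top_p=0.95):
    """Keep the smallest set of top tokens whose cumulative probability
    reaches top_p, then renormalize so the kept probabilities sum to 1."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    kept, cumulative = [], 0.0
    for token, prob in ranked:
        kept.append((token, prob))
        cumulative += prob
        if cumulative >= top_p:
            break
    total = sum(prob for _, prob in kept)
    return {token: prob / total for token, prob in kept}


def sample_token(probs, top_p=0.95, rng=random):
    """Draw one token from the nucleus-filtered distribution."""
    filtered = nucleus_filter(probs, top_p)
    tokens = list(filtered)
    weights = [filtered[token] for token in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


# Toy next-token distribution (invented numbers, not model output)
toy_probs = {"love": 0.60, "car": 0.25, "rain": 0.12, "spreadsheet": 0.03}

filtered = nucleus_filter(toy_probs, top_p=0.95)
print(sorted(filtered))  # the unlikely tail token is dropped
print(sample_token(toy_probs, top_p=0.95, rng=random.Random(13)))
```

Lower values of top_p trim more of the unlikely tail, which tends to make output safer but more repetitive; values close to 1 leave synthetic Taylor more room to surprise you.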

View results

Now that generation has completed successfully, we can take a look at our results. 

gpt_output = pd.read_csv(record_handler.get_artifact_link("data"), compression='gzip')
print(gpt_output['text'][0])

I've been up early, but I'm still late to the party
I didn't really have time for much conversation
I've made little appearances in a few places
I've lost sight of what the party is like
I've never had the time to meet so much people
I'm down here in a pub
My husband's a thief and a murderer
And what do you think that makes me, in your name
Your life could be all yours
Oh, it's a hard way to feel like I'm not around
But I know that it's hard to feel alive

While clearly not up to the writing skills of Ms. Swift (can anything be?), Gretel GPT produced some Swift-esque lyrics. The synthetic lyric “My husband’s a thief and a murderer” gives similar vibes to evermore’s murder mystery song “no body, no crime.” “I’m down here in a pub” reminds us of Lover’s “London Boy.”

Some other lyrics we generated include:

  • I was on an empty stomach; It was all a very bad idea
  • 'Cause this is the last time I'm afraid; To come to blows with you on a train
  • But I guess the only thing that keeps me awake is me
  • And I realized it was the first time; That I had heard that sound before
  • I thought of you, your beautiful eyes and your sweet voice and all that you have in the world
  • We have a history that can't be told
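
Since we recommend having a human curate the outputs, a simple first pass can set aside lines for review before anything gets shared. The sketch below is one hedged way to do that with a keyword blocklist. The helper name flag_for_review and the two blocklisted words are placeholders we picked for the example, and a keyword match is no substitute for actually reading the lyrics.

```python
import re


def flag_for_review(lines, blocklist):
    """Split lines into (approved, flagged) based on whole-word matches
    against a blocklist. Hypothetical helper; matching is case-insensitive."""
    if not blocklist:
        return list(lines), []
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(word) for word in sorted(blocklist)) + r")\b",
        re.IGNORECASE,
    )
    approved, flagged = [], []
    for line in lines:
        if pattern.search(line):
            flagged.append(line)
        else:
            approved.append(line)
    return approved, flagged


# A few of the synthetic lines generated earlier in this post
generated_lines = [
    "I've been up early, but I'm still late to the party",
    "I'm down here in a pub",
    "My husband's a thief and a murderer",
    "We have a history that can't be told",
]

# Placeholder blocklist; choose terms that suit your own use case
approved, flagged = flag_for_review(generated_lines, {"murderer", "thief"})
print("Needs a human look:", flagged)
```

Flagged lines are kept rather than deleted, so the final call stays with the human reviewer.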

Conclusion

We’re excited to see your results! Tweet us @gretel_ai with your best synthetic Taylor lyrics and the hashtags #TSTheErasTour and #SyntheticTaylor for a chance to win a free Gretel t-shirt. 

What should we try next? Perhaps we'll generate Taylor Swift album covers using our upcoming image synthetics. If you’d like to explore other sample notebooks, check out Gretel Blueprints.