Fine-tuning GPT-3 to Write in the Style of George R.R. Martin (Or Any Author)

Ashok Poudel
10 min read · Feb 24, 2023


Overview

The goal of this article is to help anyone interested fine-tune the GPT-3 language model to generate text in the voice and style of George R.R. Martin, or almost any author for that matter. For simplicity, we will stick with George here. To accomplish this, we will need a corpus of Martin's books or articles, prepared for use as training data. The fine-tuning process will involve using OpenAI's API to train the GPT-3 model on the text corpus, as well as experimenting with different fine-tuning methods and hyper-parameters to optimize the model's performance.

The flow chart below illustrates the general flow of the process.

Preparing the Text Corpus

The first step in this project is to prepare the text corpus of Martin’s books for use as training data. This involves several steps:

Digitizing the books

First, we will need to use a book scanner or similar tool to convert the physical books into digital form. This may involve OCR (optical character recognition) technology to recognize the text from the scanned pages. Just make sure to check the legality of doing so first.

Then, we will need to save the results as .txt files so that we can later read them and prepare our dataset.
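
As a rough illustration, here is a minimal OCR sketch using the pytesseract and Pillow libraries. Both the scans folder and the output path are hypothetical placeholders, and the Tesseract binary must be installed on your system:

import os
from PIL import Image       # pip install pillow
import pytesseract          # pip install pytesseract (requires the Tesseract binary)

scans_dir = "./data/scans"        # hypothetical folder of scanned page images
output_path = "./data/raw/1.txt"  # one .txt file per book

pages = []
for filename in sorted(os.listdir(scans_dir)):
    if filename.lower().endswith((".png", ".jpg", ".jpeg", ".tiff")):
        image = Image.open(os.path.join(scans_dir, filename))
        # Recognize the text on this page and collect it
        pages.append(pytesseract.image_to_string(image))

os.makedirs(os.path.dirname(output_path), exist_ok=True)
with open(output_path, "w", encoding="utf-8") as f:
    f.write("\n".join(pages))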

Cleaning and preprocessing the text

The digitized text may contain errors and formatting inconsistencies that need to be cleaned and standardized. You can use tools such as regular expressions and text editors to accomplish this; it is crucial that the final text is as error-free as possible. Some key pre-processing tasks for our case are listed below, followed by a small cleaning sketch:

  • Removing non-textual content: This can include removing HTML tags, images, and other non-textual content that may be present in the data.
  • Spell checking and correction: This involves identifying and correcting spelling errors in the text, which can improve the readability and accuracy of the data.
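
As a rough sketch of the first task above, the snippet below strips HTML tags and normalizes whitespace with regular expressions. The ./data/clean output folder is a hypothetical placeholder; in practice you might simply clean the files in place before the chunking step below. Spell checking is best left to a dedicated tool or a manual pass.

import os
import re

def clean_text(text: str) -> str:
    # Strip simple HTML tags such as <p> or <br/>
    text = re.sub(r"<[^>]+>", " ", text)
    # Collapse runs of whitespace and trim the result
    return re.sub(r"\s+", " ", text).strip()

os.makedirs("./data/clean", exist_ok=True)
with open("./data/raw/1.txt", "r", encoding="utf-8") as f:
    raw = f.read()
with open("./data/clean/1.txt", "w", encoding="utf-8") as f:
    f.write(clean_text(raw))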

Chunking the text

The full text of Martin's books may be too large to train the GPT-3 model on directly. Therefore, we will need to divide the text into smaller chunks or segments that can be processed more efficiently. This may involve using Python to split the text into chapters or paragraphs.

Assuming that all our raw files are in the folder ./data/raw/ (1.txt, 2.txt, etc.):

import os
import json

input_dir = "./data/raw"
output_dir = "./data/chunks"
chunk_size = 500  # adjust as needed

if not os.path.exists(output_dir):
    os.makedirs(output_dir)

for filename in os.listdir(input_dir):
    with open(os.path.join(input_dir, filename), "r") as f:
        text = f.read()
    # Split the text into fixed-size chunks
    chunks = [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    for i, chunk in enumerate(chunks):
        # Use the previous chunk as the prompt for the current chunk
        prompt = chunks[i - 1] if i > 0 else ""
        with open(os.path.join(output_dir, f"{filename}_{i}.json"), "w") as out_file:
            # json.dumps handles quoting and escaping for us
            out_file.write(json.dumps({"prompt": prompt, "completion": chunk}))

The above will chunk the text into prompt-completion pairs, where each prompt is the chunk that precedes the completion:

{
"prompt":"It was not until they were mounted and on their way that Bran allowed himself to taste the sweet air of victory.",
"completion":"By then, his pup was snuggled inside his leathers, warm against him, safe for the long ride home. Bran was wondering what to name him."
}

Note that the above is only one method of chunking. We will come to other methods in the section below.

Fine-tuning Methods

There are several different techniques that can be used to fine-tune the GPT-3 model on the text corpus. Here are three potential methods:

Chunk-based prompts

This method involves providing the GPT-3 model with a chunk of text from Martin’s books as a prompt, and then asking the model to generate additional text that follows the same voice and style. The length of the prompt can vary, depending on the desired level of control over the generated text.

Example:

Prompt: “The desert stretched out before us, vast and silent, like a…”
Completion: “…hollow bowl waiting to be filled with light and sound. The sand was pale and cool to the touch, and the air was still except for the distant call of a coyote. We set up camp in the shadow of a rocky outcropping, and as the stars came out, we felt the weight of the desert’s ancient history bearing down on us.”

Keyword-based prompts

This method involves providing the GPT-3 model with a set of keywords or phrases that are characteristic of Martin's voice and style, and then asking the model to generate text that incorporates these keywords. This method allows for more flexibility in the generated text, but may require more experimentation to achieve the desired results.

Example:

Prompt: “The monkey wrench gang”
Completion: “roamed the desert like outlaws, their hearts aflame with a fierce love for the land. They fought against the forces of progress and development, and in doing so, they became legends. But even as they battled against the bulldozers and the chainsaws, they knew that their fight was far from over.”

Random sampling

This method involves allowing the GPT-3 model to generate text randomly, without any specific prompts or constraints. This method can be useful for exploring the full range of the model’s capabilities, but may produce text that is less coherent or consistent with Martin's voice and style.

Example:

Prompt: ""
Completion: “The sun beat down on the canyon walls, casting long shadows across the dry riverbed. A hawk circled overhead, searching for prey, while a snake slithered through the rocks. We had been walking for hours, and the heat was starting to get to us. But we pressed on, driven by a sense of purpose that we couldn’t quite define. As we reached the top of the ridge, we saw a vista that took our breath away. In the distance, the mountains rose up like a row of jagged teeth, their peaks wreathed in clouds.”

For our purpose, I recommend that we use chunk-based and keyword-based prompts. Random sampling can be useful for generating longer pieces of text, such as essays or book chapters, but it may require more filtering and editing to ensure that the generated text stays consistent with the target author's style. It's also important to note that because random sampling is not constrained by any specific prompts or inputs, the quality and relevance of the generated text may vary widely. Therefore, it's important to experiment with different techniques and evaluate the quality of the generated text using objective and subjective measures.

Tuning Hyper-parameters

The performance of the fine-tuned GPT-3 model will depend on several hyper-parameters, such as the learning rate, batch size, and number of epochs. You will need to experiment with different hyper-parameter settings to optimize the model's performance. Here are some guidelines for tuning the hyper-parameters:

Learning rate: The learning rate determines how quickly the model adapts to the training data. A higher learning rate can lead to faster convergence, but may also result in over-fitting or instability. You may need to experiment with different learning rates to find the optimal setting.

Batch size: The batch size determines how many examples are processed at once during each training iteration. A larger batch size can lead to faster training, but may also require more memory and processing power. You may need to adjust the batch size based on the available resources and training data size.

Number of epochs: The number of epochs determines how many times the model sees the training data during training. More epochs can improve the fit, but too many may cause over-fitting, especially on a small corpus.

Sequence length: This is the length of the input sequence that the model will process. A longer sequence length can capture more context, but it can also require more computational resources. A shorter sequence length can make the training process faster, but it may not capture enough context.

For your specific use case of fine-tuning GPT to write in the style of George R.R. Martin, I would recommend the following hyperparameters:

Batch Size: A value between 4 and 16 is a good starting point. We can start with 4 and then experiment with its multiples.

Learning Rate: A value between 1e-4 and 5e-5 should work well. You can experiment with different learning rates to find the best one for your specific task.

Number of Epochs: Start with a low number, such as 3–5, and increase gradually if needed. It’s important to monitor the loss on a validation set during training and stop when the loss stops improving.

Sequence Length: This is the length of the input sequences used for training. For text generation tasks, longer sequences generally lead to better results. However, longer sequences also require more memory and can slow down training. A value between 128 and 512 should work well.

Temperature: This is a hyperparameter used during text generation to control the creativity of the generated text. A value between 0.5 and 1.0 is a good starting point. Lower temperatures lead to more conservative, predictable text, while higher temperatures lead to more creative, unpredictable text.

It’s important to note that these hyperparameters are not set in stone and may need to be adjusted based on the specifics of your task and dataset. It’s always a good idea to experiment with different hyperparameters and monitor the results to find the best settings for your particular use case.
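
To make this concrete, the OpenAI fine-tuning CLI used later in this article lets you pass the number of epochs and the batch size directly; the learning rate is exposed as a learning_rate_multiplier rather than an absolute value, and sequence length and temperature are not fine-tuning flags (temperature is set at generation time). The values below are illustrative starting points only:

openai api fine_tunes.create \
  -t data/prepared-dataset.jsonl \
  -m curie \
  --n_epochs 4 \
  --batch_size 4 \
  --learning_rate_multiplier 0.1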

Preparing the Data

The next step involves preparing the data for training the GPT model. The three books by George R.R. Martin that you have should be scanned and converted into editable text. This text will then need to be preprocessed to remove any unnecessary characters or formatting. The resulting text should be split into smaller, more manageable chunks of text, such as paragraphs or sentences, that can be fed into the GPT model.

The final dataset should be in the format below [data/dataset.jsonl]

{"prompt": "<prompt text>", "completion": "<ideal generated text>"}

{"prompt": "<prompt text>", "completion": "<ideal generated text>"}

{"prompt": "<prompt text>", "completion": "<ideal generated text>"}
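
To go from the per-chunk JSON files created earlier to a single data/dataset.jsonl in this format, a minimal merge script might look like the following (it assumes the ./data/chunks folder produced by the chunking step above):

import os
import json

chunks_dir = "./data/chunks"
dataset_path = "./data/dataset.jsonl"

with open(dataset_path, "w", encoding="utf-8") as out_file:
    for filename in sorted(os.listdir(chunks_dir)):
        if not filename.endswith(".json"):
            continue
        with open(os.path.join(chunks_dir, filename), "r", encoding="utf-8") as f:
            example = json.load(f)
        # One JSON object per line, as the JSONL format requires
        out_file.write(json.dumps(example) + "\n")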

Fine-tuning the GPT Model

Once the data is prepared, the next step is to fine-tune the GPT model. This involves feeding the preprocessed text into the GPT model and using it to generate text that is similar in style to George R.R. Martin’s writing. During the fine-tuning process, you will need to experiment with various hyperparameters such as the learning rate, batch size, and the number of training epochs to achieve the desired level of accuracy.

Technical Guidelines

Prerequisites:

  • An OpenAI API key with access to the GPT-3 model.
  • Python 3 installed on your system.
  • Basic knowledge of the command line and Python.

Installation

In the terminal run the following:

pip install --upgrade openai

and

export OPENAI_API_KEY="<OPENAI_API_KEY>"

Data Preparation and Dataset Validation

openai tools fine_tunes.prepare_data -f data/dataset.jsonl

Fine-tune

openai api fine_tunes.create -t <TRAIN_FILE_ID_OR_PATH> -m <BASE_MODEL>

For example, with our prepared dataset and the curie base model:

openai api fine_tunes.create -t data/prepared-dataset.jsonl -m curie

Note: training the model can take minutes or hours depending on the model and dataset size.

Other helpful commands

# If the event stream is interrupted for any reason, you can resume it by running

openai api fine_tunes.follow -i <YOUR_FINE_TUNE_JOB_ID>

# List all created fine-tunes

openai api fine_tunes.list

# Retrieve the state of a fine-tune. The resulting object includes

# job status (which can be one of pending, running, succeeded, or failed)

# and other information

openai api fine_tunes.get -i <YOUR_FINE_TUNE_JOB_ID>

# Cancel a job

openai api fine_tunes.cancel -i <YOUR_FINE_TUNE_JOB_ID>
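
Once the job succeeds, the fine-tuned model can be used like any other completion model. The command below follows the standard OpenAI fine-tuning workflow; <FINE_TUNED_MODEL> is the model name reported when the job finishes:

# Generate a completion from the fine-tuned model
openai api completions.create -m <FINE_TUNED_MODEL> -p "The desert stretched out before us, vast and silent, like a"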

Evaluating the Model

After the GPT model has been fine-tuned, the next step is to evaluate its performance. This involves generating new text using the fine-tuned model and comparing it with the original text by George to assess the accuracy of the model. You can also use various metrics such as perplexity, BLEU score, or human evaluation to measure the performance of the GPT model.
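
As one illustration of the automatic metrics mentioned above, the sketch below computes a sentence-level BLEU score with NLTK between a reference passage and a generated one. The two strings are placeholders, and BLEU is only a rough proxy for stylistic similarity, so treat it alongside human evaluation rather than instead of it.

from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction  # pip install nltk

reference_text = "By then, his pup was snuggled inside his leathers, warm against him, safe for the long ride home."
generated_text = "His pup was tucked inside his leathers, warm against his chest, safe for the ride home."

# Simple whitespace tokenization keeps the example self-contained
reference = [reference_text.lower().split()]
candidate = generated_text.lower().split()

# Smoothing avoids zero scores when higher-order n-grams do not overlap
score = sentence_bleu(reference, candidate, smoothing_function=SmoothingFunction().method1)
print(f"BLEU: {score:.3f}")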

Iterating and Improving

Finally, based on the results of the evaluation, you can iterate and improve the GPT model further. This may involve fine-tuning the model again with different hyperparameters, adjusting the amount and quality of the training data, or incorporating new techniques such as transfer learning to improve the model’s accuracy.

FastAPI integration

After fine-tuning your GPT model, you may want to make it available for use in a production environment. One way to achieve this is by integrating it into a web API using a framework like FastAPI. With FastAPI, you can easily create an HTTP endpoint that takes input text as a request and returns the generated output from the GPT model as a response.

from fastapi import FastAPI
from pydantic import BaseModel
import openai

# initialize OpenAI API key and fine-tuned model ID
openai.api_key = "<YOUR_OPENAI_API_KEY>"
model_id = "<YOUR_FINE_TUNED_MODEL_ID>"

# create FastAPI instance
app = FastAPI()

# define input and output data models
class TextRequest(BaseModel):
    text: str

class TextResponse(BaseModel):
    text: str

# define API endpoint
@app.post("/generate-text", response_model=TextResponse)
async def generate_text(request: TextRequest):
    # generate a completion from the fine-tuned model
    # (fine-tuned models are passed via the model parameter)
    completion = openai.Completion.create(
        model=model_id,
        prompt=request.text,
        max_tokens=1024,
        n=1,
        stop=None,
        temperature=0.7,
    )

    # return the generated text as a response
    return TextResponse(text=completion.choices[0].text)
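
A minimal way to run and exercise this endpoint locally, assuming the file above is saved as main.py (a name chosen here for illustration) and uvicorn is installed:

# Start the API server
uvicorn main:app --reload

# Send a prompt and receive the generated continuation
curl -X POST http://127.0.0.1:8000/generate-text \
  -H "Content-Type: application/json" \
  -d '{"text": "The desert stretched out before us, vast and silent, like a"}'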

In conclusion, teaching (fine-tuning) GPT-3 to mimic any author's writing style means dealing with large amounts of data, and processing that data efficiently can be a challenging task, but with the right tools and techniques it becomes much more manageable. In this article, we discussed how to prepare and clean a text corpus, split it into prompt-completion chunks, merge those chunks into a single JSONL file, fine-tune and evaluate the model, and serve it behind a FastAPI endpoint.

If you find this process overwhelming or if you need assistance with your data processing tasks, please feel free to contact me through my LinkedIn profile: https://www.linkedin.com/in/ashokpoudel/. I am always available to provide consultation and support for your data-related needs.
