Creating and Using Fine-Tuned Models
Learn how to create and use a fine-tuned model with the OpenAI API.
Let's go through the process of training and implementing a fine-tuned model using the OpenAI API. We'll use a new sample dataset in this section.
Preparing the dataset
We will be fine-tuning the gpt-3.5-turbo model. This requires a training file in JSON Lines (JSONL) format, in which each line is a standalone JSON object holding one example conversation. Here's a simplified example inspired by OpenAI's documentation:
{"messages": [{"role": "system", "content": "Edu is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "What's the capital of the United States?"}, {"role": "assistant", "content": "Washington, D.C., as if everyone doesn't know that already."}]}
{"messages": [{"role": "system", "content": "Edu is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "Who wrote 'Hamlet'?"}, {"role": "assistant", "content": "Oh, just some guy named William Shakespeare. Ever heard of him?"}]}
{"messages": [{"role": "system", "content": "Edu is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "How far is the Moon from Earth?"}, {"role": "assistant", "content": "Around 384,400 kilometers. Give or take a few, like that really matters."}]}
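In the example above, we've introduced a character named Edu, who responds with a blend of facts and sarcasm. The dataset comprises multiple such dialogues. Each one is a JSON object (enclosed in curly braces) with a "messages" list containing a system, a user, and an assistant message. The objects sit on separate lines and are not separated by commas or wrapped in an outer array.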
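To make the file layout concrete, here is a minimal sketch that builds such records in Python and serializes them one JSON object per line. The helper names (make_example, to_jsonl) and the output filename edu_training.jsonl are hypothetical choices for illustration, not part of the OpenAI API.

```python
import json

# System prompt shared by every training example (taken from the sample above).
SYSTEM_PROMPT = "Edu is a factual chatbot that is also sarcastic."


def make_example(question, answer):
    """Build one training record in the chat 'messages' format."""
    return {
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": question},
            {"role": "assistant", "content": answer},
        ]
    }


def to_jsonl(examples):
    """Serialize records as JSON Lines: one object per line, no commas between them."""
    return "\n".join(json.dumps(example) for example in examples) + "\n"


examples = [
    make_example(
        "Who wrote 'Hamlet'?",
        "Oh, just some guy named William Shakespeare. Ever heard of him?",
    ),
    make_example(
        "How far is the Moon from Earth?",
        "Around 384,400 kilometers. Give or take a few, like that really matters.",
    ),
]

jsonl_text = to_jsonl(examples)

if __name__ == "__main__":
    # Hypothetical output path; adjust to wherever the training file should live.
    with open("edu_training.jsonl", "w", encoding="utf-8") as f:
        f.write(jsonl_text)
```

Generating the file with json.dumps rather than by hand avoids quoting mistakes, since apostrophes and quotation marks inside the answers are escaped automatically.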
While these three examples illustrate the format, a robust fine-tuned model will likely require a more extensive dataset. Starting with at least 50 high-quality examples is recommended. However, certain use cases might demand thousands of well-crafted examples for successful fine-tuning.
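Before uploading a training file, it can help to run a quick sanity check on it. The sketch below counts the examples and flags malformed lines. The function name check_dataset and the specific checks are assumptions made for illustration, and the threshold of 50 simply mirrors the recommendation above rather than any limit enforced by the API.

```python
import json

# Recommended starting size, taken from the guidance in the text above.
RECOMMENDED_MIN_EXAMPLES = 50


def check_dataset(jsonl_text):
    """Return (number of valid JSON records, list of human-readable problems)."""
    problems = []
    count = 0
    for line_no, line in enumerate(jsonl_text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError as err:
            problems.append(f"line {line_no}: invalid JSON ({err.msg})")
            continue
        count += 1
        messages = record.get("messages") if isinstance(record, dict) else None
        if not isinstance(messages, list) or not messages:
            problems.append(f"line {line_no}: missing or empty 'messages' list")
            continue
        roles = [m.get("role") for m in messages if isinstance(m, dict)]
        if "assistant" not in roles:
            problems.append(f"line {line_no}: no assistant message to learn from")
    if count < RECOMMENDED_MIN_EXAMPLES:
        problems.append(
            f"only {count} example(s); at least {RECOMMENDED_MIN_EXAMPLES} recommended"
        )
    return count, problems


sample = (
    '{"messages": [{"role": "user", "content": "Who wrote Hamlet?"}, '
    '{"role": "assistant", "content": "Some guy named Shakespeare."}]}\n'
    '{"messages": [}\n'
)
count, problems = check_dataset(sample)
for problem in problems:
    print(problem)
```

A check like this only catches structural problems. It says nothing about whether the examples are well written, which still needs a manual review.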
Doubling the dataset size tends to improve model quality roughly linearly, but beware, as low-quality examples ...