Model Selection
Learn about the process of selecting appropriate LLMs for fine-tuning.
Choosing the right LLM for fine-tuning
Before fine-tuning an LLM for a specific task, it's crucial to weigh several factors when selecting a base model. Let's look at the initial considerations.
Model size
LLMs are available in various sizes, and the size of a model directly affects its computational demands. Larger models typically offer better performance but require substantial computational power to operate. Depending on our requirements, we might opt for a smaller model such as GPT-2, whose base variant has 124 million parameters and is lightweight enough to run on modest hardware, or choose a more powerful option such as Llama 2, whose largest variant has 70 billion parameters and delivers correspondingly stronger performance.
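To make the size comparison concrete, here is a minimal sketch (assuming the Hugging Face transformers library) that loads a candidate model and counts its parameters, a quick sanity check before committing compute to fine-tuning:

```python
# Sketch: compare candidate model sizes before fine-tuning.
# Assumes the Hugging Face transformers library is installed.
from transformers import AutoModelForCausalLM

def count_parameters(model_name: str) -> int:
    """Load a checkpoint and return its total parameter count."""
    model = AutoModelForCausalLM.from_pretrained(model_name)
    return sum(p.numel() for p in model.parameters())

# "gpt2" is the 124M-parameter base checkpoint; it downloads quickly
# and fits comfortably in memory on a laptop.
print(f"gpt2: {count_parameters('gpt2'):,} parameters")

# Larger checkpoints such as meta-llama/Llama-2-70b-hf are gated and
# need far more memory, so the call is shown here only as a comment:
# print(count_parameters("meta-llama/Llama-2-70b-hf"))
```

Running the check on the small checkpoint first is deliberate: it confirms the tooling works before any multi-gigabyte download, and the parameter count gives a rough proxy for the memory and compute the fine-tuning run will need.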
Pretraining
The pretraining dataset forms the foundation of the model's initial knowledge and greatly influences how well it understands user prompts and produces outputs. An LLM trained on a diverse and extensive dataset, such as internet text, will have a broad knowledge base, making it versatile across various topics. One such dataset is the Common Crawl dataset, created by crawling and archiving web pages from across the public internet.
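To get a feel for the kind of web text such a model has seen, the sketch below streams a few documents from C4, a cleaned subset of Common Crawl hosted on the Hugging Face Hub (the dataset name and library usage here are illustrative assumptions, not part of the original text):

```python
# Sketch: inspect a few samples of Common Crawl-derived pretraining data.
# Assumes the Hugging Face datasets library is installed.
from datasets import load_dataset

# streaming=True reads records on the fly instead of downloading
# the full multi-terabyte corpus.
c4 = load_dataset("allenai/c4", "en", split="train", streaming=True)

for i, record in enumerate(c4):
    # Each record is a web document with "text" and "url" fields.
    print(record["url"])
    print(record["text"][:200], "\n")  # first 200 characters
    if i == 2:
        break
```

Skimming even a handful of documents like this makes the trade-off visible: broad web corpora give the model wide coverage, but if our task lives in a narrow domain, a model pretrained on more specialized data may be a better starting point.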