Understanding Retrieval and Generative Models
Learn about AI's evolution from retrieval to generative models and how it relates to RAG.
We’ve entered a world where our computers are getting better and better at chatting with us. Initially, these machines were pretty basic: they followed strict rules, much like a cook following a recipe to the last letter. If we asked for a cookie, we got a cookie, but only if a cookie recipe already existed for them to follow.
However, as time passed, things got more interesting. We started teaching these machines not just to follow recipes but to cook something new. This leap came from some highly effective ideas about how to build their “brains,” which we call models. Now, two main types of models spice things up.
First, we have retrieval models such as Term Frequency-Inverse Document Frequency (TF-IDF) or Best Match 25 (BM25). Imagine a well-organized librarian who knows exactly where every book is placed and which ones contain the information you need. These models are adept at sifting through vast amounts of data to find and retrieve the most relevant information for the task at hand. They help by pulling data from a knowledge database to provide context or facts necessary for generating accurate responses or insights.
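To make this concrete, here is a minimal sketch of TF-IDF retrieval using scikit-learn; the tiny document collection and query below are invented purely for illustration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# A tiny, made-up knowledge base the "librarian" can search.
documents = [
    "BM25 is a ranking function used by search engines to score documents.",
    "Generative models such as GPT produce new text from a prompt.",
    "TF-IDF weighs terms by how rare they are across the whole corpus.",
]

query = "How does TF-IDF weigh terms?"

# Turn the documents and the query into TF-IDF vectors.
vectorizer = TfidfVectorizer()
doc_vectors = vectorizer.fit_transform(documents)
query_vector = vectorizer.transform([query])

# Rank documents by cosine similarity to the query and print the best match.
scores = cosine_similarity(query_vector, doc_vectors).flatten()
best = scores.argmax()
print(f"Best match (score {scores[best]:.2f}): {documents[best]}")
```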
Then, we have generative models like the Generative Pre-trained Transformer (GPT). These are the Leonardo da Vincis of the AI world. Rather than merely retrieving information, they use their training to generate new content creatively. Given a prompt, they can compose a comprehensive answer; given just a hint of an idea, they can expand it into a detailed narrative.
So, as we go about making these machines smarter, we’re teaching them to both retrieve crucial information and create new wonders, pushing the boundaries of what we used to think was possible. However, before we can truly understand what retrieval-augmented generation is, we need to take a closer look at the two types of models we just discussed.
What are retrieval models?
Retrieval models are specialized in navigating through vast amounts of data to find information that is relevant to a specific query or context. Unlike models that categorize or classify data based on learned examples, retrieval models focus on the precision of matching query criteria with the data they have access to. For example, in a question-answering system, a retrieval model sifts through a database of information to fetch details that best answer the user’s question.
In text and image processing, these models play a crucial role. For text, they analyze the content within a large corpus and identify passages that most closely relate to the query at hand. Similarly, in image retrieval, these models analyze visual content, enabling the search and retrieval of images that are most relevant to a given query. This can involve recognizing objects, colors, patterns, or even scenes within a collection of images. The effectiveness of retrieval models lies in their ability to accurately pull from the right sources, whether textual or visual, ensuring that the generative models that follow are working with the most pertinent and contextually appropriate information.
One of the primary strengths of retrieval models is their efficiency in handling and extracting useful data from large datasets or databases. They enable systems to scale by managing vast amounts of information without the need to deeply understand or generate new content independently. However, retrieval models rely heavily on the quality and structure of the data they access; their performance is contingent on the relevance and accuracy of the information stored in the databases they query.
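As a small illustration of this kind of lookup, here is a minimal sketch using the open-source rank_bm25 package; the mini corpus and query are made up, and a real system would search a much larger document store.

```python
from rank_bm25 import BM25Okapi  # pip install rank-bm25

# A made-up mini knowledge base; in practice this would be a large document store.
corpus = [
    "The Eiffel Tower is located in Paris and was completed in 1889.",
    "BM25 scores documents by term frequency, document length, and term rarity.",
    "Retrieval-augmented generation combines a retriever with a generator.",
]

# BM25 works over tokenized text; simple whitespace tokenization is enough here.
tokenized_corpus = [doc.lower().split() for doc in corpus]
bm25 = BM25Okapi(tokenized_corpus)

query = "when was the eiffel tower completed"
scores = bm25.get_scores(query.split())

# Return the highest-scoring passage as the retrieved context.
best = max(range(len(corpus)), key=lambda i: scores[i])
print(f"Retrieved (score {scores[best]:.2f}): {corpus[best]}")
```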
What are generative models?
Generative models describe how data is generated by learning the joint probability distribution of “input features” and “output labels.” Unlike retrieval models, which focus on identifying and fetching relevant information from large datasets, generative models attempt to understand and replicate the underlying data generation process.
For instance, in image generation, a generative model can learn from a collection of animal photos and then generate new, realistic-looking animals that are nonetheless distinct from any specific animal in the training dataset. This capability arises from the model’s understanding of general features such as textures, shapes, and colors that define “animal-ness” rather than merely distinguishing between predefined categories like “cats” and “dogs.”
As we can see, unlike retrieval models that search and identify relevant data from a large corpus for a given query, generative models learn the probability distribution of the data features, enabling them to produce new examples. However, these models can sometimes yield unexpected outputs, such as generating a hybrid image like a whale with rabbit ears.
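To see on a tiny scale what it means to learn a distribution and then sample new examples from it, here is a deliberately simplified sketch of a character-level bigram model; the handful of training words and the model itself are toy assumptions, not how modern generative models are actually built.

```python
import random
from collections import defaultdict

# Tiny "training set" of animal names; a real model would learn from far more data.
words = ["cat", "bat", "rat", "cow", "crow", "owl"]

# Learn the distribution of which character tends to follow which (a bigram model).
transitions = defaultdict(list)
for word in words:
    padded = "^" + word + "$"          # ^ marks the start, $ marks the end
    for a, b in zip(padded, padded[1:]):
        transitions[a].append(b)

def sample_word(max_len=10):
    """Generate a brand-new string by sampling from the learned distribution."""
    out, ch = [], "^"
    for _ in range(max_len):
        ch = random.choice(transitions[ch])
        if ch == "$":
            break
        out.append(ch)
    return "".join(out)

random.seed(0)
print([sample_word() for _ in range(5)])  # novel (and sometimes odd) "animals"
```

The sampled strings are new combinations that never appeared in the training list, which is exactly the behavior, on a miniature scale, that produces both creative outputs and the occasional whale with rabbit ears.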
In natural language processing, generative models can similarly compose coherent and contextually relevant text. ChatGPT, which is built on the GPT family of models, is a prime example. After training on large text corpora, these models can produce new sentences, paragraphs, or even entire articles that resemble the style and content of the training material. They achieve this by learning the linguistic structures, vocabulary usage, and stylistic elements present in the training text.
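As a hedged sketch of what this looks like in practice, the Hugging Face transformers library exposes text generation through a simple pipeline; the small gpt2 checkpoint, the prompt, and the generation settings below are illustrative choices, and the sampled output will vary.

```python
from transformers import pipeline, set_seed  # pip install transformers

# Load a small pretrained generative language model (GPT-2) for demonstration.
generator = pipeline("text-generation", model="gpt2")
set_seed(42)  # make the sampled continuation reproducible

prompt = "Retrieval-augmented generation is useful because"
outputs = generator(prompt, max_new_tokens=40, num_return_sequences=1)

# The model continues the prompt based on patterns learned during pretraining.
print(outputs[0]["generated_text"])
```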
Generative models vs. retrieval models
While it may seem that generative models are superior in many ways, they are not necessarily better than retrieval models in every aspect. Generative models come with several limitations when compared to their retrieval counterparts, mainly due to differences in their underlying approaches and objectives. Consider the following key points:
Complexity and computational cost: Generative models often involve learning the joint probability distribution of inputs and outputs, which can be computationally complex and resource-intensive. This complexity can result in longer training times and higher computational demands, especially with high-dimensional data.
Precision in specific information retrieval: For tasks where precision in retrieving specific information is paramount, retrieval models usually provide better performance. This is because retrieval models are designed to fetch and provide the most relevant data without the need to generate new content. They focus on the accuracy of the information presented, which is crucial in applications like question answering and data extraction.
Transparency: Generative models often function as black boxes, making it challenging to understand how specific outputs are produced. This lack of transparency can be problematic in applications that require explainability. In contrast, retrieval models are typically more transparent because the process of matching queries to documents is more straightforward and interpretable, and users can easily trace the retrieved information back to its source.
Can these models be combined?
The use of retrieval and generative models is not mutually exclusive; in fact, combining them can harness the strengths of each to improve overall performance in many applications. Retrieval-augmented generation (RAG) offers a fascinating glimpse into how the ideas behind retrieval and generative models can be integrated, albeit in a subtle manner.
Rather than explicitly combining the two approaches within a single model, RAG leverages the strengths of both retrieval and generation. It uses generative capabilities to produce new content or responses based on learned patterns and contextual understanding. Simultaneously, it incorporates retrieval mechanisms, accessing a knowledge base to supply relevant information that informs and grounds the generation process. This dual approach allows RAG to produce more accurate and contextually relevant outputs.
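Putting the pieces together, here is a minimal sketch of the RAG pattern under illustrative assumptions: retrieve the most relevant passage with TF-IDF, then feed it to a generative model as context. The knowledge base, prompt template, and choice of the gpt2 checkpoint are stand-ins for demonstration rather than a production recipe.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
from transformers import pipeline

# 1) A made-up knowledge base the retriever can draw from.
knowledge_base = [
    "RAG systems pair a retriever with a generator to ground answers in data.",
    "TF-IDF and BM25 are classic retrieval models used to rank documents.",
    "GPT-style models generate text by predicting the next token.",
]

def retrieve(query, k=1):
    """Retrieval step: return the top-k most relevant passages for the query."""
    vectorizer = TfidfVectorizer()
    doc_vecs = vectorizer.fit_transform(knowledge_base)
    query_vec = vectorizer.transform([query])
    scores = cosine_similarity(query_vec, doc_vecs).flatten()
    top = scores.argsort()[::-1][:k]
    return [knowledge_base[i] for i in top]

def generate(query):
    """Generation step: condition a small language model on the retrieved context."""
    context = " ".join(retrieve(query))
    prompt = f"Context: {context}\nQuestion: {query}\nAnswer:"
    generator = pipeline("text-generation", model="gpt2")
    result = generator(prompt, max_new_tokens=40, num_return_sequences=1)
    return result[0]["generated_text"]

print(generate("What do RAG systems combine?"))
```

In practice, the TF-IDF retriever is often replaced with dense embeddings and a vector database, and the small demonstration model with a larger instruction-tuned one, but the retrieve-then-generate structure stays the same.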
Educative Byte: The integration of retrieval models with generative models in RAG systems mirrors human cognitive processes. When we create something new, our brains often retrieve pieces of related information from memory, which we then combine and transform into new ideas.