Modular RAG

Learn about modular RAG, which enhances information retrieval and processing through specialized modules.

In the previous lessons, we explored naive RAG and advanced RAG, two RAG approaches that enhance the capabilities of LLMs. While naive RAG offers a simple and effective starting point, it can be limited by retrieval accuracy and LLM integration. Advanced RAG addresses these issues by introducing pre-retrieval, retrieval, and post-retrieval optimization strategies.

Now, let’s explore modular RAG, the most flexible and adaptable architecture within the RAG family. In this lesson, we will briefly discuss modular RAG, as the focus of this course is on advanced RAG and its optimization techniques.

Modular RAG and its specialized modules

Modular RAG extends the core functionality of naive and advanced RAG by incorporating a range of specialized modules. These modules work together to improve information retrieval and processing, resulting in more detailed and accurate responses:

  • Search module: This module adapts to specific situations, enabling direct searches across various data sources like search engines, databases, and knowledge graphs. It can even leverage code and query languages generated by the LLM for targeted information retrieval.

  • RAG-Fusion: This module addresses the limitations of traditional single-query search with a multi-query strategy. It expands the user query into several variants that approach the question from different perspectives, runs the vector searches in parallel, and re-ranks the combined results to surface both explicit and less obvious information in the data.

  • Memory module: Enhances the system by leveraging the LLM’s memory to guide retrieval. This creates an unbounded memory pool that aligns text more closely with data distribution through iterative self-improvement.

  • Routing module: This module navigates through diverse data sources, selecting the optimal pathway for a query. Depending on the query, that pathway may involve summarization, searching a specific database, or merging information streams from several sources.

  • Predict module: This module aims to reduce redundancy and noise by generating relevant and accurate context directly through the LLM instead of retrieving it.

  • Task adapter module: This module customizes RAG for different downstream tasks. For zero-shot tasks (those with no task-specific training data), it automatically retrieves suitable prompts. For few-shot tasks, it builds task-specific retrievers by generating queries from a handful of examples.
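
To make the re-ranking step of RAG-Fusion more concrete, here is a minimal sketch that assumes the results of the parallel searches are merged with reciprocal rank fusion (RRF), a common choice for this step. The document IDs and result lists are hypothetical, and the constant k = 60 is a conventional default rather than a value from this lesson. A real system would get these rankings from a vector store, one per generated query variant.

```python
from collections import defaultdict


def reciprocal_rank_fusion(ranked_lists, k=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document earns 1 / (k + rank) from every list it appears in,
    so documents ranked highly by several query variants rise to the top.
    """
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


# Hypothetical search results, one ranked list per generated query variant.
results_per_query = [
    ["doc_a", "doc_b", "doc_c"],
    ["doc_b", "doc_a", "doc_d"],
    ["doc_b", "doc_c", "doc_e"],
]

fused = reciprocal_rank_fusion(results_per_query)
print(fused[:3])  # doc_b appears in all three lists, so it ranks first
```

Because RRF uses only rank positions, it can combine results from searches whose raw similarity scores are not directly comparable.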
