...

Composition Strategies: Chain Constructor Implementation

Learn how to implement chains using LangChain.

In this lesson, we will develop a chatbot that responds to a user query using the methods and functions we have learned so far. We'll build a RAG system with LangChain: load a web page, extract and clean its text, split the text into chunks, embed the chunks and upload them to a vector store, set up a retriever, create a prompt using a prompt template, connect a state-of-the-art LLM, compose the steps with piped chains, and generate a response to the user's query.

The coding process

Let's understand the code before we dive into the actual implementation.

# Suppress warnings
import warnings
warnings.filterwarnings("ignore", category=UserWarning)
warnings.filterwarnings("ignore", category=FutureWarning)
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
# Disable parallelism
os.environ["TOKENIZERS_PARALLELISM"] = "false"
# Set Groq API Key
groq_api_key = os.environ["GROQ_API_KEY"]

In this code, we perform the following steps:

  • Lines 1–10: Warnings and TensorFlow log noise are suppressed, tokenizer parallelism is disabled, and the Groq API key is read from the GROQ_API_KEY environment variable. This key will be used later to access the AI model.
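Reading the key with `os.environ["GROQ_API_KEY"]` raises a bare `KeyError` if the variable is missing. A common pattern is to fail fast with a clearer message. The `require_env` helper below is a hypothetical sketch, not part of LangChain, and `DEMO_GROQ_API_KEY` is a placeholder name used only for this demo:

```python
import os

# Hypothetical helper (not part of LangChain): read a required environment
# variable, failing fast with a clear error instead of a bare KeyError.
def require_env(name: str) -> str:
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"{name} is not set; export it before running the chatbot")
    return value

# For this sketch we inject a placeholder value; in practice the key
# would be exported in the shell before launching the script.
os.environ["DEMO_GROQ_API_KEY"] = "placeholder-key"
print(require_env("DEMO_GROQ_API_KEY"))  # → placeholder-key
```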

from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough, RunnableParallel
from langchain_core.output_parsers import StrOutputParser
from langchain_community.document_loaders import WebBaseLoader
from langchain_community.document_transformers import BeautifulSoupTransformer
from langchain_groq import ChatGroq
import re

In this code, we perform the following steps:

  • Line 1: RecursiveCharacterTextSplitter is used to split our text into manageable chunks.

  • Line 2: HuggingFaceEmbeddings generates embeddings for our text chunks to facilitate semantic understanding and retrieval operations.

  • Line 3: FAISS manages a vector store that allows for efficient retrieval of text based on similarity, which is essential for handling user queries and providing relevant information.

  • Line 4: ChatPromptTemplate allows us to format our engineered prompt into a template that our LLM model can understand and follow.

  • Line 5: RunnablePassthrough and RunnableParallel are used to compose steps in a chain: RunnablePassthrough forwards its input unchanged, and RunnableParallel runs several runnables on the same input and collects their outputs.

  • Line 6: StrOutputParser is used for parsing and formatting our LLM response.

  • Line 7: WebBaseLoader is used to fetch content from a specified URL.

  • Line 8: BeautifulSoupTransformer is applied to extract readable text from the loaded HTML content.

  • Line 9: ChatGroq serves as the interface to Groq’s powerful language models, enabling the chatbot to generate responses based on the embedded and retrieved text data.

  • Line 10: re (regular expressions) is used to clean and format the text by removing HTML-like tags and unnecessary backslashes, reducing multiple quotes and white space.
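The cleanup described for `re` in the last bullet can be sketched as a small standalone function. This is an illustrative version using the standard library only; the exact patterns used later in the lesson may differ:

```python
import re

# Hypothetical cleanup mirroring the steps described above: strip HTML-like
# tags, drop stray backslashes, collapse repeated quotes and whitespace.
def clean_text(raw: str) -> str:
    text = re.sub(r"<[^>]+>", " ", raw)   # remove HTML-like tags
    text = text.replace("\\", "")         # remove unnecessary backslashes
    text = re.sub(r'"{2,}', '"', text)    # reduce runs of quotes to one
    text = re.sub(r"\s+", " ", text)      # collapse whitespace
    return text.strip()

sample = '<p>Generative\\ AI  is ""poised"" for   takeoff</p>'
print(clean_text(sample))  # → Generative AI is "poised" for takeoff
```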

# Define the website to load
url = "https://elnion.com/2024/05/01/generative-ai-poised-for-takeoff-but-organisational-transformation-holds-the-key/"
# Create an instance of WebBaseLoader and load the web page
loader = WebBaseLoader(url)
webpage = loader.load()
# Use Beautiful Soup to extract the page content
bs_transformer = BeautifulSoupTransformer()
docs_transformed = bs_transformer.transform_documents(webpage)
docs_content = docs_transformed[0].page_content
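To see conceptually what BeautifulSoupTransformer does in the block above, here is a minimal standard-library sketch that walks HTML and keeps only the readable text. This is an illustration of the idea, not the actual LangChain implementation, and the sample HTML is made up:

```python
from html.parser import HTMLParser

# Minimal sketch of "extract readable text from HTML": collect the text
# nodes encountered while parsing and join them into one string.
class TextExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.parts.append(text)

    def text(self):
        return " ".join(self.parts)

html = "<html><body><h1>Generative AI</h1><p>Poised for takeoff.</p></body></html>"
parser = TextExtractor()
parser.feed(html)
print(parser.text())  # → Generative AI Poised for takeoff.
```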
...