...
Retrieval Strategies: Common Document Loaders
Learn how to use LangChain offline document loaders.
Document loaders
Document loaders in LangChain play an essential role in integrating diverse data sources into chatbot frameworks and other AI applications.
These tools facilitate the import and processing of structured and unstructured data from an array of document types, including CSV, JSON, text, Microsoft Office formats, and PDFs. By providing easy access to such a variety of data inputs, LangChain expands the potential use cases for developers.
Types of document loaders
We will now experiment with a few of the many LangChain document loaders.
CSV loader
The code below demonstrates how to load and display data from comma-separated values (CSV) files using the CSVLoader class from the langchain_community library. CSV files are a common format for storing tabular data, where each line represents a row in the table and fields are separated by commas. Loading CSV data this way lets users query and analyze it in natural language through a chatbot.
Run the code below to try the loader:
# Import libraries
from langchain_community.document_loaders import CSVLoader

# Define the path to the file and load the file
file_path = '/usercode/GDP.csv'
loader = CSVLoader(file_path=file_path)
data = loader.load()

# Print the data
for content in data:
    print(content.page_content)
    print('-'*80)
In this code, we perform the following steps:
Lines 1–2: We import CSVLoader from the langchain_community.document_loaders module.
Lines 4–7: We define the path to the required file and use CSVLoader to load it. Calling loader.load() returns the loaded data, which we store in the variable data.
Lines 9–12: We iterate through the loaded data and print the page content of each item, followed by a separator line.
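To make the shape of the loaded data more concrete, the sketch below mimics the idea of turning each CSV row into one document with a page_content string, using only the Python standard library. It is not LangChain's implementation: SimpleDocument, load_csv_rows, the metadata keys, and the sample data are hypothetical, introduced only for illustration, and the real CSVLoader output may differ in detail.

```python
import csv
import io
from dataclasses import dataclass, field


@dataclass
class SimpleDocument:
    """Hypothetical stand-in for the document objects a loader returns."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def load_csv_rows(text, source="<memory>"):
    """Turn each CSV row into one SimpleDocument of "column: value" lines."""
    reader = csv.DictReader(io.StringIO(text))
    documents = []
    for row_number, row in enumerate(reader):
        # Join every field of the row into a readable block of text
        content = "\n".join(f"{key}: {value}" for key, value in row.items())
        documents.append(
            SimpleDocument(
                page_content=content,
                metadata={"source": source, "row": row_number},
            )
        )
    return documents


# Made-up sample data (assumption: NOT the contents of GDP.csv)
sample = "country,gdp\nA,1.5\nB,2.0\n"
docs = load_csv_rows(sample)

# Print the data the same way the lesson code does
for doc in docs:
    print(doc.page_content)
    print('-' * 80)
```

Keeping one document per row means each record can later be retrieved on its own, while the metadata records where the row came from.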
JSON loader
The code below demonstrates how to load and display data from JSON (JavaScript Object Notation) files, specifically using the JSONLines format. JSON is a widely used format for storing and ...