Ever wondered how to make your OpenAI API applications smart enough to search the web? While OpenAI’s ChatGPT comes with built-in browsing capabilities, the OpenAI API doesn’t offer this out of the box. This can leave your applications without real-time information, which might disappoint your users.
But don’t worry! You can bridge this gap by integrating a web search API, like Google Serper Search, into your application’s backend. We’ll explain how you can harness search results and integrate them into your model’s prompt to create better, more current responses.
Key takeaways:
ChatGPT can browse the web, but the OpenAI API does not have built-in web access. We can bridge this gap by integrating a web search API like Google Serper.
LangChain can help streamline the process of combining web search APIs with the OpenAI API, saving development time and effort.
Let’s start with a basic question you might be wondering: “Doesn’t ChatGPT browse the web? Why can’t I do the same with the OpenAI API?” Here’s the breakdown. When used on OpenAI’s platform, ChatGPT can browse the web to fetch real-time information and provide up-to-date answers. This makes sense, as in a conversation, you’d expect the most current information, like today’s news or stock prices.
Note: This feature was a game-changer. When it was introduced, OpenAI’s CEO, Sam Altman, took to X (formerly Twitter) to highlight its impact. He simply wrote, "We are so back", showing how OpenAI had once again shifted the landscape.
However, the OpenAI API is a different story. While it gives you access to powerful language models, it doesn’t include built-in web browsing or search functionality. The models are trained on data available up to a specific cutoff date, so they don’t have information about events or updates that occur after that time.
Why is this the case? Mainly for security, efficiency, and control. Allowing direct web access from an API could introduce significant risks—like unintended data scraping, exposure to malicious content, or misuse of the service. By not providing direct internet access, OpenAI ensures that developers have full responsibility and control over how external data is fetched and used in their applications. However, this can also lead to several challenges:
Outdated information: Users seeking current facts or news won’t get the answers they need, which can be frustrating and diminish trust in your application.
Limited functionality: Applications that require up-to-date data, like news aggregators or financial tools, can’t function properly without real-time information.
Poor user experience: In today’s fast-paced world, users expect immediate and accurate responses. An AI that can’t provide that feels less intelligent and less useful.
Competitive disadvantage: Other applications leveraging real-time data will outshine yours, making it harder to retain users or stand out in the market.
However, here’s the good news: there are specialized search APIs dedicated to efficient web browsing. These services are optimized to perform web searches swiftly and securely. We’ll walk through the process of integrating these search APIs with the OpenAI API. As a refresher, please take a look at the following table before proceeding further.
| Aspect | ChatGPT | OpenAI API |
| --- | --- | --- |
| Definition | A conversational AI assistant developed by OpenAI, accessible via web interface or app. | A set of programmable interfaces allowing developers to access OpenAI's models programmatically. |
| Access method | Accessed through OpenAI's website or the ChatGPT app; no coding required. | Accessed via API calls in code; requires programming knowledge. |
| Customization | Limited to predefined settings; users cannot fine-tune the model. | Allows customization, including fine-tuning models with custom data. |
| Rate limits | Subject to usage limits, especially on the free tier. | Rate limits can be adjusted; higher limits available with increased usage and approval. |
| Data privacy | Conversations may be stored and reviewed by OpenAI for policy compliance. | Developers can opt out of data logging; API data is not used to train models by default. |
| Fine-tuning capability | Not available to users. | Supports fine-tuning of certain models with custom data. |
When integrating web search capabilities into your OpenAI API application, choosing the right web search API is crucial. Several options are available, each with its own features, pricing, and documentation. It’s important to review the latest documentation for any API you’re considering to ensure it meets your specific needs. Two popular choices for web search APIs are Google Serper API and Tavily.
Google Serper API allows developers to perform Google searches in their programs. It provides structured JSON results from Google Search after a search query is given, including organic results, knowledge graphs, and more. This API is known for its reliability and the richness of its data, making it suitable for applications that require detailed search information.
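To make "structured JSON results" concrete, here is a minimal sketch of what a raw Serper call looks like over HTTP, using only the Python standard library. The endpoint URL, the `X-API-KEY` header, and the `q`/`num` request fields reflect Serper's public documentation at the time of writing; double-check them against the current docs before relying on this. `YOUR_SERPER_API_KEY` is a placeholder:

```python
import json
import urllib.request

SERPER_ENDPOINT = "https://google.serper.dev/search"  # Serper's search endpoint

def build_serper_request(query: str, api_key: str, num_results: int = 5):
    """Construct an HTTP POST request for a Serper search; nothing is sent yet."""
    payload = json.dumps({"q": query, "num": num_results}).encode("utf-8")
    return urllib.request.Request(
        SERPER_ENDPOINT,
        data=payload,
        headers={"X-API-KEY": api_key, "Content-Type": "application/json"},
        method="POST",
    )

request = build_serper_request("What is Educative?", "YOUR_SERPER_API_KEY")
print(request.full_url)                # https://google.serper.dev/search
print(json.loads(request.data)["q"])   # What is Educative?
```

Sending the request with `urllib.request.urlopen(request)` (and a real key) returns a JSON body whose top-level keys include `organic` for the organic results and, when available, `knowledgeGraph`.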
Tavily is another web search API designed for efficiency and ease of use. It offers quick and straightforward search capabilities, providing essential data without unnecessary overhead. Tavily focuses on delivering fast responses, which can be beneficial for applications where performance is a priority.
When deciding which web search API to use, consider the following factors:
Features needed: Specify whether detailed search results with rich metadata are required, or if basic search snippets are sufficient.
Cost: Evaluate the pricing models to ensure they fit within your budget, considering both per-request costs and any monthly fees.
Ease of integration: Look for APIs that offer clear documentation and client libraries in your preferred programming language to simplify development.
Performance: Consider the API’s response times and reliability, especially if your application demands a quick turnaround.
By carefully assessing these aspects and consulting the most recent documentation, you can select the web search API that best aligns with your application’s goals and technical requirements.
You might be thinking, “This seems really difficult. How would I combine the two different APIs? Is there an easier way to do this rather than creating functions from scratch?” Absolutely! Creating functions from scratch can be tedious and time-consuming. This is where LangChain comes into play.
LangChain offers powerful tools that simplify the process of integrating web search APIs like Serper into your OpenAI application. By using LangChain, you can save time, reduce code complexity, and focus on building the core features of your application. Let’s walk through how to use LangChain to integrate the Serper Google Search API into your application.
First, install LangChain and the necessary dependencies:
```shell
pip install langchain langchain_openai langchain_community
```
Ensure you have your API keys ready:
OpenAI API Key: Obtain from your OpenAI account.
Serper API Key: Sign up at serper.dev to get your API key.
Set them as environment variables:
```shell
export OPENAI_API_KEY=your-openai-api-key
export SERPER_API_KEY=your-serper-api-key
```
Import the necessary modules and initialize LangChain components:
```python
import os
from langchain_community.utilities import GoogleSerperAPIWrapper
from langchain_openai import OpenAI
from langchain.agents import initialize_agent, Tool
from langchain.agents import AgentType
from langchain.callbacks import get_openai_callback

# Set API keys
os.environ["OPENAI_API_KEY"] = os.getenv('OPENAI_API_KEY')
os.environ["SERPER_API_KEY"] = os.getenv('SERPER_API_KEY')

# Initialize the language model
llm = OpenAI()

# Initialize the Serper API wrapper
search = GoogleSerperAPIWrapper()
```
In the above code:
- Line 1: Imports the built-in `os` module, which provides a way of using operating system-dependent functionality. We’ll use it to access environment variables for our API keys.
- Line 2: Imports the `GoogleSerperAPIWrapper` utility from LangChain. This wrapper simplifies interactions with the Serper Google Search API, handling the API calls and parsing responses for us.
- Line 3: Imports the `OpenAI` class from LangChain’s language model module. This class allows us to interact with OpenAI’s language models seamlessly.
- Line 4: Imports `initialize_agent`, a function to create and set up an agent with specified tools and configurations. Also imports `Tool`, a class used to define tools (like web search functions) that the agent can use during its operation.
- Line 5: Imports the `AgentType` enumeration, which lists the different types of agents available in LangChain. We’ll specify the type when initializing our agent later on.
- Line 6: Imports `get_openai_callback`, which is used for tracking OpenAI API usage and costs.
- Lines 9–10: Retrieve the `OPENAI_API_KEY` and `SERPER_API_KEY` from the environment variables and ensure they’re available in the `os.environ` dictionary for LangChain to access.
- Line 13: Creates an instance of the `OpenAI` language model. The `llm` variable now holds our language model instance, which we’ll use later to generate responses.
- Line 16: Creates an instance of the `GoogleSerperAPIWrapper` class. The `search` variable now holds an instance of our web search tool. We’ll use it as a tool that the agent can call to fetch up-to-date information from Google Search.
Create a tool for web search and initialize the agent:
```python
tools = [
    Tool(
        name="Search",
        func=search.run,
        description="Useful for when you need to answer questions about current events or the current state of the world. Input should be a search query."
    )
]

react_agent = initialize_agent(
    tools,
    llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True,
    handle_parsing_errors=True,
    max_iterations=5
)
```
In the above code:
- Lines 1–7: We create a list called `tools` that contains one `Tool` object. The `Tool` is named `"Search"` and is linked to the `search.run` function, which performs web searches. The `description` informs the agent that this tool is useful for answering questions about current events or the current state of the world. This setup allows the agent to decide when to use the `search.run` function during its reasoning process.
- Lines 9–16: We create an agent using the `initialize_agent` function from LangChain. The agent is configured with:
  - Tools: The list of tools it can use (in this case, our web search tool).
  - Language model: The `llm` instance representing our OpenAI language model.
  - Agent type: `AgentType.ZERO_SHOT_REACT_DESCRIPTION`, which equips the agent with reasoning and action capabilities without needing prior examples.
  - Verbose mode: Enabled to provide detailed output during the agent’s operation.
  - Error handling: `handle_parsing_errors=True` allows the agent to gracefully handle any parsing errors.
  - Max iterations: `max_iterations=5` limits the number of reasoning steps to prevent infinite loops.
After initializing the agent, you can test it with a query:
```python
# Run the agent with a query
query = "What is Educative?"
print(f"Query: {query}\n")

with get_openai_callback() as cb:
    try:
        result = react_agent.run(query)
        print(f"\nFinal Result: {result}")
        print(f"\nTotal Tokens: {cb.total_tokens}")
        print(f"Prompt Tokens: {cb.prompt_tokens}")
        print(f"Completion Tokens: {cb.completion_tokens}")
        print(f"Total Cost (USD): ${cb.total_cost}")
    except Exception as e:
        print(f"An error occurred: {e}")
```
In the above code, we’re running the agent with a user query and tracking API usage:
- Lines 2–3: We define the query “What is Educative?” and print it.
- Lines 5–14: We use a context manager to track OpenAI API usage and handle exceptions:
  - `with get_openai_callback() as cb:` begins tracking token usage and costs.
  - `try` attempts to run the agent with the query.
  - `result = react_agent.run(query)` executes the agent to get the answer. The print statements in the block output the final result and token usage statistics.
  - `except Exception as e` catches any exceptions and prints an error message if something goes wrong.
The complete code discussed in this Educative Answer can be found in the widget below. You can run your own custom queries by adding your API keys and modifying the query variable. This allows you to experiment and see the agent in action with different questions.
```python
from langchain_community.utilities import GoogleSerperAPIWrapper
from langchain_openai import OpenAI
from langchain.agents import initialize_agent, Tool
from langchain.agents import AgentType
from langchain.callbacks import get_openai_callback
import os

# Set environment variables
os.environ["SERPER_API_KEY"] = "ADD YOUR API KEY HERE"
os.environ["OPENAI_API_KEY"] = "ADD YOUR API KEY HERE"

# Initialize OpenAI language model
llm = OpenAI(temperature=0)

# Initialize Google Serper API wrapper
search = GoogleSerperAPIWrapper()

# Define tools for the agent
tools = [
    Tool(
        name="Search",
        func=search.run,
        description="Useful for when you need to answer questions about current events or the current state of the world. Input should be a search query."
    )
]

# Initialize the agent with the ReAct agent type
react_agent = initialize_agent(
    tools,
    llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True,
    handle_parsing_errors=True,
    max_iterations=5
)

# Run the agent with a query
query = "What is Educative?"
print(f"Query: {query}\n")

with get_openai_callback() as cb:
    try:
        result = react_agent.run(query)
        print(f"\nFinal Result: {result}")
        print(f"\nTotal Tokens: {cb.total_tokens}")
        print(f"Prompt Tokens: {cb.prompt_tokens}")
        print(f"Completion Tokens: {cb.completion_tokens}")
        print(f"Total Cost (USD): ${cb.total_cost}")
    except Exception as e:
        print(f"An error occurred: {e}")
```
By leveraging LangChain’s abstractions and utilities, we’ve created a sophisticated agent with relatively minimal code. That’s it! We have successfully integrated web search functionality with our OpenAI API! You can even change the query being passed in the above code to see how it works with other queries, too!
You might be thinking, “Why did we introduce an agent into the mix? Can’t we just call the web search function directly?” Well, consider the agent as the conductor of an orchestra—it’s not just about playing individual instruments (or functions) but about knowing when and how to use them to create harmonious results. By using an agent, we’re empowering our application to make intelligent decisions on-the-fly. The agent analyzes the user’s query, decides if it needs up-to-date information, and then determines whether to invoke the web search tool we’ve provided.
This approach makes our application more dynamic and efficient. Instead of hard-coding logic for every possible scenario, the agent autonomously figures out the best way to answer a question. It can break down complex queries into manageable sub-questions, leverage the language model’s knowledge, and utilize external tools as needed. In essence, using an agent allows our application to think more like a human—assessing what’s known, what’s unknown, and how to bridge that gap—all while keeping our code clean and maintainable.
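To make the conductor analogy concrete, here is a toy sketch of the reason-act-observe loop an agent runs. The `fake_llm` and `fake_search` functions below are stand-ins invented purely for illustration; LangChain's real ReAct agents implement a much richer version of this loop, with the language model itself deciding each step:

```python
def fake_llm(prompt: str) -> str:
    """Stand-in for a language model: decides whether a web search is needed."""
    if any(word in prompt for word in ("today", "current", "latest")):
        return "ACTION: Search"
    return "ANSWER: I can answer this from my training data."

def fake_search(query: str) -> str:
    """Stand-in for the web search tool."""
    return f"[search results for: {query}]"

def run_agent(query: str) -> str:
    """One simplified reason-act-observe cycle."""
    decision = fake_llm(query)
    if decision.startswith("ACTION: Search"):
        # The agent chose the tool; observe its output and compose an answer
        observation = fake_search(query)
        return f"Based on {observation}, here is the answer."
    # No tool needed; answer directly from the model's own knowledge
    return decision.removeprefix("ANSWER: ").strip()

print(run_agent("What is Python?"))
print(run_agent("What is the latest AI news?"))
```

The first query is answered directly, while the second triggers the search tool, mirroring how the real agent only pays for a web search when the query actually needs fresh information.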
As you integrate web search capabilities into your OpenAI API applications, there are a few best practices to consider.
Always handle your API keys securely. Store them as environment variables rather than hard-coding them into your scripts. This not only protects your sensitive information but also makes your codebase cleaner and more maintainable.
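As a small illustration of this fail-fast pattern, here is a sketch of a helper that reads a required key from the environment and raises a clear error if it is missing. The variable names `DEMO_SEARCH_KEY` and `DEMO_MISSING_KEY` are hypothetical, used only for this demo:

```python
import os

def require_env(name: str) -> str:
    """Read a required environment variable, failing fast with a clear message."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(
            f"Missing environment variable {name}. "
            "Set it in your shell or a .env file; never hard-code it."
        )
    return value

# DEMO_SEARCH_KEY is a hypothetical variable set here only for illustration.
os.environ["DEMO_SEARCH_KEY"] = "demo-value"
print(require_env("DEMO_SEARCH_KEY"))  # demo-value

try:
    require_env("DEMO_MISSING_KEY")
except RuntimeError as e:
    print("Caught:", e)
```

Failing at startup with a named variable beats a cryptic authentication error deep inside an API call.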
Be mindful of the usage policies and rate limits of OpenAI and web search APIs like Serper or Tavily. Check their documentation regularly for updates to ensure compliance.
Learn how to implement error handling in your code to gracefully manage exceptions and provide a smooth user experience.
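One common building block is a retry wrapper with exponential backoff around any network-bound call (an OpenAI request, a Serper search, and so on). The sketch below is a generic illustration rather than a LangChain API; `flaky_search` is a stand-in that simulates transient failures:

```python
import time

def with_retries(func, attempts: int = 3, base_delay: float = 0.01):
    """Call func, retrying with exponential backoff; re-raise on final failure."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except Exception:
            if attempt == attempts:
                raise
            time.sleep(base_delay * (2 ** (attempt - 1)))  # back off: 1x, 2x, 4x...

# A flaky stand-in for an API call that fails twice, then succeeds.
calls = {"n": 0}
def flaky_search():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient network error")
    return "search results"

print(with_retries(flaky_search))  # search results
```

In production you would typically catch only transient error types (timeouts, rate limits) and let permanent failures, such as an invalid API key, surface immediately.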
By following these tips, you’ll build robust applications that deliver accurate, up-to-date information while adhering to best practices. We offer a dedicated LangChain course that you can check out to further elevate your skills with the library. Also, if you’re eager to dive deeper into the topic, we offer several Generative AI courses that can help you expand your knowledge and skills.