Getting Started with AI Agents

AI agents are expected to play a pivotal role in the coming decade, transforming how we interact with technology. To understand AI agents, it's essential to differentiate them from traditional AI models and compound systems.

What are traditional models?

Traditional AI models are systems trained on specific datasets to perform a particular task or set of tasks. These models are like highly specialized tools—excellent at what they’ve been trained to do but limited in adapting to new or unforeseen scenarios without significant retraining. They generate responses based on patterns learned from their training data but don’t have the flexibility to incorporate new information or adjust their behavior dynamically without extensive reprogramming. We can think of them as skilled craftsmen who excel in their trade using familiar tools and materials but struggle to adapt quickly when faced with new challenges or unfamiliar resources.

For example, imagine you’re organizing a large conference and need to find the best venue. If you asked a traditional AI model built for event planning a few years ago, it might provide general advice based on past data. However, it wouldn’t be able to consider your specific requirements, access real-time information about venue availability, or account for the latest trends in event hosting without being extensively updated.

Figure: A general response provided by the traditional model

Nevertheless, these models are quite good at various tasks, such as summarizing documents, drafting emails, and generating reports. But here’s where things get interesting: The magic happens when we integrate these models into larger, more complex systems, such as customer service platforms, supply chain management tools, or personalized recommendation engines. In these setups, individual models work together, complementing each other’s strengths to tackle more sophisticated challenges and adapt to a wider range of scenarios. These are what we call compound AI systems.

What are compound systems?

As the name suggests, a compound system consists of multiple interconnected components, each specialized for a particular function. Let’s go back to our conference example. To get a precise answer about the best venue, we’d need to connect our AI model to various data sources: databases of available venues, pricing information, user reviews, and even local traffic conditions. The AI model would generate a query to fetch this information, combine it with its own processing, and then present a coherent response. This system is modular, meaning we can mix and match various components: large language models, image generation models, database search tools, and more. Each component does its part to solve a bigger problem more effectively than a single traditional model could.
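The fixed, modular pipeline described above can be sketched roughly as follows. This is a minimal illustration, not a production design: the venue list, the `Venue` fields, and every function name are hypothetical, and `generate_answer` is a plain string template standing in for a language model.

```python
from dataclasses import dataclass


@dataclass
class Venue:
    name: str
    capacity: int
    price_per_day: int
    rating: float


# Hypothetical in-memory stand-in for a venue database.
VENUES = [
    Venue("Harbor Hall", 300, 4000, 4.6),
    Venue("Garden Loft", 120, 1500, 4.8),
    Venue("Expo Center", 800, 9000, 4.1),
]


def retrieve_venues(min_capacity, max_price):
    """Component 1: query the data source for venues that meet the constraints."""
    return [
        v for v in VENUES
        if v.capacity >= min_capacity and v.price_per_day <= max_price
    ]


def rank_venues(venues):
    """Component 2: order candidates by user rating, best first."""
    return sorted(venues, key=lambda v: v.rating, reverse=True)


def generate_answer(venues):
    """Component 3: stand-in for the language model that phrases the answer."""
    if not venues:
        return "No venue matches the requirements."
    best = venues[0]
    return (
        f"Recommended venue: {best.name} (capacity {best.capacity}, "
        f"${best.price_per_day}/day, rating {best.rating})."
    )


def compound_system(min_capacity, max_price):
    """Run the components in a fixed order: retrieve, rank, generate."""
    return generate_answer(rank_venues(retrieve_venues(min_capacity, max_price)))


print(compound_system(min_capacity=200, max_price=5000))
```

Because each component sits behind its own function, any one of them could be swapped out (for example, a different ranking rule) without touching the others. The order of the steps, however, never changes.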

Figure: A more precise response yet not completely reliable

Educative Byte: Why not just tweak a single model to do all this? It’s much simpler and quicker to design a system with different parts that can be swapped out or updated as needed rather than retraining a whole model from scratch every time we need a change.

This brings us to one of the most widely used compound systems: retrieval-augmented generation (RAG). RAG exemplifies a compound AI system because it leverages the strengths of both retrieval and generation components to provide more accurate and contextually relevant responses. RAG systems combine the power of language models with precise information retrieval, and they can be configured to search various types of databases, including real-time data sources, depending on how they are set up. However, even with this flexibility, RAG systems typically follow predefined retrieval and generation patterns. This is where AI agents come into play, offering a more dynamic and adaptive approach to complex problem-solving.
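To make the retrieve-then-generate pattern concrete, here is a toy sketch of the retrieval and prompt-building steps. It assumes a tiny hard-coded document list and simple word-overlap scoring, where real RAG systems usually use embeddings and a vector store. The generation step, in which the prompt is sent to a language model, is left out. All names are hypothetical.

```python
import string

# Hypothetical knowledge base; a real system would query a document store.
DOCUMENTS = [
    "Harbor Hall seats 300 people and offers on-site catering.",
    "Garden Loft seats 120 people and has a rooftop terrace.",
    "Expo Center seats 800 people and is near the airport.",
]


def tokenize(text):
    """Lowercase the text and return its words with edge punctuation removed."""
    return {word.strip(string.punctuation) for word in text.lower().split()}


def retrieve(query, documents, k=2):
    """Retrieval step: rank documents by how many words they share with the query."""
    query_words = tokenize(query)
    ranked = sorted(
        documents,
        key=lambda doc: len(query_words & tokenize(doc)),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, documents, k=2):
    """Augmentation step: place the retrieved text ahead of the question."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query, documents, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


print(build_prompt("Which venue has on-site catering?", DOCUMENTS, k=1))
```

Note that the same two steps run in the same order for every query, which is the predefined pattern the paragraph above refers to.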

Note: If you're interested in diving deeper into RAG and exploring how it can be configured to access various data sources, including real-time information, we recommend checking out some of the specialized courses we offer on this topic. These courses provide a more comprehensive understanding of RAG systems and their applications:

Fundamentals of Retrieval-Augmented Generation with LangChain
Advanced RAG Techniques: Choosing the Right Approach
Building Multimodal RAG Applications with Google Gemini

What are AI agents?

Now, let’s talk about AI agents. In many ways, AI agents can be seen as an advanced form of compound systems. They still rely on the integration of multiple specialized components, but what sets them apart is their enhanced level of autonomy and adaptability. Instead of just following predefined rules or relying on static models, AI agents leverage the reasoning capabilities of foundation models to make decisions dynamically.

The key difference is not in the type of system, but in how it operates. While compound systems typically follow a set sequence of operations, AI agents can plan, adapt, and adjust their approach based on the context and the specific needs of the task. This makes them more flexible and capable of handling complex, evolving problems. Think of it this way: On one end of the spectrum, we have systems that think fast and follow strict instructions without deviation. On the other end, we have agentic systems that think more slowly, carefully planning each step and adjusting their approach based on new information.

For example, if you're looking for the best venue for a conference, a compound system might gather information from various sources to provide a list of options. An AI agent, however, could go further by factoring in recent user preferences, changes in venue availability, or even local events during your conference dates. It can adapt on the fly, making it more effective in dynamic environments. While both AI agents and compound systems are powerful, AI agents represent the next step—they are compound systems with a "brain" that can reason, learn, and adapt, managing tasks with greater autonomy and intelligence.


What are the key capabilities of AI agents?

AI agents have three key capabilities: reasoning, acting, and memory. These capabilities enable AI agents to handle complex tasks by breaking them down into manageable steps, utilizing external tools, and leveraging stored information for personalized interactions.

  • Reasoning: Reasoning involves the AI agent creating a plan and logically thinking through each step needed to solve a problem. For example, when tasked with finding a suitable venue for a conference, the agent would start by identifying the key requirements, such as the date, number of attendees, and location preferences.

  • Acting: Acting refers to the AI agent's ability to use external tools to gather information and execute tasks. In the conference venue example, the agent might use search engines to look up potential venues, calculators to compare costs, and databases to check availability. These tools are defined for the agent in advance, and the model calls upon them as needed to perform specific actions.

  • Memory: Memory allows the AI agent to store and retrieve information from previous interactions or internal logs. This can include remembering user preferences or specific requirements mentioned earlier in the conversation. For instance, if the user previously indicated a preference for venues with on-site catering, the agent can recall this detail and prioritize such venues in its search.
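The three capabilities above can be illustrated with a small sketch. It assumes a hard-coded venue catalog, a dictionary as the tool registry, and a fixed plan in place of model-driven reasoning. `search_venues`, `TOOLS`, `Memory`, and `recommend` are hypothetical names chosen for this example, not part of any particular framework.

```python
from typing import Callable, Dict, List


def search_venues(city: str) -> List[dict]:
    """Hypothetical search tool backed by a hard-coded catalog."""
    catalog = [
        {"name": "Harbor Hall", "city": "Lisbon", "catering": True},
        {"name": "Garden Loft", "city": "Lisbon", "catering": False},
        {"name": "Expo Center", "city": "Porto", "catering": True},
    ]
    return [venue for venue in catalog if venue["city"] == city]


# Acting: tools are registered by name so the agent can call them on demand.
TOOLS: Dict[str, Callable] = {"search_venues": search_venues}


class Memory:
    """Memory: a key-value store for facts learned in earlier turns."""

    def __init__(self):
        self._facts = {}

    def remember(self, key, value):
        self._facts[key] = value

    def recall(self, key, default=None):
        return self._facts.get(key, default)


def recommend(city: str, memory: Memory) -> List[str]:
    """Reasoning: a fixed two-step plan; a real agent would ask a model to plan."""
    venues = TOOLS["search_venues"](city)      # step 1: gather candidates
    if memory.recall("wants_catering", False):  # step 2: apply remembered preference
        preferred = [venue for venue in venues if venue["catering"]]
        venues = preferred or venues            # fall back if nothing matches
    return [venue["name"] for venue in venues]


memory = Memory()
print(recommend("Lisbon", memory))
memory.remember("wants_catering", True)
print(recommend("Lisbon", memory))
```

The second call returns a narrower list because the stored catering preference is recalled and applied, mirroring the on-site catering example in the Memory bullet.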

The agentic workflow can be visualized in the following diagram:

Figure: Agentic workflow

In the above diagram, the process starts with a User Query, which is the initial question or problem posed by the user. The AI agent then proceeds to Plan (Reason), formulating a step-by-step strategy to address the query. The agent then moves to Action (Use Tools), utilizing various external tools such as search engines or calculators to gather necessary information and perform tasks. The Observe step involves the agent reviewing the results of its actions, and if needed, it may adjust its plan accordingly, looping back to the reasoning stage. Finally, the process concludes with the Final Output, where the agent provides a comprehensive and tailored response to the user.
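The plan, act, observe loop described above might look like the following sketch. Here the `plan` function is a hand-written stand-in for the model's reasoning, and both tools return canned data. These are assumptions made for illustration only. The step limit is also an added safeguard so the loop always terminates.

```python
def plan(goal, observations):
    """Plan (Reason): choose the next action from what has been observed so far."""
    if "venues" not in observations:
        return ("search", None)
    if "available" not in observations:
        return ("check_availability", observations["venues"])
    return ("finish", None)


def search(_):
    """Hypothetical search tool returning canned results."""
    return ["Harbor Hall", "Garden Loft"]


def check_availability(venues):
    """Hypothetical availability tool: Harbor Hall is already booked."""
    booked = {"Harbor Hall"}
    return [venue for venue in venues if venue not in booked]


# Each action maps to (key under which the result is stored, tool to call).
TOOLS = {
    "search": ("venues", search),
    "check_availability": ("available", check_availability),
}


def run_agent(goal, max_steps=5):
    observations = {}  # Observe: results of earlier actions
    trace = []         # record of the actions taken
    for _ in range(max_steps):
        action, argument = plan(goal, observations)
        trace.append(action)
        if action == "finish":
            answer = "Available venues: " + ", ".join(observations["available"])
            return answer, trace  # Final Output
        key, tool = TOOLS[action]
        observations[key] = tool(argument)  # Action (Use Tools), then observe
    return "Stopped: step limit reached.", trace


answer, trace = run_agent("Find a venue for the conference")
print(trace)
print(answer)
```

Unlike a fixed pipeline, the next step is chosen on each pass through the loop based on the observations gathered so far, which is what allows an agent to revise its plan.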

A popular approach to configuring these agents is called Reasoning and Acting (ReAct), which we will look at in the coming lessons. For example, if you want to find out how many seats you’ll need for each conference session, the agent would consider the number of attendees, room sizes, and seating arrangements and then perform the necessary calculations. Breaking the problem into reasoning and acting steps in this way lets the agent solve complex problems efficiently.
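As a rough illustration of the calculation step in the seating example, the sketch below shows the kind of calculator tool an agent might call. It assumes attendees are split evenly across sessions and adds a 10 percent safety buffer. Both assumptions, along with the function names and room capacities, are hypothetical.

```python
def seats_per_session(attendees, sessions, buffer_percent=10):
    """Seats needed per session, assuming an even split plus a safety buffer."""
    # Ceiling division with integers avoids floating-point rounding surprises.
    per_session = (attendees + sessions - 1) // sessions
    return (per_session * (100 + buffer_percent) + 99) // 100


def rooms_that_fit(seats_needed, room_capacities):
    """Return the names of rooms whose capacity covers the required seats."""
    return [
        name for name, capacity in room_capacities.items()
        if capacity >= seats_needed
    ]


needed = seats_per_session(attendees=240, sessions=4)
print(needed)  # 240 / 4 = 60 per session, plus 10 percent = 66
print(rooms_that_fit(needed, {"Room A": 50, "Room B": 80, "Room C": 66}))
```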

Figure: An agentic system provides us with a precise response

What are the main types of AI agents?

AI agents come in various forms, each designed to operate in specific environments and tackle unique problems. These agents vary in complexity, capabilities, and applications, ranging from simple rule-based systems to advanced learning agents that adapt over time. Understanding the different types of AI agents can help you choose the right one for your needs and leverage their strengths effectively. Below are some of the primary types of AI agents:

| Type of Agent | Description | Example |
|---|---|---|
| Simple Reflex Agents | Operate based on predefined rules and immediate data without memory or learning capabilities. Suitable for straightforward tasks with well-defined conditions. | AI vacuum cleaner (e.g., Roomba) |
| Model-Based Reflex Agents | Maintain an internal model of the world, allowing them to evaluate the effects of their actions before deciding. | Tesla's obstacle avoidance system |
| Goal-Based Agents | Use reasoning and planning to achieve specific goals, focusing on outcomes rather than optimal paths. | GPS route planning (e.g., Google Maps) |
| Utility-Based Agents | Maximize desired outcomes by comparing scenarios and selecting the option with the highest rewards, considering both goal achievement and desirability. | Netflix recommendation system |
| Learning Agents | Continuously improve performance by learning from experiences and adapting behavior over time. | Gmail spam filter |
| Hierarchical Agents | Organized in tiers, with higher-level agents delegating tasks to lower-level agents, ensuring efficient coordination and goal achievement. | Factory automation system |
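To show how two of these agent types differ in code, here is a small sketch contrasting a simple reflex agent with a utility-based one. The two-square vacuum world is a common textbook setup, not a description of how any real product works. The utility formula and venue data are invented for illustration.

```python
def reflex_vacuum(percept):
    """Simple reflex agent: a condition-action rule on the current percept only."""
    location, is_dirty = percept
    if is_dirty:
        return "suck"
    return "move_right" if location == "A" else "move_left"


def utility_based_choice(options, utility):
    """Utility-based agent: score every option and pick the highest-scoring one."""
    return max(options, key=utility)


# Hypothetical utility: reward high ratings, penalize high prices.
def venue_utility(venue):
    return venue["rating"] * 10 - venue["price"] / 1000


venues = [
    {"name": "Harbor Hall", "rating": 4.6, "price": 4000},
    {"name": "Garden Loft", "rating": 4.8, "price": 1500},
]

print(reflex_vacuum(("A", True)))
print(utility_based_choice(venues, venue_utility)["name"])
```

The reflex agent keeps no state and reacts only to what it senses right now, while the utility-based agent compares all candidates against an explicit measure of desirability before choosing.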

Any of the agents mentioned above can be categorized as collaborative if it is designed to work together with other agents or systems to achieve a shared goal. The key characteristic of collaborative agents is their ability to communicate, coordinate, and cooperate with one another. They might share information, divide tasks, or synchronize their actions to accomplish something that would be difficult or impossible for a single agent. Collaborative agents are not to be confused with multi-agent systems.

A multi-agent system is a broader concept of a system composed of multiple interacting agents. These agents can be collaborative, but they might also be competitive or operate independently. A multi-agent system is concerned with the overall structure, architecture, and dynamics of the system as a whole.
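A collaborative, hierarchical arrangement can be sketched as a coordinator that delegates parts of a request to specialized workers and merges their results. The classes and decision rules below are hypothetical and deliberately simplistic. In a real multi-agent system, each worker would typically wrap its own model and tools.

```python
class VenueAgent:
    """Worker agent: picks a venue based on attendee count."""

    def handle(self, request):
        venue = "Garden Loft" if request["attendees"] <= 120 else "Harbor Hall"
        return {"venue": venue}


class CateringAgent:
    """Worker agent: estimates the number of meals to order."""

    def handle(self, request):
        return {"meals": request["attendees"] * request["days"]}


class Coordinator:
    """Higher-level agent: delegates to each worker and merges the results."""

    def __init__(self, workers):
        self.workers = workers

    def handle(self, request):
        plan = {}
        for worker in self.workers:
            plan.update(worker.handle(request))
        return plan


coordinator = Coordinator([VenueAgent(), CateringAgent()])
print(coordinator.handle({"attendees": 100, "days": 2}))
```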

What are the limitations of using AI agents?

While AI agents show a lot of promise, they are still a relatively new type of system with several challenges, such as:

  • Data privacy concerns: AI agents often require access to vast amounts of data, including sensitive personal information. This raises concerns about data privacy and security.

  • Technical complexities: Developing and deploying AI agents requires advanced technical expertise, including knowledge of AI, machine learning, and software engineering.

  • Adaptability in unforeseen circumstances: AI agents are often designed for specific tasks or environments. They may struggle to generalize or adapt to new, unforeseen scenarios without significant retraining or reconfiguration.

  • Limited compute resources: AI agents, particularly those that rely on complex models like deep learning, require substantial computational resources, including processing power, memory, and storage.

  • Ethical concerns: AI agents can inadvertently perpetuate or even amplify biases present in the data they are trained on. Additionally, their decision-making processes may lack transparency, leading to ethical concerns.

What's next?

As we move forward, we’re seeing more and more compound AI systems, especially agentic ones. For simpler, well-defined problems, a programmatic approach might still suffice. However, agentic systems offer better adaptability and efficiency for more complex tasks. We’re still in the early days of AI agents, but the progress is rapid. Combining system design with agentic behavior allows us to create powerful tools that can handle various tasks. And while human oversight is still important for now, the accuracy and capabilities of these agents are continually improving.