
Enabling complex generative AI applications with Amazon Bedrock Agents

In June, I started a series of posts that highlight the key factors driving customers to choose Amazon Bedrock. The first covered building generative AI apps securely with Amazon Bedrock, while the second explored building custom generative AI applications with Amazon Bedrock. Now I'd like to take a closer look at Amazon Bedrock Agents, which empowers our customers to build intelligent, context-aware generative AI applications, streamlining complex workflows and delivering natural, conversational user experiences.

The advent of large language models (LLMs) has enabled humans to interact with computers using natural language. However, many real-world scenarios demand more than language comprehension: they involve executing complex multi-step workflows, integrating external data sources, or seamlessly orchestrating diverse AI capabilities and data workflows. In these scenarios, agents can be a game changer, delivering more customized generative AI applications and transforming the way we interact with and use LLMs.

Answering more complex queries

Amazon Bedrock Agents enables a developer to take a holistic approach to improving scalability, latency, and performance when building generative AI applications. Generative AI solutions that use Amazon Bedrock Agents can handle complex tasks by combining an LLM with other tools. For example, imagine that you are trying to create a generative AI-enabled assistant that helps people plan their vacations. You want it to be able to handle simple questions like “What’s the weather like in Paris next week?” or “How much does it cost to fly to Tokyo in July?” A basic virtual assistant might be able to answer those questions by drawing from preprogrammed responses or by searching the internet. But what if someone asks a more complicated question, like “I want to plan a trip to three countries next summer. Can you suggest a travel itinerary that includes visiting historic landmarks, trying local cuisine, and staying within a budget of $3,000?” That is a harder question because it involves planning, budgeting, and finding information about different destinations.

Using Amazon Bedrock Agents, a developer can quickly build a generative AI assistant to help answer this more complicated question by combining the LLM’s reasoning with additional tools and resources, such as natively integrated knowledge bases, to propose personalized itineraries. The agent could search for flights, hotels, and tourist attractions by querying travel APIs, and draw on private data, public information about destinations, and weather forecasts, all while keeping track of the budget and the traveler’s preferences. To build this agent, you would need an LLM to understand and respond to questions, but you would also need other modules for planning, budgeting, and accessing travel information.
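To make the scenario concrete, the following sketch shows how an application might send that harder question to an already-created agent using the boto3 bedrock-agent-runtime API. The agent and alias IDs are placeholders you would replace with values from your own account:

```python
import boto3

runtime = boto3.client("bedrock-agent-runtime")

# Placeholder IDs: substitute the agent and alias you created.
response = runtime.invoke_agent(
    agentId="YOUR_AGENT_ID",
    agentAliasId="YOUR_AGENT_ALIAS_ID",
    sessionId="travel-session-001",
    inputText=(
        "I want to plan a trip to three countries next summer. Can you suggest "
        "a travel itinerary that includes historic landmarks, local cuisine, "
        "and stays within a budget of $3,000?"
    ),
)

# The agent streams its answer back as chunks of bytes.
answer = ""
for event in response["completion"]:
    chunk = event.get("chunk")
    if chunk:
        answer += chunk["bytes"].decode("utf-8")

print(answer)
```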

Agents in action

Our customers are using Amazon Bedrock Agents to build agents—and agent-driven applications—quickly and effectively. Consider Rocket, the fintech company that helps people achieve home ownership and financial freedom:

“Rocket is poised to revolutionize the homeownership journey with AI technology, and agentic AI frameworks are key to our mission. By collaborating with AWS and leveraging Amazon Bedrock Agents, we are enhancing the speed, accuracy, and personalization of our technology-driven communication with clients. This integration, powered by Rocket’s 10 petabytes of data and industry expertise, ensures our clients can navigate complex financial moments with confidence.”

– Shawn Malhotra, CTO of Rocket Companies.

A closer look at how agents work

Unlike LLMs that provide simple lookup or content-generation capabilities, agents integrate various components with an LLM to create an intelligent orchestrator capable of handling sophisticated use cases with nuanced context and specific domain expertise. The following figure outlines the key components of Amazon Bedrock Agents:

The process starts with two parts: the LLM and the orchestration prompt. The LLM, often one of the Anthropic Claude family or Meta Llama models, provides the basic reasoning capabilities. The orchestration prompt is a set of prompts or instructions that guide the LLM as it drives the decision-making process.
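As a minimal sketch of wiring these two parts together, the boto3 bedrock-agent API lets you create an agent by pairing a foundation model with high-level instructions that Bedrock folds into its orchestration prompt. The agent name, model ID, role ARN, and instruction text below are illustrative placeholders:

```python
import boto3

agent_client = boto3.client("bedrock-agent")

# Placeholder role ARN; the role must allow Bedrock to invoke the chosen model.
response = agent_client.create_agent(
    agentName="travel-planner",
    foundationModel="anthropic.claude-3-5-sonnet-20240620-v1:0",
    agentResourceRoleArn="arn:aws:iam::123456789012:role/BedrockAgentRole",
    # High-level guidance that Bedrock combines with its orchestration prompt.
    instruction=(
        "You are a travel assistant. Break trip requests into flights, "
        "lodging, and activities, and keep the total within the stated budget."
    ),
)
print(response["agent"]["agentId"])
```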

In the following sections, we discuss the key components of Amazon Bedrock Agents in depth.

Planning: A path from task to goals

The planning component for LLMs entails comprehending tasks and devising multi-step strategies to address a problem and fulfill the user’s need. In Amazon Bedrock Agents, we use chain-of-thought prompting in combination with ReAct in the orchestration prompt to improve an agent’s ability to solve multi-step tasks. In task decomposition, the agent must understand the intricacies of an abstract request. Continuing our travel scenario, if a user wants to book a trip, the agent must recognize that it encompasses transportation, accommodation, reservations for sightseeing attractions, and restaurants. This ability to split an abstract request, such as planning a trip, into detailed, executable actions is the essence of planning. However, planning extends beyond the initial formulation of a plan, because the plan may be updated dynamically during execution. For example, when the agent has completed arranging transportation and progresses to booking accommodation, it may find that no suitable lodging options align with the original arrival date. In such scenarios, the agent must determine whether to broaden the hotel search or revisit alternative booking dates, adapting the plan as circumstances evolve.
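For intuition, here is an illustrative ReAct-style loop in Python. It is not Bedrock’s internal orchestration logic; `call_llm` and the tool stubs are hypothetical stand-ins meant only to show how thought, action, and observation steps interleave while a plan unfolds:

```python
# Illustrative ReAct-style loop; NOT Bedrock's internal implementation.
# call_llm and the tool stubs below are hypothetical stand-ins.

def search_flights(destination: str) -> str:
    return f"Cheapest round trip to {destination}: $650"    # stubbed result

def search_hotels(destination: str) -> str:
    return f"Mid-range hotel in {destination}: $120/night"  # stubbed result

def call_llm(task: str, history: list) -> dict:
    # A real implementation would call a Bedrock model with a ReAct prompt.
    # This stub returns a final answer immediately so the sketch is runnable.
    return {"type": "final_answer", "content": f"Draft plan for: {task}"}

TOOLS = {"search_flights": search_flights, "search_hotels": search_hotels}

def run_agent(task: str, max_steps: int = 8) -> str:
    scratchpad = []  # accumulates thought / action / observation triples
    for _ in range(max_steps):
        step = call_llm(task=task, history=scratchpad)  # reason about next step
        if step["type"] == "final_answer":
            return step["content"]
        # The model proposed a tool call: execute it and record the observation.
        observation = TOOLS[step["tool"]](**step["arguments"])
        scratchpad.append({"thought": step.get("thought", ""),
                           "action": step["tool"],
                           "observation": observation})
    return "Could not complete the plan within the step budget."

print(run_agent("Plan a three-country trip under $3,000"))
```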

Memory: Home for critical information

Agents have both long-term and short-term memory. Short-term memory is detailed and exact; it is relevant to the current conversation and resets when the conversation ends. Long-term memory is episodic and retains important facts and details in the form of saved summaries, which serve as synopses of previous dialogues. The agent uses this information from the memory store to better solve the current task. The memory store is separate from the LLM, with dedicated storage and retrieval components. Developers can customize and control which information is stored in (or excluded from) memory. An identity management feature, which associates memory with specific end users, gives developers the freedom to identify and manage end users and to build further on top of Amazon Bedrock Agents’ memory capabilities. The industry-leading memory retention functionality of Amazon Bedrock, launched at the recent AWS New York Summit, allows agents to learn and adapt to each user’s preferences over time, enabling more personalized and efficient experiences across multiple sessions for the same user. It is straightforward to use, allowing developers to get started in a single click.
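As a rough sketch, long-term memory can be switched on through the agent’s memory configuration and tied to an end user at invocation time. The IDs, role ARN, and retention period below are placeholders, and field names may vary across SDK versions:

```python
import boto3

agent_client = boto3.client("bedrock-agent")

# Enable long-term memory (saved session summaries) on an existing agent.
agent_client.update_agent(
    agentId="YOUR_AGENT_ID",
    agentName="travel-planner",
    foundationModel="anthropic.claude-3-5-sonnet-20240620-v1:0",
    agentResourceRoleArn="arn:aws:iam::123456789012:role/BedrockAgentRole",
    memoryConfiguration={
        "enabledMemoryTypes": ["SESSION_SUMMARY"],  # saved dialogue summaries
        "storageDays": 30,                          # how long summaries are kept
    },
)

# At invocation time, a memoryId associates stored summaries with a specific
# end user, so that user gets continuity across separate sessions.
runtime = boto3.client("bedrock-agent-runtime")
runtime.invoke_agent(
    agentId="YOUR_AGENT_ID",
    agentAliasId="YOUR_AGENT_ALIAS_ID",
    sessionId="session-002",
    memoryId="user-42",
    inputText="Book something similar to my last trip, but in the fall.",
)
```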

Communication: Using multiple agents for greater efficiency and effectiveness

Drawing from the powerful combination of the capabilities we’ve described, Amazon Bedrock Agents makes it effortless to build agents that transform one-shot query responders into sophisticated orchestrators capable of tackling complex, multifaceted use cases with remarkable efficiency and adaptability. But what about using multiple agents? LLM-based AI agents can collaborate with one another to improve efficiency in solving complex questions. Today, Amazon Bedrock makes it straightforward for developers to connect multiple agents through LangGraph, part of LangChain, the popular open source tool set. The integration of LangGraph with Amazon Bedrock empowers users to take advantage of the strengths of multiple agents seamlessly, fostering a collaborative environment that enhances the overall efficiency and effectiveness of LLM-based systems.
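Below is a minimal, hypothetical LangGraph sketch in which two Bedrock-backed workers cooperate: one drafts an itinerary and another reviews it against the budget. The node names, prompts, and state fields are illustrative, not part of any Bedrock or LangChain release:

```python
from typing import TypedDict
from langchain_aws import ChatBedrock
from langgraph.graph import StateGraph, END

llm = ChatBedrock(model_id="anthropic.claude-3-5-sonnet-20240620-v1:0")

class TripState(TypedDict):
    request: str
    itinerary: str
    budget_review: str

# Two cooperating "agents": one drafts an itinerary, one checks the budget.
def planner(state: TripState) -> dict:
    draft = llm.invoke(f"Draft a day-by-day itinerary for: {state['request']}")
    return {"itinerary": draft.content}

def budget_checker(state: TripState) -> dict:
    review = llm.invoke(
        "Check this itinerary against a $3,000 budget and flag overruns:\n"
        + state["itinerary"]
    )
    return {"budget_review": review.content}

graph = StateGraph(TripState)
graph.add_node("planner", planner)
graph.add_node("budget_checker", budget_checker)
graph.set_entry_point("planner")
graph.add_edge("planner", "budget_checker")
graph.add_edge("budget_checker", END)

app = graph.compile()
result = app.invoke({"request": "Three countries in Europe next summer"})
print(result["budget_review"])
```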

Tool Integration: New tools mean new capabilities

New generations of models, such as Anthropic Claude 3.5 Sonnet, Meta Llama 3.1, or Amazon Titan Text Premier, are better equipped to use resources. Using these resources, however, means developers must keep up with ongoing updates and changes, writing new prompts every time. To reduce this burden, Amazon Bedrock simplifies interfacing with different models, making it effortless to take advantage of all the features a model has to offer. For example, the new code interpretation capability announced at the recent AWS New York Summit allows Amazon Bedrock agents to dynamically generate and run code snippets within a secure, sandboxed environment to address complex tasks like data analysis, visualization, text processing, and equation solving. It also enables agents to process input files in various formats, including CSV, Excel, and JSON, and to generate charts from data.
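As one example of how a built-in tool is attached, the sketch below enables the code interpreter on the draft version of an agent through an action group signature. The agent ID is a placeholder, and exact field names may differ across SDK versions:

```python
import boto3

agent_client = boto3.client("bedrock-agent")

# Attach the built-in code interpreter to the draft version of an agent.
agent_client.create_agent_action_group(
    agentId="YOUR_AGENT_ID",
    agentVersion="DRAFT",
    actionGroupName="code-interpreter",
    parentActionGroupSignature="AMAZON.CodeInterpreter",
    actionGroupState="ENABLED",
)
```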

Guardrails: Building securely

Accuracy is critical when dealing with complex queries. Developers can enable Amazon Bedrock Guardrails to help reduce inaccuracies. Guardrails improve the behavior of the applications you’re building, increase accuracy, and help you build responsibly. They can help block malicious user inputs as well as potentially toxic content generated by AI, providing a higher level of safety and privacy protection.
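A minimal sketch of this flow, assuming the boto3 bedrock and bedrock-agent clients: create a guardrail with a couple of content filters, then attach it to the agent. The names, IDs, messages, and role ARN are placeholders:

```python
import boto3

bedrock = boto3.client("bedrock")
agent_client = boto3.client("bedrock-agent")

# 1) Create a guardrail that filters harmful content in both directions.
guardrail = bedrock.create_guardrail(
    name="travel-assistant-guardrail",
    contentPolicyConfig={
        "filtersConfig": [
            {"type": "HATE", "inputStrength": "HIGH", "outputStrength": "HIGH"},
            {"type": "VIOLENCE", "inputStrength": "HIGH", "outputStrength": "HIGH"},
        ]
    },
    blockedInputMessaging="Sorry, I can't help with that request.",
    blockedOutputsMessaging="Sorry, I can't provide that response.",
)

# 2) Associate the guardrail with the agent (IDs and role ARN are placeholders).
agent_client.update_agent(
    agentId="YOUR_AGENT_ID",
    agentName="travel-planner",
    foundationModel="anthropic.claude-3-5-sonnet-20240620-v1:0",
    agentResourceRoleArn="arn:aws:iam::123456789012:role/BedrockAgentRole",
    guardrailConfiguration={
        "guardrailIdentifier": guardrail["guardrailId"],
        "guardrailVersion": guardrail["version"],
    },
)
```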

Amplifying and extending the capabilities of generative AI with Amazon Bedrock Agents

Enterprises, startups, ISVs, and systems integrators can take advantage of Amazon Bedrock Agents today because it provides development teams with a comprehensive solution for building and deploying AI applications that can handle complex queries, use private data sources, and adhere to responsible AI practices. Developers can start with tested examples, so-called golden utterances (input prompts) and golden responses (expected outputs), and continuously evolve agents to fit their top use cases and kickstart generative AI application development. Agents unlock significant new opportunities to build generative AI applications that truly transform your business. It will be fascinating to see the solutions, and results, that Amazon Bedrock Agents inspires.

Resources

For more information on customization with Amazon Bedrock, see the following resources:

Learn about Amazon Bedrock
Learn more about Amazon Bedrock Agents
Learn how to associate a guardrail with your agent
Read the announcement post on code interpretation and memory retention in Amazon Bedrock Agents
Get hands-on insights on using Amazon Bedrock Agents

About the author

Vasi Philomin is VP of Generative AI at AWS. He leads generative AI efforts, including Amazon Bedrock and Amazon Titan.
