
Agents layer

The Agents layer serves as the central coordination hub for interactions between users, foundation models, tools, and knowledge sources. This layer contains the agent runtime environments, orchestration mechanisms, and supporting infrastructure that enable AI agents to function. Agents use LLMs for reasoning and planning, call tools to perform operations, retrieve information from knowledge bases, and maintain memory of past interactions.

Agent execution

Agents have specific requirements that set them apart from traditional applications and microservices. They can run autonomously for hours, and they need to retain context while remaining fully isolated from other agent instances. Agents need to securely discover, select, and access tools and other agents through well-known protocols such as Model Context Protocol (MCP) and Agent2Agent (A2A). They also require services that persist conversations across executions, and secure ways to execute code and interact with browsers and web-based systems.

Organizations can implement these capabilities using:

Amazon Bedrock AgentCore Runtime – a foundational service for running agents securely at scale without infrastructure management. The runtime provides enterprise-grade security, dynamic scaling, session persistence and isolation, support for large payloads, multi-protocol support, and bidirectional streaming.
Amazon Bedrock AgentCore Memory – provides persistence for both short-term conversation history and long-term extracted insights.
Amazon Bedrock AgentCore Identity – integrates authentication with IAM and OAuth providers for both inbound and outbound requests.
AWS Lambda – provides serverless compute for custom agent logic, lightweight agent operations, and tool invocations.
Amazon ECS or AWS Fargate – provides container-based agent deployments for complex requirements, stateful operations, or resource-intensive agent workloads requiring dedicated compute environments.

Agent registry and catalog

A centralized agent registry maintains an inventory of deployed agents, capturing:

Agent capabilities, permissions, and purpose
Metadata on ownership, version history, and dependencies
Performance metrics and usage statistics
Approval status and governance classifications

The registry supports discovery, reuse, and governance while preventing unnecessary duplication of capabilities across the organization. Registries should support self-service access requests and authorization delegation to the tool or agent owners. Registries should integrate with MCP gateway access controls to enforce the required access policies. Organizations should maintain registries documenting approved agents with quality control and lifecycle management.
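
The registry behavior described above can be sketched as a small in-memory model. This is a minimal illustration rather than a reference implementation: the `AgentRecord` fields, the `ApprovalStatus` values, and the `AgentRegistry` methods are all hypothetical names chosen for this example, and a real registry would sit on a durable store and enforce access policies.

```python
from dataclasses import dataclass, field
from enum import Enum


class ApprovalStatus(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    RETIRED = "retired"


@dataclass
class AgentRecord:
    """One registry entry; field names are illustrative, not a standard schema."""
    name: str
    owner: str
    version: str
    purpose: str
    capabilities: set[str]
    permissions: set[str] = field(default_factory=set)
    dependencies: list[str] = field(default_factory=list)
    status: ApprovalStatus = ApprovalStatus.PENDING


class AgentRegistry:
    """In-memory stand-in for a registry backed by a durable store."""

    def __init__(self) -> None:
        self._records: dict[str, AgentRecord] = {}

    def register(self, record: AgentRecord) -> None:
        # Reject duplicate names so existing agents are reused, not re-created.
        if record.name in self._records:
            raise ValueError(f"agent {record.name!r} is already registered")
        self._records[record.name] = record

    def approve(self, name: str) -> None:
        self._records[name].status = ApprovalStatus.APPROVED

    def discover(self, capability: str) -> list[AgentRecord]:
        # Only approved agents are discoverable.
        return [
            r for r in self._records.values()
            if capability in r.capabilities and r.status is ApprovalStatus.APPROVED
        ]


registry = AgentRegistry()
registry.register(AgentRecord(
    name="invoice-agent", owner="finance-platform", version="1.2.0",
    purpose="Extract and reconcile invoice data",
    capabilities={"invoice-extraction"},
))
registry.approve("invoice-agent")
print([r.name for r in registry.discover("invoice-extraction")])
```

Filtering discovery by approval status is one way to tie governance to reuse: teams searching for a capability only see agents that have passed review.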

Multi-agent coordination

As multi-agent systems become prevalent, architectures must support agent-to-agent communication through standardized protocols such as MCP and A2A, message formatting and state sharing conventions, appropriate state isolation, authentication and authorization between collaborating agents, and permissions verification for delegated actions.

Supporting infrastructure includes:

Amazon Bedrock AgentCore memory – managed memory service for storing agent state, conversation history, and long-term extracted insights that can be shared across agent sessions with fine-grained access control and storage isolation
Amazon EventBridge – event-driven backbone for multi-agent messaging
AWS Step Functions – orchestration for complex multi-agent workflows with checkpoints and error recovery
Amazon DynamoDB – fast, scalable storage for agent state and shared memory
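
To make the message formatting and delegated-permission requirements above more concrete, the following sketch shows one possible message envelope and a deny-by-default check. The `AgentMessage` fields, the `REQUIRED_SCOPE` table, and the scope strings are assumptions made for illustration; they are not defined by MCP, A2A, or any AWS service.

```python
import json
import uuid
from dataclasses import asdict, dataclass, field


@dataclass(frozen=True)
class AgentMessage:
    """Hypothetical envelope for messages exchanged between agents."""
    sender: str
    recipient: str
    action: str
    payload: dict
    on_behalf_of: str                   # user whose request started the chain
    delegated_scopes: tuple[str, ...]   # permissions passed along with the request
    message_id: str = field(default_factory=lambda: str(uuid.uuid4()))

    def to_json(self) -> str:
        return json.dumps(asdict(self))


# Hypothetical mapping from an action to the scope it requires.
REQUIRED_SCOPE = {"refund.create": "payments:write", "order.lookup": "orders:read"}


def verify_delegation(message: AgentMessage) -> bool:
    """Return True only if the delegated scopes cover the requested action."""
    required = REQUIRED_SCOPE.get(message.action)
    # Unknown actions are denied by default.
    return required is not None and required in message.delegated_scopes


msg = AgentMessage(
    sender="support-agent", recipient="billing-agent",
    action="refund.create", payload={"order_id": "A-100"},
    on_behalf_of="user-42", delegated_scopes=("orders:read",),
)
print(verify_delegation(msg))  # False: "payments:write" was not delegated
```

Carrying the originating user and the delegated scopes in every message lets the receiving agent check permissions itself instead of trusting the sender.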

Agent quality and safety

As agents become more autonomous and handle critical business operations, architectures must support comprehensive quality assurance and safety mechanisms through evaluation frameworks for reliability and accuracy, safety testing for harmful outputs or behaviors, performance monitoring and regression detection, and feedback loops for continuous improvement.

Supporting infrastructure includes:

Amazon Bedrock Guardrails – provides content filtering, denied topics, word filters, and sensitive information redaction
Amazon Bedrock AgentCore Evaluations – built-in evaluation frameworks that provide automated assessment tools to measure how well agents or tools perform specific tasks, handle edge cases, and maintain consistency across different inputs and contexts
Amazon CloudWatch – monitors metrics for AWS resources and the applications running on AWS in real time
Amazon Bedrock Evaluations – evaluates LLMs and semantic retrieval systems using programmatic metrics, human evaluators, and LLM-as-judge
AWS Lambda – serverless execution of custom validation logic and safety checks
Amazon S3 – storage for evaluation datasets, test results, and safety audit logs
Open-source observability tools – platforms such as Langfuse for tracing, evaluation, prompt management, and metrics; deployable on AWS
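
Regression detection, mentioned above, can be as simple as comparing evaluation scores for a new agent version against a stored baseline. The sketch below assumes scores are already collected per metric as lists of floats; the function name, the metric names, and the 0.05 tolerance are illustrative choices, not part of any of the services listed.

```python
from statistics import mean


def detect_regressions(baseline: dict[str, list[float]],
                       candidate: dict[str, list[float]],
                       tolerance: float = 0.05) -> dict[str, float]:
    """Return metrics whose mean score dropped by more than `tolerance`."""
    regressions = {}
    for metric, base_scores in baseline.items():
        if metric not in candidate:
            continue  # metrics missing from the candidate run are skipped here
        drop = mean(base_scores) - mean(candidate[metric])
        if drop > tolerance:
            regressions[metric] = round(drop, 4)
    return regressions


baseline = {"task_success": [1.0, 1.0, 0.0, 1.0], "groundedness": [0.9, 0.8]}
candidate = {"task_success": [1.0, 0.0, 0.0, 1.0], "groundedness": [0.9, 0.8]}
print(detect_regressions(baseline, candidate))  # {'task_success': 0.25}
```

A check like this could run as a gate before a new agent version is promoted, with the evaluation datasets and results kept in durable storage.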

Access control and authentication

Effective agent operations require robust identity and permission management through identity propagation from users through agent chains, permission boundaries for agent actions and tool access, audit trails of agent decisions and actions, and circuit breakers for abnormal behavior patterns.

Supporting infrastructure includes:

Amazon Bedrock AgentCore Identity – integrated identity management and authentication context propagation across agent interactions
Amazon Bedrock AgentCore Gateway Interceptors – Lambda-based request and response interceptors that evaluate, filter, modify, and block MCP tool calls and responses
Amazon Bedrock AgentCore Policy – Cedar-based policy engine for context-aware, fine-grained authorization on AgentCore data plane operations
AWS Identity and Access Management and AWS IAM Identity Center – enterprise authentication, authorization, and single sign-on integration with identity providers
AWS Secrets Manager – secure credential storage and automatic rotation for service accounts
AWS CloudTrail – comprehensive API activity logging and audit trail generation
Amazon EventBridge – real-time alerting and remediation
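
The circuit breaker idea mentioned in this section can be illustrated with a sliding-window counter that blocks further agent actions after too many anomalies. This is a minimal sketch under stated assumptions: the class name, the thresholds, and the definition of an "anomaly" (for example, a denied tool call) are hypothetical, and a production version would also emit an alert and require a controlled reset.

```python
import time
from collections import deque


class AgentCircuitBreaker:
    """Blocks agent actions after too many anomalies within a time window."""

    def __init__(self, max_anomalies: int = 3, window_seconds: float = 60.0,
                 clock=time.monotonic) -> None:
        self.max_anomalies = max_anomalies
        self.window_seconds = window_seconds
        self._clock = clock  # injectable so the breaker can be tested without sleeping
        self._anomalies: deque[float] = deque()

    def record_anomaly(self) -> None:
        self._anomalies.append(self._clock())

    def allow(self) -> bool:
        # Drop anomalies that fell outside the sliding window.
        cutoff = self._clock() - self.window_seconds
        while self._anomalies and self._anomalies[0] < cutoff:
            self._anomalies.popleft()
        return len(self._anomalies) < self.max_anomalies


breaker = AgentCircuitBreaker(max_anomalies=2, window_seconds=30.0)
breaker.record_anomaly()
breaker.record_anomaly()
if not breaker.allow():
    print("circuit open: agent actions are paused")
```

The agent runtime would call `allow()` before each tool invocation and `record_anomaly()` whenever a policy check or safety filter rejects an action.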
