Building serverless architectures for agentic AI on AWS

Aaron Sempf, Amazon Web Services

July 2025 (document history)

The convergence of AI and serverless computing is reshaping modern enterprise architecture. Organizations striving to deliver intelligent capabilities at scale face increasing pressure to reduce operational overhead, accelerate innovation, and deploy applications that adapt in real time to user behavior and system events.

Serverless AI on AWS represents a fundamental shift toward intelligent, adaptive, cloud-native systems. With the right strategy and tooling, organizations can unlock faster innovation cycles, lower costs, and greater scalability. This approach positions them at the forefront of the next generation of enterprise computing. AWS is enabling this shift through a combination of fully managed AI services and event-driven, serverless infrastructure.

This guide outlines the strategic and technical foundations for building AI-native, serverless architectures on AWS. These architectures are scalable, cost-effective, and capable of delivering real-time intelligence without the complexity of managing infrastructure.

Intended audience

This guide is for architects, developers, and technology leaders seeking to harness the power of AI-driven software agents within modern cloud-native applications.

Objectives

This guide helps you do the following:

  • Understand the AWS native services available for agentic AI solution development

  • Operationalize agentic AI with cloud-scale reliability

  • Align AI execution with business outcomes and cost models

  • Establish a framework for secure, governed AI adoption

About this content series

This guide is part of a set of AWS Prescriptive Guidance publications that provide architectural blueprints and technical guidance for building AI-driven software agents on AWS.

For more information about this content series, see Agentic AI.

The business case of serverless AI

Serverless computing provides an ideal foundation for modern AI workloads. AI applications often require intermittent, compute-intensive inference, especially in use cases such as fraud detection, recommendation engines, document summarization, and customer service automation. Traditional infrastructure models can be expensive and operationally complex when managing unpredictable or spiky workloads.

In contrast, serverless architectures offer significant advantages: they scale automatically, execute on demand, reduce operational overhead, and charge only for the resources used. These characteristics make serverless architectures well suited for embedding AI into modern cloud-native applications. AWS offers a comprehensive portfolio of services that combine serverless and AI capabilities, including Amazon Bedrock, which provides access to foundation models through a fully managed, API-based interface, and Amazon SageMaker Serverless Inference. Paired with AWS Lambda for compute and AWS Step Functions for orchestration, these services enable agile, cost-aligned, production-ready AI systems.
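As a minimal illustration of this pattern, the following sketch shows an AWS Lambda handler that calls a Bedrock foundation model through the bedrock-runtime Converse API. The model ID, prompt field, and response handling are assumptions for the example, not part of this guide; substitute a model your account has access to.

```python
import json

# Example model ID -- an assumption for this sketch; substitute any
# Amazon Bedrock model that your account has been granted access to.
MODEL_ID = "anthropic.claude-3-haiku-20240307-v1:0"

def build_request(prompt: str, max_tokens: int = 256) -> dict:
    """Build keyword arguments for the bedrock-runtime Converse API."""
    return {
        "modelId": MODEL_ID,
        "messages": [{"role": "user", "content": [{"text": prompt}]}],
        "inferenceConfig": {"maxTokens": max_tokens},
    }

def lambda_handler(event, context):
    """Invoke a foundation model on demand; the function scales to zero when idle."""
    import boto3  # imported lazily so the module also loads where boto3 is absent
    client = boto3.client("bedrock-runtime")
    response = client.converse(**build_request(event["prompt"]))
    text = response["output"]["message"]["content"][0]["text"]
    return {"statusCode": 200, "body": json.dumps({"completion": text})}
```

Because Lambda bills only for execution time and Bedrock is invoked per request, this pairing incurs no cost while the function is idle.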

AI workloads, particularly inference, are often unpredictable and bursty. In traditional architectures, this leads to overprovisioned infrastructure, increased costs, and complexity in scaling. Serverless models solve these issues by offering:

  • Elastic scalability – Resources scale automatically based on demand.

  • Cost optimization – No charges for idle compute. Pay only for execution time.

  • Reduced operational overhead – Less infrastructure to provision and manage, and fewer dependencies on other technologies, processes, or resources.

  • Faster time to market – Developers can focus on business logic and model performance instead of managing servers.

  • High availability and built-in resilience – AWS serverless offerings provide these capabilities by default.

These capabilities make serverless a natural fit for deploying AI models across a wide variety of use cases, from fraud detection and personalized recommendations to document analysis and conversational AI.
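To make the pay-per-execution point concrete, a back-of-the-envelope model follows. The per-GB-second and per-request rates are illustrative assumptions for the sketch, not quoted AWS prices; check current pricing for your Region.

```python
# Illustrative serverless cost model: pay only for execution time.
# The rates below are assumptions for this example, not quoted AWS prices.
PRICE_PER_GB_SECOND = 0.0000166667   # compute price, USD per GB-second
PRICE_PER_MILLION_REQUESTS = 0.20    # request price, USD per 1M invocations

def monthly_cost(invocations: int, avg_duration_s: float, memory_gb: float) -> float:
    """Estimate the monthly cost of a bursty serverless inference workload."""
    compute = invocations * avg_duration_s * memory_gb * PRICE_PER_GB_SECOND
    requests = invocations / 1_000_000 * PRICE_PER_MILLION_REQUESTS
    return compute + requests

# 1M inferences per month at 500 ms on a 1 GB function costs roughly $8.53,
# with no charge at all for the idle time between bursts.
print(round(monthly_cost(1_000_000, 0.5, 1.0), 2))  # -> 8.53
```

The key contrast with provisioned infrastructure is the first term: cost tracks actual execution seconds, so a workload that is idle 90 percent of the time pays nothing for that idle capacity.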

AWS services powering serverless AI

AWS provides a robust suite of managed services that help teams embed intelligence into applications, orchestrate workflows, and react to events without managing infrastructure: