Design principles - AWS Well-Architected

Design principles

The following design principles apply to generative AI workloads created on AWS:

  • Design for controlled autonomy: Implement comprehensive guardrails and boundaries that govern how AI systems operate, scale, and interact. By establishing clear operational requirements, security controls, and failure conditions, you can keep AI systems within safe, efficient, and cost-effective parameters while maintaining reliability. This principle addresses security, cost optimization, and reliability concerns for autonomous AI operations.

  • Implement comprehensive observability: Monitor and measure every relevant aspect of your generative AI system, from security and performance to cost and environmental impact. By collecting metrics across every layer, including user feedback, model behavior, resource utilization, and security events, you can maintain operational excellence while optimizing system behavior. This holistic approach enables data-driven decisions about system improvements and rapid problem resolution.

  • Optimize resource efficiency: Select and configure AI components based on empirical requirements rather than assumptions. By right-sizing models, optimizing data operations, and implementing dynamic scaling, you can balance performance needs with cost and sustainability goals. This principle helps you achieve efficient resource utilization while maintaining necessary capabilities and reducing environmental impact.

  • Establish distributed resilience: Design systems that remain operational despite component or regional failures. By implementing redundancy, automated recovery mechanisms, and geographic distribution of resources, you can maintain consistent service delivery while managing costs and performance. This helps you achieve reliability while supporting efficient global operations.

  • Standardize resource management: Maintain centralized catalogs and controls for critical components like prompts, models, and access permissions. By implementing structured management systems, you can enforce security, govern resource usage, enable version control, and optimize costs while sustaining operational excellence.

  • Secure interaction boundaries: Protect and control data flows and system interfaces. By implementing least-privilege access, secure communications, input/output sanitization, and comprehensive monitoring, you can maintain system security while achieving reliable and efficient operations. This principle addresses security requirements while supporting overall system integrity.
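
As one illustration of the first principle, controlled autonomy, the following is a minimal sketch of a boundary that stops an autonomous run once it exceeds a step or cost limit. Everything in it is hypothetical and chosen for illustration: `AutonomyBudget`, `BudgetExceeded`, and `run_agent` are not AWS APIs, and the per-step costs are placeholder numbers rather than real pricing. A production workload would more likely enforce these limits through managed controls and monitoring than through in-process counters.

```python
from dataclasses import dataclass


class BudgetExceeded(Exception):
    """Raised when an autonomous run crosses a configured boundary."""


@dataclass
class AutonomyBudget:
    """Hypothetical guardrail: caps the steps and spend of one agent run."""
    max_steps: int          # upper bound on agent iterations
    max_cost_usd: float     # upper bound on cumulative spend
    steps: int = 0
    cost_usd: float = 0.0

    def charge(self, step_cost_usd: float) -> None:
        """Record one step and fail closed if either limit is crossed."""
        self.steps += 1
        self.cost_usd += step_cost_usd
        if self.steps > self.max_steps:
            raise BudgetExceeded(f"step limit {self.max_steps} exceeded")
        if self.cost_usd > self.max_cost_usd:
            raise BudgetExceeded(f"cost limit ${self.max_cost_usd:.2f} exceeded")


def run_agent(step_costs, budget):
    """Simulate agent steps (one placeholder cost per step) under a budget.

    Returns the number of steps that completed and a status message.
    """
    completed = 0
    try:
        for cost in step_costs:
            budget.charge(cost)   # check the boundary before doing the work
            completed += 1        # stand-in for the actual model or tool call
    except BudgetExceeded as exc:
        return completed, str(exc)
    return completed, "ok"


if __name__ == "__main__":
    budget = AutonomyBudget(max_steps=3, max_cost_usd=1.0)
    print(run_agent([0.01] * 10, budget))
```

The check runs before each step's work, so a run that hits a limit stops without performing the step that would have crossed it. The explicit failure message gives operators a clear signal to log and alert on, which ties this boundary to the observability principle.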