Integrating a traditional cloud workload with Amazon Bedrock - AWS Prescriptive Guidance

Integrating a traditional cloud workload with Amazon Bedrock

The scope of this use case is to demonstrate a traditional cloud workload that is integrated with Amazon Bedrock to take advantage of generative AI capabilities. The following diagram illustrates the Generative AI account in conjunction with an example application account. 

Integrating a traditional cloud workload with Amazon Bedrock

The Generative AI account is dedicated to providing generative AI functionality by using Amazon Bedrock. The Application account is an example sample workload. The AWS services that you use in this account depend on your requirements. Interactions between the Generative AI account and the Application account use the Amazon Bedrock APIs. 

The Application account is separated from the Generative AI account to help group workloads based on business purposes and ownership. This helps constrain access to sensitive data in the generative AI environment and supports the application of distinct security controls by environment. Keeping the traditional cloud workload in a separate account also helps limit the scope of impact of adverse events

You can build and scale enterprise generative AI applications around various use cases that are supported by Amazon Bedrock. Some common use cases are text generation, virtual assistance, text and image search, text summarization, and image generation. Depending on your use case, your application component interacts with one or more Amazon Bedrock capabilities such as knowledge bases and agents. 

Application account

The Application account hosts the primary infrastructure and services to run and maintain an enterprise application. In this context, the Application account acts as the traditional cloud workload, which interacts with the Amazon Bedrock managed service in the Generative AI account. See the Workload OU Application account section for general security best practices for securing this account. 

Standard application security best practices apply as in other applications. If you plan to use  retrieval augmented generation (RAG), where the application queries relevant information from a knowledge base such as a vector database by using a text prompt from the user, the application needs to propagate the identity of the user to the knowledge base, and the knowledge base enforces your role-based or attribute-based access controls.

Another design pattern for generative AI applications is to use agents to orchestrate interactions between a foundation model (FM), data sources, knowledge bases, and software applications. The agents call APIs to take actions on behalf of the user who is interacting with the model. The most important mechanism to get right is to make sure that every agent propagates the identity of the application user to the systems that it interacts with. You must also ensure that each system (data source, application, and so on) understands the user identity, limits its responses to actions that the user is authorized to perform, and responds with data that the user is authorized to access.

It's also important to limit direct access to the pre-trained model's inference endpoints that were used to generate inferences. You want to restrict access to the inference endpoints to control costs and monitor activity. If your inference endpoints are hosted on AWS, such as with Amazon Bedrock base models, you can use IAM to control permissions to invoke inference actions. 

If your AI application is available to users as a web application, you should protect your infrastructure by using controls such as web application firewalls. Traditional cyber threats such as SQL injections and request floods might be possible against your application. Because invocations of your application cause invocations of the model inference APIs, and model inference API calls are usually chargeable, it's important to mitigate flooding to minimize unexpected charges from your FM provider. Web application firewalls don't protect against prompt injection threats, because these threats are in the form of natural language text. Firewalls match code (for example, HTML, SQL, or regular expressions) in places where it's unexpected (text, documents, and so on). To help protect against prompt injection attacks and ensure model safety, use guardrails

Logging and monitoring inference in generative AI models is crucial for maintaining security and preventing misuse. It enables the identification of potential threat actors, malicious activities, or unauthorized access, and helps enable timely intervention and mitigation of risks that are associated with the deployment of these powerful models.

Generative AI account

Depending on the use case, the Generative AI account hosts all generative AI activities. These include, but aren't limited to, model invocation, RAG, agents and tools, and model customization. See the previous sections that discuss specific use cases to see which features and implementation are necessary for your workload. 

The architectures presented in this guide offer a comprehensive framework for organizations that use AWS services to take advantage of generative AI capabilities securely and efficiently. These architectures combine the fully managed functionality of Amazon Bedrock with security best practices to provide a solid foundation for integrating generative AI into traditional cloud workloads and organizational processes. The specific use cases covered, including providing generative AI FMs, RAG, agents, and model customization, address a wide range of potential applications and scenarios. This guidance equips organizations with the necessary understanding of AWS Bedrock services and their inherent and configurable security controls, enabling them to make informed decisions tailored to their unique infrastructure, applications, and security requirements.