GENSEC01-BP01 Grant least privilege access to foundation model endpoints

Granting least privilege access to foundation model endpoints helps limit unintended access and encourages a zero-trust security framework. This best practice describes how to secure foundation model endpoints associated with generative AI workloads.

Desired outcome: When implemented, this best practice reduces the risk of unauthorized access to a foundation model endpoint and helps create a process to verify continuous adherence to the least-privilege principle.

Benefits of establishing this best practice:

Implement a strong identity foundation - Least privilege access permissions restrict access to foundation model endpoints to authorized identities only.
Apply security at all layers - Least privilege access permissions on endpoints provide an identity-based layer of security, regardless of the hosting paradigm.

Level of risk exposed if this best practice is not established: High

Implementation guidance

Least privilege access establishes an identity-based layer of security for generative AI workloads. It helps verify that access to foundation model endpoints is granted to authorized identities only, and that the data each identity receives stays within the authorization boundary of its role in the organization.

Amazon Bedrock, the Amazon Q family of applications, and Amazon SageMaker AI expose endpoint APIs. Client applications can access these APIs directly through SDKs, open source frameworks, or custom abstraction layers. You can use AWS Identity and Access Management (IAM) to limit access to foundation model endpoints to specific IAM roles. Grant these roles least privilege access, and use session durations and permissions boundaries to further control access. You can also establish AWS PrivateLink connections from customer VPCs to Amazon generative AI services to further secure communication.

Additionally, model access can be controlled at the organization layer through other policy types such as service control policies (SCPs), resource control policies (RCPs), session policies, and permissions boundaries. These policy types can block or restrict models your organization has not approved, restrict services by account, Region, or organization, and set the maximum permissions allowed for IAM users. Some services, such as Amazon Q Developer, manage access through a subscription model. When provisioning subscription-level access to a generative AI service, confirm that the user needs that access and that the subscription level matches the required level of access to the service. Identity-based permissions and subscription-based service access can be managed through single sign-on (SSO) to integrate with your enterprise identity provider.
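
As an illustration of the organization-layer controls described above, the following Python sketch assembles an SCP-style document that denies invocation of a list of unapproved models and denies Amazon Bedrock calls outside a set of allowed Regions. It is a minimal sketch, not a definitive implementation: the function name `build_bedrock_scp`, the model ARN pattern, and the Region are hypothetical placeholders, and the action names should be checked against the current Amazon Bedrock service authorization reference before use.

```python
import json


def build_bedrock_scp(blocked_model_arns, allowed_regions):
    """Return an SCP-style policy document as a dict.

    Hypothetical helper for illustration only. The first statement denies
    invoking the listed (unapproved) models; the second denies any Amazon
    Bedrock action requested outside the allowed Regions.
    """
    invoke_actions = [
        "bedrock:InvokeModel",
        "bedrock:InvokeModelWithResponseStream",
    ]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyUnapprovedModels",
                "Effect": "Deny",
                "Action": invoke_actions,
                "Resource": sorted(blocked_model_arns),
            },
            {
                "Sid": "DenyBedrockOutsideAllowedRegions",
                "Effect": "Deny",
                "Action": "bedrock:*",
                "Resource": "*",
                "Condition": {
                    "StringNotEquals": {
                        "aws:RequestedRegion": sorted(allowed_regions)
                    }
                },
            },
        ],
    }


if __name__ == "__main__":
    # Placeholder values: replace with the models and Regions your
    # organization has actually reviewed.
    scp = build_bedrock_scp(
        blocked_model_arns=[
            "arn:aws:bedrock:*::foundation-model/example-provider.*"
        ],
        allowed_regions=["us-east-1"],
    )
    print(json.dumps(scp, indent=2))
```

Because an SCP only limits what identity-based policies can grant, a document like this complements the role-level policies rather than replacing them.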

Implementation steps

Create a custom policy document granting least-privilege access to a set of specific foundation model endpoints.
- Limit access to specific resource ARNs and to a specific set of actions.
- Consider defining conditions to further restrict the allowable traffic, such as requests coming from a specific VPC.
Create an IAM role to be used by users or services to access the endpoint and attach the custom policy to it. If this role needs more permissions, attach the required policies on an as-needed basis.
- Use permissions boundaries at the role level to set the maximum permissions that an identity-based policy can grant.
- Add conditions to the role's trust policy to further limit who can assume the role.
Verify that API calls made to the endpoints with the new role are governed by this policy.
- An example of an endpoint to protect might be a production Amazon Bedrock endpoint serving real-time inference through a VPC-hosted application.
For a subscription-based generative AI application such as Amazon Q Developer, provision subscription-level access matching the subscriber's business needs.
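
The first two steps above can be sketched in Python as a pair of policy documents: an identity-based policy limited to specific model ARNs, specific actions, and a source VPC, and a trust policy that conditions who can assume the role. This is a hedged example under stated assumptions. The helper names, the model ARN, the account ID, the organization ID, and the VPC ID are all hypothetical placeholders, and the `aws:SourceVpc` condition key is only present on requests that arrive through a VPC endpoint.

```python
import json


def build_invoke_policy(model_arns, vpc_id):
    """Identity-based policy: allow invocation of specific models only,
    and only for requests that arrive from the given VPC.

    Hypothetical helper for illustration only.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "InvokeApprovedModelsFromVpc",
                "Effect": "Allow",
                "Action": [
                    "bedrock:InvokeModel",
                    "bedrock:InvokeModelWithResponseStream",
                ],
                "Resource": list(model_arns),
                # aws:SourceVpc is populated only when the request
                # travels through a VPC endpoint.
                "Condition": {"StringEquals": {"aws:SourceVpc": vpc_id}},
            }
        ],
    }


def build_trust_policy(principal_arn, org_id):
    """Trust policy: let one principal assume the role, and only when it
    belongs to the expected organization (an assumed extra condition)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": principal_arn},
                "Action": "sts:AssumeRole",
                "Condition": {
                    "StringEquals": {"aws:PrincipalOrgID": org_id}
                },
            }
        ],
    }


if __name__ == "__main__":
    # All identifiers below are placeholders, not real resources.
    policy = build_invoke_policy(
        model_arns=[
            "arn:aws:bedrock:us-east-1::foundation-model/"
            "example-provider.example-model-v1"
        ],
        vpc_id="vpc-0123456789abcdef0",
    )
    trust = build_trust_policy(
        principal_arn="arn:aws:iam::111122223333:role/ExampleAppRole",
        org_id="o-exampleorgid",
    )
    print(json.dumps(policy, indent=2))
    print(json.dumps(trust, indent=2))
```

With boto3, documents like these are typically serialized with `json.dumps` and passed to `iam.create_role` (as `AssumeRolePolicyDocument`, optionally with `MaxSessionDuration` and `PermissionsBoundary`) and to `iam.put_role_policy` (as `PolicyDocument`). Keeping the policy builders free of AWS calls makes them straightforward to review and unit test before anything is deployed.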

