Security perspective: Compliance and assurance of AI systems - AWS Cloud Adoption Framework for Artificial Intelligence, Machine Learning, and Generative AI

Security perspective: Compliance and assurance of AI systems

Security is top priority at AWS, and all customers, regardless of size, benefit from AWS’s ongoing investment in our secure infrastructure and new offerings. For customers developing AI AWS workloads, security is an integral part of the overall AWS solution. Generative AI is a key enabler in scaling Foundation Models for realizing business outcomes and there are multiple ways to create a Generative AI workload. Integrating security and privacy in all aspects of AI is critical for the overall success of business outcomes. The underlying business case of using AI is to solve specific business problems that can range from simple automation of routine productivity tasks to complex healthcare or financial decisions containing sensitive data. Apply risk management techniques to implement security and privacy capabilities defined in this perspective to meet your business needs.

Foundational Capability Explanation
Vulnerability Management Continuously identify, classify, remediate, and mitigate AI vulnerabilities
Security Governance Establish security policies, standards and guidelines along with roles and responsibilities related to AI workloads
Security Assurance Apply, evaluate, and validate security and privacy measures against regulatory and compliance requirements for AI workloads
Threat Detection Detect and mitigate potential AI-related security threats or unexpected behaviors in AI workloads
Infrastructure Protection Secure the systems and services used to operate AI workloads
Data Protection Maintain visibility, secure access and control over data used for AI development and use
Application Security Detect and mitigate vulnerabilities during the software development lifecycle process of AI workloads
Identity and Access Management (IAM) This capability is not enriched for AI, refer to the AWS CAF.
Incident Response This capability is not enriched for AI, refer to the AWS CAF.

Vulnerability management

Continuously identify, classify, remediate, and mitigate AI vulnerabilities.

AI systems may have technology -specific vulnerabilities that you should be aware of; for example, prompt injection, data poisoning, and model inversion vulnerabilities. The three critical components of any AI system are its inputs, model, and outputs. These components can be protected with the following best practices to mitigate potential vulnerabilities for your workloads.

  • The Input vulnerability relates to all of the data that has an entry point to your model. This input can be a source of targeted model and distribution drift, where a threat actor attempts to influence decisions over time, or purposefully introduces a hidden bias or sensitivity to certain data. Harden these inputs through data quality automation and continuous monitoring. Model misuse is an example of a vulnerability that results from prompt injection in AI solutions since data and instructions cloud be interlaced with each other and it’s worth paying special attention to the quickly evolving field of jailbreaking foundation models. Perform input validation to segregate data from instructions and use least privilege principles by limiting access of Large Language Models (LLMs) to specific authorizations. Avoid access to system commands, executable files, and log actions that have widespread operational impact.

  • The Model vulnerability relates to exploiting misrepresentations of the real world or the seen data in the model. Harden your model by mitigating known documented threats using threat modeling. While using commercial generative AI models, review their sources of data, terms of use for model fine-tuning, and vulnerabilities that could impact you from the model itself or from the use of third-party libraries. Validate that model goals and their results are monitored and that they remain consistent over time to avoid model drift.

  • The Output vulnerability relates to interacting with the system over a long period of time, which can allow critical information to be inferred about the inputs and properties of your model, often called data leakage. For generative AI, validate that its output is sanitized and not consumed directly to mitigate cross-site vulnerabilities and remote execution. These are just a few of the vulnerabilities that you need to consider for your workloads. While not every AI system exposes these vulnerabilities, be vigilant about risks that apply to your specific workload. Perform regular testing, Game Days, and table top exercises to validate remediation prescribed by playbooks.

Security governance

Establish security policies, standards, and guidelines along with roles and responsibilities related to AI workloads.

Validate that policies are clearly defined for use of commercial or open-source models hosted internally and externally. Similarly, for commercial generative AI model usage, consider the risks from your organization’s sensitive data leakage to the commercial model’s platform (see Data protection capability). Understand the assets, security risks, and compliance requirements associated with AI that apply to your industry or organization to help you prioritize your security efforts. Allocate sufficient security resources to identified roles and provide visibility.

Risks associated with AI can have far-reaching consequences, including privacy breaches, data manipulation, abuse, and compromised decision-making. Implementing robust encryption, multi-factor authentication, continuous monitoring and alignment to risk tolerance and frameworks (for example, NIST AI RMF) can be essential to safeguard the integrity and confidentiality of an AI environment.

Provide ongoing direction and advice for the three critical components for your workloads:

  • The Input – Establish who can approve data sources and the use of AI. Consider in the approval process data aspects such as data classification or sensitivity, existence of regulated data within the data sets, data provenance, data obsolescence, or the lawful right to process the data. To manage risks, evaluate the mechanisms used to source input data considering factors such as the reputation of the source, the manner in which it was received, and how it is being stored or secured. Validate that the data classification of the source data is in alignment with the solutions’ classification, such as not allowing confidential data to be processed on a public AI solution.

  • The Model – Establish roles and responsibilities for creating and training models. Establish roles associated with an author, approver, publisher approach to model release. To manage risks, evaluate the model training mechanisms, including the tools and individuals involved, to avoid an intentional or unintentional introduction of vulnerabilities. Evaluate the model’s architecture for vulnerabilities that influence the output. Enable failure modes of any model to fail to a closed or secure state to avoid data exposure.

  • The Output – Establish lifecycle management of created outputs. Establish classification criteria, paying close attention to the outcome of datasets that may have been potentially disparate datasets or of dissimilar data classifications. To manage risks, determine appropriate protection and retention controls, classify your data based on criticality and sensitivity, such as Personally Identifiable Information (PII), and define appropriate access controls. Define data protection controls and lifecycle management policies. Establish robust data sharing protocols with privacy regulations and other compliance alignment.

Security assurance

Apply, evaluate, and validate security and privacy measures against regulatory and compliance requirements for AI workloads.

Your organization, and the customers you serve, need to have trust and confidence in the controls that you have implemented. As your customers’ and users’ awareness and sensitivity for AI-related risks and potential misuse increases, so do their expectations that a high security bar is met. Design, develop, deploy and monitor solutions in a manner that prioritizes cybersecurity, meets regulatory requirements, and effectively and efficiently manages security risks that are specific to AI and are in line with your business objectives and risk tolerance. Meticulously monitoring, providing transparency and collaboration between legal experts, compliance professionals, data scientists and information technology professionals will help validate a holistic approach to assurance. Implementing testing procedures and remediation processes can enable a proactive approach to assurance. Continuously monitor and evaluate for the three critical components for your workloads:

  • The Input – Because models often require vast amounts of data for training and analysis, you need to validate the type of data ingested is aligned to the model’s goals and outcomes. Establish auditmechanisms to understand adherence to the established control framework.

  • The Model – Certify that the users understand what is acceptable usage of AI in alignment with organizational policies. Implement policies and controls to validate that the organization understands where it is appropriate to use AI and where it is not. Establish audit mechanisms to identify how the model is using data and where AI capabilities are in use within the organization.

  • The Output – Establish acceptable usage criteria for the output paying attention to where the data may be reused or reintroduced to additional AI models. Establish discoveryor audit mechanisms to review output data to validate that generated data cannot be used to infer or recreate sensitive or regulated data. Create mechanisms for validating the authenticity and origin of the output where trustworthiness is paramount, such as medical diagnoses.

Preserving individual privacy necessitates strict adherence to ethical and legal guidelines to prevent unauthorized access, misuse, or disclosure of the data. Balancing the potential of AI and respecting privacy rights fosters public trust and realizes the benefits of these capabilities. See MLSEC-05: Protect sensitive data privacy in the Well-Architected Framework for safeguard information. Establish transparency and mechanisms such as informed consent. Limit data retention to only what is necessary for functionality and implement data sharing agreements. Again, consider the privacy requirements associated with the three critical components of your workloads:

  • The Input – Validate that you understand how data that is subject to privacy related regulations (for example - GDPR, CCPA, COPPA, PDPA) could be used and that legal basis for processing the data exists. Consider data residency and where data is stored or processed. Establish Privacy Impact Assessments (PIA) or similar processes for each use of regulated data.

  • The Model – As the model is trained or tuned, consider whether or not the legal basis for processing data exists and that transparency for the data subject can be demonstrated. Establish Privacy Impact Assessments or similar processes related to potential leakage from the models.

  • The Output – Consider whether regulated data is being used to train additional models and whether restricted secondary usages of personal data limitations apply. Establish mechanisms to accomplish right to erasure or right to be forgotten type requests. Establish discovery or audit mechanisms to review output data to validate that generated data cannot be used to infer or recreate previously de-identified data.

Threat detection

Detect and mitigate potential security threats or unexpected behaviors in AI workloads.

To improve the protection of the three critical components of any ML or generative AI system (its inputs, model, and outputs) use the following best practices to detect and mitigate threats to your workloads:

  • The Input – Detection of threats for AI solutions is critical to mitigate vulnerabilities that could impact your business. Sanitize input data to detect threats at the onset of model usage. Continue to track input data for user sessions to detect and mitigate threats that could impact availability and misuse.

  • The Model – Conduct threat modeling specific to the AI system and threat huntingexercises to detect and mitigate potential threats. Update threat models and monitoring to include AI threat concepts that include training models with unexpected user inputs, poisoning of data sets used for content or training, privacy breaches, and data tampering. Correlate input data and data used by the model to detect anomalous or malicious activity.

  • The Output – Monitor for output anomalies that deviate from model goals, and enable checks for detecting sensitive data in model outputs. Build a threat catalog that includes identified applicable known threats that apply to your workloads. Create automated tests to validate detection capabilities and the integration of threat intelligence to increase efficacy and to reduce false positives. Consider usage of threat intelligence to increase efficacy and reduce false positives.

Infrastructure protection

Secure the systems and services used to operate AI workloads.

MLOps uses DevOps practices for AI workloads and security needs to be applied to the infrastructure that makes up the overall environment. Use secure endpoints for your AI model and Amazon API Gateway for rate-limiting model access. Use API security best practices for all internal and external APIs used and create an explicit allow-list of API calls from models outside of its own VPC.Begin with security capabilities as prescribed by the Security Reference Architecture and apply network, compute, and storage security controls based on your environment.

Models are distributed over multiple environments across networks and servers. Communication between these environments should be protected using encryption in transit. Use centralized configuration of development and production environments and apply preventive and detective guardrails that are managed independently by security administrators. Isolate development environments for sensitive tasks such as model training. Validate that there is session isolation for end users to preserve experience integrity and prevent unintended data disclosure. Log output responses and related session data to Write Once Read Many (WORM) storage devices for compliance and troubleshooting purposes. Consider using a model bug-bounty program to uncover and mitigate edge use cases that could cause a security issue.

Data protection

Maintain visibility, secure access, and control over data used for AI development and use.

Data protection is critical throughout the AI development lifecycle, and where data protection policies defined by security governance are operationalized, like MLSEC-07: Keep only relevant data in the Machine Learning Lens of the Well-Architected Framework. If using commercial models for generative AI development, be aware that direct use of data as input to the model could disclose sensitive information. Likewise, letting your proprietary or self-hosted models access protected data can open the door for data-related privilege escalations. Evaluate model usage and service terms accordingly.Data collected for model development during the pre-training and fine-tuning phases of model development should be secured in transit, at rest, and in use. Consider using a data tokenization process to replace sensitive data with non-sensitive data tokens as part of a data preprocessing phase that includes cleaning, normalization, and transformation. Create verifiable mechanisms for all sources of data used by the models, especially inference data that is used to train the models. Monitor and create alerts for sensitive data or data that could result in sensitivity class escalation. Employ data activity monitoring techniques to detect access patterns by usage, frequency, and so on. Avoid using sensitive data to train the models, as that could cause unintended disclosure of data from the model output (for example, through data leakage during inference). Tag and label data that is used for training in all the different environments, and align data tags and labels to data classification policies and standards. Validate that data lineage and data access in non-production and development regions is controlled to prevent data manipulation that would introduce vulnerabilities in the model. Consider using CI/CD pipelines to promote data to testing and production environments to preserve integrity. Log and mask sensitive data while creating an audit trail for data access. Implement data loss prevention techniques on sensitive data stores and on data stores that are, by design, not supposed to store data of specified data classes (for example, Confidential), and monitor for unintended disclosure of sensitive data. Validate data quality for model outputs to enable trust and avoid hallucinations. Monitor sensitivity levels of model data output and trigger re-classification by redaction or quarantined response if the sensitivity levels rise. For example, if new input data sets are used by the model or used to train the model, validate that the output data conforms to the existing sensitivity level.

Application security

Detect and mitigate vulnerabilities during the software development lifecycle process of AI workloads.

Verify that model developers execute prompt testing and other security test cases locally in their environment and also in the CI/CD pipelines to validate model usage. Create and maintain test case libraries to validate coverage and to enable automation. Leverage data and model pipelines that are integrated with security scans across all development, test, and production environments and store all of your model artifacts in secure repositories . Maintain an inventory of AI models and assign model instances with specifically identified technical and business owners. Validate that known good trained models are backed up. Retain points-in-time recovery so that compromised models can return to a known good state. Protect access to model and data backups to validate that they are not compromised, and test model recovery periodically to enable full recovery to a known good state. Track data related to the model and data development including parameters, metadata, and so on for provenance in order to support the validity of the output results. Create and use operational runbooks and test roll-back mechanisms independently for data sets and models that can be executed in case of operational or security incidents to provide resilience.