Overview - AI-Powered Health Data Masking


Healthcare organizations generate large amounts of health data, such as medical images and patient information, and send that data to different applications, including population health management and electronic health records. The challenge medical professionals and developers face is using medical information in applications while meeting their compliance obligations for health data, such as protected health information (PHI).

Currently, there are multiple methods to mask data, and each organization has its own approach based on internal risk assessments. AWS recommends that you consult risk assessment specialists for your organization’s specific implementation process.

The AI-Powered Health Data Masking solution helps customers identify and mask health data in images or text. This solution uses Amazon Comprehend Medical to detect health data in a body of text, Amazon Rekognition to identify text in an image, Amazon API Gateway and AWS Lambda to provide an API interface for this functionality, and AWS Identity and Access Management (IAM) to authorize API requests.
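As an illustration of the text-masking step described above, the following is a minimal sketch and not the solution's actual implementation. It assumes detected entities arrive in the shape returned by the Amazon Comprehend Medical `DetectPHI` operation (a list of entities with `BeginOffset` and `EndOffset` character offsets). The function names `mask_text` and `detect_and_mask`, the `#` mask character, and the default Region are hypothetical choices made for this example.

```python
from typing import Iterable, Mapping


def mask_text(text: str, entities: Iterable[Mapping], mask_char: str = "#") -> str:
    """Replace each detected span with mask characters of the same length.

    `entities` is assumed to follow the DetectPHI response shape, where each
    entity carries integer `BeginOffset` and `EndOffset` character offsets.
    """
    chars = list(text)
    for entity in entities:
        # Clamp offsets so a malformed entity cannot grow or corrupt the text.
        begin = max(0, entity["BeginOffset"])
        end = min(len(chars), entity["EndOffset"])
        if begin < end:
            chars[begin:end] = mask_char * (end - begin)
    return "".join(chars)


def detect_and_mask(text: str, region: str = "us-east-1") -> str:
    """Hypothetical helper: call Comprehend Medical, then mask the result.

    Requires boto3 and AWS credentials; imported lazily so that mask_text
    can be used and tested without any AWS dependency.
    """
    import boto3

    client = boto3.client("comprehendmedical", region_name=region)
    response = client.detect_phi(Text=text)
    return mask_text(text, response["Entities"])


if __name__ == "__main__":
    sample = "Patient John Doe, age 45"
    # Hand-written entities standing in for a DetectPHI response.
    fake_entities = [
        {"BeginOffset": 8, "EndOffset": 16, "Type": "NAME"},
        {"BeginOffset": 22, "EndOffset": 24, "Type": "AGE"},
    ]
    print(mask_text(sample, fake_entities))  # Patient ########, age ##
```

Masking with same-length runs keeps every other character at its original offset, which makes the output easier to compare against the input. Whether that, a fixed-width placeholder, or a type label is appropriate depends on your own risk assessment.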

This solution was designed to be used as part of a set of mitigating controls in your environment, and does not guarantee alignment to any regulatory framework.


If subject to HIPAA, you must have an AWS Business Associate Addendum (BAA) in place, and follow its configuration requirements, before running protected health information (PHI) workloads on AWS. You should not use your AWS account in connection with PHI until you have accepted the AWS BAA and configured your AWS account(s) as required by the AWS BAA. Under HIPAA regulations, covered entities and business associates are responsible for putting in place a business associate agreement between themselves and each of their business associates. You are solely responsible for determining whether you and your organization need a business associate agreement with AWS. If you determine you need a business associate agreement with AWS, you can accept the AWS BAA through a self-service portal in AWS Artifact. It is your responsibility to obtain a BAA from AWS. For more information about the AWS BAA, please visit the AWS HIPAA Compliance webpage.

This solution does not address state-specific laws that may apply to you. This solution only addresses requirements set forth under HIPAA, a U.S. federal law. Many individual states have adopted rules that are different from and, in some cases, stricter than those that are federally mandated under HIPAA.

This solution will not, by itself, make you HIPAA-compliant. The information contained in this solution package is not exhaustive, and must be reviewed, evaluated, assessed, and approved by you in connection with your organization’s particular security features, tools, and configurations. It is the sole responsibility of you and your organization to determine which HIPAA regulatory requirements are applicable to you, and to ensure that you comply with those applicable requirements. Importantly, most of the requirements under HIPAA are not technical but administrative (that is, people- and process-oriented).

Note that it is your responsibility to ensure the outputs generated by this solution comply with any legal or other requirements applicable to your organization.


You are responsible for the cost of the AWS services used while running this solution. As of the date of publication, the cost for running this solution with default settings in the US East (N. Virginia) Region is approximately $0.015 per text and $0.01 per image for masking health data. After the model is trained, the cost to process data from the example dataset is less than $0.01 per hour. Pricing assumes a 1,000-character text document and an image with 500 characters of text returned by Amazon Rekognition.

Prices are subject to change, and your costs may be lower if you qualify for the AWS Lambda or Amazon Comprehend Medical free tiers. For full details, see the pricing webpage for each AWS service used in this solution.
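To get a rough sense of scale, the approximate per-item figures quoted above can be turned into a simple estimate. This is a sketch only: it assumes the published approximations ($0.015 per text, $0.01 per image) and the stated document sizes, and it ignores free tiers, price changes, and any other charges. The function name `estimate_masking_cost` is hypothetical.

```python
from decimal import Decimal

# Approximate per-item costs quoted in this overview (US East, default settings).
COST_PER_TEXT = Decimal("0.015")   # assumes a 1,000-character text document
COST_PER_IMAGE = Decimal("0.01")   # assumes 500 characters of detected text


def estimate_masking_cost(num_texts: int, num_images: int) -> Decimal:
    """Return an approximate cost in USD for masking the given item counts."""
    if num_texts < 0 or num_images < 0:
        raise ValueError("item counts must be non-negative")
    return num_texts * COST_PER_TEXT + num_images * COST_PER_IMAGE


if __name__ == "__main__":
    # 10,000 texts and 2,000 images: 150 + 20, or about 170 USD.
    print(estimate_masking_cost(10_000, 2_000))
```

`Decimal` is used rather than floating point so that small per-item amounts add up exactly. Treat the result as an order-of-magnitude guide and check the current pricing pages for actual figures.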