Data storage and interoperability - Digital Strategies for Vaccine Distribution and Administration

Data storage and interoperability

As outlined previously, managing immunization information at scale is a major challenge. The AWS Cloud can help address this challenge with flexible, scalable storage and database services. The AWS Data Lake solution automatically crawls data sources, identifies data formats, and suggests schemas and transformations, so organizations don’t have to spend time hand-coding data flows. For example, if a user uploads a set of electronic health records (EHRs) to Amazon Simple Storage Service (Amazon S3), AWS Glue, a fully managed extract, transform, and load (ETL) service, can scan the files to identify their schema and data types. This metadata is then stored in a catalog for use in subsequent transforms and queries.
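The crawl step described above can be sketched as follows. This is a minimal illustration, not a reference implementation: the crawler name, IAM role ARN, catalog database name, and bucket path are hypothetical placeholders. The helper only assembles the parameters for the AWS Glue `CreateCrawler` API, so it runs without AWS credentials. The Glue client is passed in by the caller.

```python
from typing import Any, Dict


def build_crawler_request(
    name: str, role_arn: str, database: str, s3_path: str
) -> Dict[str, Any]:
    """Assemble keyword arguments for the AWS Glue CreateCrawler API.

    All values are supplied by the caller; nothing here is specific
    to a real account or bucket.
    """
    if not s3_path.startswith("s3://"):
        raise ValueError("s3_path must be an s3:// URI")
    return {
        "Name": name,
        "Role": role_arn,            # IAM role the crawler assumes
        "DatabaseName": database,    # catalog database that receives the tables
        "Targets": {"S3Targets": [{"Path": s3_path}]},
    }


def register_and_run_crawler(glue_client: Any, request: Dict[str, Any]) -> None:
    """Create the crawler, then start it so it infers schemas and data types."""
    glue_client.create_crawler(**request)
    glue_client.start_crawler(Name=request["Name"])


# Hypothetical values, for illustration only.
request = build_crawler_request(
    name="ehr-crawler",
    role_arn="arn:aws:iam::123456789012:role/ExampleGlueCrawlerRole",
    database="immunization_catalog",
    s3_path="s3://example-ehr-bucket/raw/",
)
```

With boto3 installed and credentials configured, calling `register_and_run_crawler(boto3.client("glue"), request)` would create and start the crawler. The tables it discovers would then appear in the named catalog database.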

AWS HealthLake, in preview as of the date of this publication, is a HIPAA-eligible service that enables healthcare providers, health insurance companies, and pharmaceutical companies to store, transform, query, and analyze health data at scale.

The AWS Lake Formation service builds on the existing data lake solution by enabling organizations to set up a secure data lake within days. Once a user defines where their data lake is located, AWS Lake Formation collects and catalogs this data, moves the data into Amazon S3 for secure access, and cleans and classifies the data using ML algorithms.

Some healthcare systems expose interfaces that are proprietary or that conform to older Health Level 7 (HL7) messaging standards. Such systems often cannot exchange data with one another, which leaves data digitally available but not generally accessible. Fast Healthcare Interoperability Resources (FHIR), an interoperability standard for the electronic exchange of healthcare information, addresses this issue. FHIR Works on AWS, an open-source project, is available for organizations to integrate with their products.
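To make the interoperability point concrete, the sketch below maps a flat, proprietary-style vaccination record to a FHIR R4 `Immunization` resource. The input field names (`patient_id`, `cvx_code`, `administered_on`, `lot_number`) are assumptions made for illustration, as are the sample values. A production mapping would also need validation against the FHIR specification.

```python
import json
from typing import Any, Dict

# Code system URI that FHIR uses for CDC CVX vaccine codes.
CVX_SYSTEM = "http://hl7.org/fhir/sid/cvx"


def to_fhir_immunization(record: Dict[str, str]) -> Dict[str, Any]:
    """Convert a flat source record (hypothetical field names) into a
    FHIR R4 Immunization resource represented as a dict."""
    resource: Dict[str, Any] = {
        "resourceType": "Immunization",
        "status": "completed",
        "vaccineCode": {
            "coding": [{"system": CVX_SYSTEM, "code": record["cvx_code"]}]
        },
        "patient": {"reference": f"Patient/{record['patient_id']}"},
        "occurrenceDateTime": record["administered_on"],
    }
    # lotNumber is optional in FHIR; include it only when the source has it.
    if record.get("lot_number"):
        resource["lotNumber"] = record["lot_number"]
    return resource


# Illustrative input only; the CVX code should be checked against the CDC table.
source_record = {
    "patient_id": "example-patient-001",
    "cvx_code": "208",
    "administered_on": "2021-03-15",
    "lot_number": "EXAMPLE-LOT-1",
}
immunization = to_fhir_immunization(source_record)
fhir_json = json.dumps(immunization, indent=2)
```

Because the output follows a published resource shape, any FHIR-aware system can read it without knowing the source system's field names.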

Amazon S3, combined with AWS Glue and AWS Lake Formation, acts as a centralized data lake for storing vaccine-related information from multiple sources with disparate data formats. FHIR provides data interoperability. Amazon DynamoDB, a key-value and document database, provides fast access to these documents by storing patient metadata.
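One way to picture the DynamoDB role described above is as a metadata index: each item records which patient a document belongs to and where the full document sits in Amazon S3. The sketch below builds such an item in DynamoDB's low-level attribute-value format. The table layout, attribute names, and sample values are hypothetical and are not taken from the architecture described here.

```python
from typing import Dict

AttributeValue = Dict[str, str]


def build_patient_index_item(
    patient_id: str,
    document_id: str,
    s3_bucket: str,
    s3_key: str,
    resource_type: str,
) -> Dict[str, AttributeValue]:
    """Build a DynamoDB item that points from patient metadata to the
    full document stored in Amazon S3.

    Assumed key design: patient_id as partition key and document_id as
    sort key, so all documents for one patient can be read in one query.
    """
    return {
        "patient_id": {"S": patient_id},        # partition key (assumed)
        "document_id": {"S": document_id},      # sort key (assumed)
        "resource_type": {"S": resource_type},  # for example, a FHIR resource type
        "s3_uri": {"S": f"s3://{s3_bucket}/{s3_key}"},
    }


# Hypothetical values, for illustration only.
item = build_patient_index_item(
    patient_id="example-patient-001",
    document_id="immunization-0001",
    s3_bucket="example-vaccine-data-lake",
    s3_key="fhir/Immunization/immunization-0001.json",
    resource_type="Immunization",
)
```

With boto3, an item in this format could be written with `boto3.client("dynamodb").put_item(TableName="ExamplePatientDocuments", Item=item)`, where the table name is again a placeholder. Keeping only metadata in DynamoDB keeps items small, while the large documents stay in the data lake.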