Best practice 3.3 – Understand data classifications and their protection policies
Data classification in your organization is key to determining how data must be protected at rest and in transit. For example, because an analytics workload necessarily copies and shares data between operations and systems, we recommend controlling access to data according to its classification. Such a data protection strategy helps prevent data loss, theft, and corruption, and helps minimize the impact of malicious activity or unintended access.
Suggestion 3.3.1 – Identify classification levels
Use the Data Classification whitepaper to help you identify different classification levels. Four common levels are restricted, confidential, internal, and public. However, these levels can vary based on the industry and compliance requirements of your organization.
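As an illustration, the four common levels can be modeled as an ordered type so that rules can compare sensitivity. This is a minimal sketch, not a prescribed scheme: the `can_read` and `dataset_level` helpers, and the rule that a dataset inherits the level of its most sensitive column, are assumptions made for the example.

```python
from enum import IntEnum


class Classification(IntEnum):
    """Illustrative levels, ordered from least to most sensitive."""
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2
    RESTRICTED = 3


def can_read(clearance: Classification, data_level: Classification) -> bool:
    """Return True when the caller's clearance covers the data's level."""
    return clearance >= data_level


def dataset_level(column_levels: dict[str, Classification]) -> Classification:
    """Assumed rule: a dataset is as sensitive as its most sensitive column."""
    return max(column_levels.values(), default=Classification.PUBLIC)


# Hypothetical column classifications for an orders table.
columns = {
    "order_id": Classification.INTERNAL,
    "email": Classification.CONFIDENTIAL,
}
level = dataset_level(columns)
```

Your organization might use different level names or a different number of levels; only the ordering matters to the comparison.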
Suggestion 3.3.2 – Define access rules
The data owners should define the data access rules based on the sensitivity and criticality of the data. For example, with AWS Lake Formation, you can define and enforce access controls that operate at the table, column, row, and cell level for all the users that access your data lake.
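As an illustration of a column-level rule, the following sketch builds the arguments for a Lake Formation `grant_permissions` call that gives a principal SELECT access to selected columns only. The role ARN, database, table, and column names are hypothetical, and the `column_grant_request` helper is an assumption of this example rather than part of any AWS SDK.

```python
def column_grant_request(principal_arn, database, table, columns):
    """Build keyword arguments for a column-level SELECT grant."""
    if not columns:
        raise ValueError("list at least one column to grant")
    return {
        "Principal": {"DataLakePrincipalIdentifier": principal_arn},
        "Resource": {
            "TableWithColumns": {
                "DatabaseName": database,
                "Name": table,
                "ColumnNames": list(columns),
            }
        },
        "Permissions": ["SELECT"],
    }


# Hypothetical principal, database, table, and columns.
request = column_grant_request(
    "arn:aws:iam::111122223333:role/analyst",
    "sales_db",
    "orders",
    ["order_id", "order_date", "total"],
)

# With boto3 installed and credentials configured, the grant could be sent as:
#   import boto3
#   boto3.client("lakeformation").grant_permissions(**request)
```

Keeping the request construction separate from the API call lets data owners review or test the rule before it is applied.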
For more details, refer to the following information:
- AWS Security Blog: How to scale your authorization needs by using attribute-based access control with S3.
- AWS Big Data Blog: Create a secure data lake by masking, encrypting data, and enabling fine-grained access with AWS Lake Formation.
- AWS Big Data Blog: Control data access and permissions with AWS Lake Formation and Amazon EMR.
- AWS Big Data Blog: Enforce column-level authorization with QuickSight and AWS Lake Formation.
Suggestion 3.3.3 – Identify security zone models to isolate data based on classification
Design the security zone models from the AWS account level down to the AWS resource level. For example, consider a multi-account model that isolates different classes of data at the AWS account level. You can also separate development and test resources from production resources, either at the account level or at the resource level.
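One way to reason about account-level isolation is to check that a resource lives in the account assigned to its classification. The sketch below assumes a one-account-per-classification model. The account IDs, the `ZONE_ACCOUNTS` mapping, and the helper names are hypothetical. It reads the account ID field from the ARN, so it does not apply to ARNs that omit the account ID, such as Amazon S3 bucket ARNs.

```python
# Hypothetical mapping from classification level to the AWS account hosting it.
ZONE_ACCOUNTS = {
    "restricted": "111111111111",
    "confidential": "222222222222",
    "internal": "333333333333",
    "public": "444444444444",
}


def account_of(arn: str) -> str:
    """Return the account ID field of an ARN.

    General ARN layout: arn:partition:service:region:account-id:resource
    """
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn":
        raise ValueError(f"not an ARN: {arn!r}")
    return parts[4]


def in_correct_zone(arn: str, classification: str) -> bool:
    """Check that a resource sits in the account assigned to its classification."""
    return account_of(arn) == ZONE_ACCOUNTS[classification]


# Hypothetical AWS Glue table holding restricted data.
table_arn = "arn:aws:glue:us-east-1:111111111111:table/hr_db/salaries"
ok = in_correct_zone(table_arn, "restricted")
misplaced = in_correct_zone(table_arn, "public")
```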
For more details, refer to the following information:
- AWS Whitepaper: An Overview of the AWS Cloud Adoption Framework.
- AWS Whitepaper: Organizing Your AWS Environment Using Multiple Accounts.
- AWS Whitepaper: Security Pillar – AWS Well-Architected Framework.
Suggestion 3.3.4 – Identify sensitive information and define protection policies
Discover sensitive data by using custom data identifiers in Amazon Macie or by using AWS Glue sensitive data detection. Based on the sensitivity and criticality of the data, implement data protection policies to prevent unauthorized access. In some cases, compliance requirements call for data to be masked or deleted after processing.
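Custom data identifiers in Amazon Macie are built around regular expressions. The following local sketch shows the same idea in plain Python: detect values that match a pattern, then mask them after processing. The `EMP-` employee ID format and the helper names are hypothetical, and the code does not call Macie or AWS Glue.

```python
import re

# Hypothetical identifier: employee IDs of the form EMP-123456.
EMPLOYEE_ID = re.compile(r"\bEMP-\d{6}\b")


def find_sensitive(text: str) -> list[str]:
    """Return every substring that matches the identifier pattern."""
    return EMPLOYEE_ID.findall(text)


def mask(text: str) -> str:
    """Replace each match with a fixed placeholder of the same shape."""
    return EMPLOYEE_ID.sub("EMP-******", text)


record = "Ticket opened by EMP-004217 for EMP-998001."
found = find_sensitive(record)
masked = mask(record)
```

Fixed-placeholder masking is irreversible. If a compliance requirement calls for recoverable values, a tokenization or encryption approach would be needed instead.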
For more details, refer to the following information: