Encrypting data written by crawlers, jobs, and development endpoints
A security configuration is a set of security properties that can be used by AWS Glue. You can use a security configuration to encrypt data at rest. The following scenarios show some of the ways that you can use a security configuration.
-
Attach a security configuration to an AWS Glue crawler to write encrypted Amazon CloudWatch Logs.
-
Attach a security configuration to an extract, transform, and load (ETL) job to write encrypted Amazon Simple Storage Service (Amazon S3) targets and encrypted CloudWatch Logs.
-
Attach a security configuration to an ETL job to write its jobs bookmarks as encrypted Amazon S3 data.
-
Attach a security configuration to a development endpoint to write encrypted Amazon S3 targets.
Important
Currently, a security configuration overrides any server-side encryption (SSE-S3) setting that is passed as an ETL job parameter. Thus, if both a security configuration and an SSE-S3 parameter are associated with a job, the SSE-S3 parameter is ignored.
For more information about security configurations, see Working with security configurations on the AWS Glue console.
Topics
Setting Up AWS Glue to use security configurations
Follow these steps to set up your AWS Glue environment to use security configurations.
-
Create or update your AWS Key Management Service (AWS KMS) keys to grant AWS KMS permissions to the IAM roles that are passed to AWS Glue crawlers and jobs to encrypt CloudWatch Logs. For more information, see Encrypt Log Data in CloudWatch Logs Using AWS KMS in the Amazon CloudWatch Logs User Guide.
In the following example,
"role1"
,"role2"
, and"role3"
are IAM roles that are passed to crawlers and jobs.{ "Effect": "Allow", "Principal": { "Service": "logs.
region
.amazonaws.com", "AWS": ["role1"
,"role2"
,"role3"
] }, "Action": [ "kms:Encrypt*", "kms:Decrypt*", "kms:ReEncrypt*", "kms:GenerateDataKey*", "kms:Describe*" ], "Resource": "*" }The
Service
statement, shown as"Service": "logs.
, is required if you use the key to encrypt CloudWatch Logs.region
.amazonaws.com" -
Ensure that the AWS KMS key is
ENABLED
before it is used. -
Ensure that the AWS Glue job includes the following code in the beginning of your script for the security setting to take effect.
job = Job(glueContext) job.init(args['JOB_NAME'], args)
Creating a route to AWS KMS for VPC jobs and crawlers
You can connect directly to AWS KMS through a private endpoint in your virtual private cloud (VPC) instead of connecting over the internet. When you use a VPC endpoint, communication between your VPC and AWS KMS is conducted entirely within the AWS network.
You can create an AWS KMS VPC endpoint within a VPC. Without this step,
your jobs or crawlers might fail with a kms timeout
on jobs or an
internal service exception
on crawlers. For detailed instructions, see Connecting to AWS KMS Through a VPC
Endpoint in the AWS Key Management Service Developer Guide.
As you follow these instructions, on the VPC
console
Select Enable Private DNS name.
Choose the Security group (with self-referencing rule) that you use for your job or crawler that accesses Java Database Connectivity (JDBC). For more information about AWS Glue connections, see Defining connections in the AWS Glue Data Catalog.
When you add a security configuration to a crawler or job that accesses JDBC data stores, AWS Glue must have a route to the AWS KMS endpoint. You can provide the route with a network address translation (NAT) gateway or with an AWS KMS VPC endpoint. To create a NAT gateway, see NAT Gateways in the Amazon VPC User Guide.