Defining connections in the AWS Glue Data Catalog
An AWS Glue connection is a Data Catalog object that stores login credentials, URI strings, virtual private cloud (VPC) information, and other settings for a particular data store. AWS Glue crawlers, jobs, and development endpoints use connections to access certain types of data stores. You can use connections for both sources and targets, and reuse the same connection across multiple crawlers or extract, transform, and load (ETL) jobs.
AWS Glue supports the following connection types:
- JDBC
- Amazon Relational Database Service (Amazon RDS)
- Amazon Redshift
- Amazon DocumentDB
- Kafka
- MongoDB
- Network (designates a connection to a data source that is in an Amazon Virtual Private Cloud (Amazon VPC))
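As a rough illustration of what a connection stores, the following Python sketch assembles the kind of structure that the AWS Glue CreateConnection API accepts for a JDBC connection: a connection type, connection properties (URL and a reference to credentials), and VPC details. The helper names, the JDBC URL, the secret name, and the subnet, security group, and Availability Zone values are hypothetical placeholders, not values from this guide. The call through boto3 is wrapped in a separate function so that the structure can be inspected without AWS credentials.

```python
from typing import Any, Dict, List


def build_jdbc_connection_input(
    name: str,
    jdbc_url: str,
    secret_id: str,
    subnet_id: str,
    security_group_ids: List[str],
    availability_zone: str,
) -> Dict[str, Any]:
    """Assemble a ConnectionInput-style dictionary for a JDBC connection.

    This is a hypothetical helper for illustration; it only builds the
    dictionary and does not contact AWS.
    """
    return {
        "Name": name,
        "ConnectionType": "JDBC",
        # URI string plus a pointer to credentials held in Secrets Manager.
        "ConnectionProperties": {
            "JDBC_CONNECTION_URL": jdbc_url,
            "SECRET_ID": secret_id,
        },
        # VPC information used to reach the data store.
        "PhysicalConnectionRequirements": {
            "SubnetId": subnet_id,
            "SecurityGroupIdList": list(security_group_ids),
            "AvailabilityZone": availability_zone,
        },
    }


def create_connection(connection_input: Dict[str, Any]) -> None:
    """Register the connection in the Data Catalog.

    Assumes boto3 is installed and AWS credentials and a Region are configured.
    """
    import boto3  # imported here so the rest of the sketch runs without boto3

    glue = boto3.client("glue")
    glue.create_connection(ConnectionInput=connection_input)


# All values below are placeholders chosen for the example.
example_input = build_jdbc_connection_input(
    name="example-jdbc-connection",
    jdbc_url="jdbc:postgresql://example-host:5432/exampledb",
    secret_id="example/glue/jdbc-credentials",
    subnet_id="subnet-0123456789abcdef0",
    security_group_ids=["sg-0123456789abcdef0"],
    availability_zone="us-east-1a",
)
```

Calling create_connection(example_input) would attempt to create the connection in your account. Because a connection is a reusable catalog object, the same connection name can then be referenced by more than one crawler or job.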
With AWS Glue Studio, you can also create a connection for a connector. A connector is an optional code package that assists with accessing data stores in AWS Glue Studio. For more information, see Using connectors and connections with AWS Glue Studio.
For information about how to connect to on-premises databases, see How to access and analyze on-premises data stores using AWS Glue.
This section includes the following topics to help you use AWS Glue connections:
- AWS Glue connection properties
- Storing connection credentials in AWS Secrets Manager
- Adding an AWS Glue connection
- Testing an AWS Glue connection
- Configuring AWS calls to go through your VPC
- Connecting to a JDBC data store in a VPC
- Using a MongoDB connection
- Crawling an Amazon S3 data store using a VPC endpoint
- Troubleshooting connection issues in AWS Glue