Creating a BigQuery connection - AWS Glue

To connect to Google BigQuery from AWS Glue, you need to create Google Cloud Platform credentials, store them in an AWS Secrets Manager secret, and then associate that secret with a Google BigQuery AWS Glue connection.

To configure a connection to BigQuery:
  1. In Google Cloud Platform, create and identify relevant resources:

  2. In Google Cloud Platform, create and export service account credentials:

    You can use the BigQuery credentials wizard to expedite this step: Create credentials.

    To create a service account in GCP, follow the tutorial available in Create service accounts.

    • When selecting project, select the project containing your BigQuery table.

    • When selecting GCP IAM roles for your service account, add or create a role that grants the permissions needed to run BigQuery jobs that read, write, or create BigQuery tables.

    To create credentials for your service account, follow the tutorial available in Create a service account key.

    • When selecting key type, select JSON.

    You should now have downloaded a JSON file with credentials for your service account. It should look similar to the following:

    {
      "type": "service_account",
      "project_id": "*****",
      "private_key_id": "*****",
      "private_key": "*****",
      "client_email": "*****",
      "client_id": "*****",
      "auth_uri": "https://accounts.google.com/o/oauth2/auth",
      "token_uri": "https://oauth2.googleapis.com/token",
      "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
      "client_x509_cert_url": "*****",
      "universe_domain": "googleapis.com"
    }
  3. Base64-encode your downloaded credentials file. In an AWS CloudShell session or similar, you can do this from the command line by running cat credentialsFile.json | base64 -w 0. Retain the output of this command, credentialString, for the next step.

  4. In AWS Secrets Manager, create a secret using your Google Cloud Platform credentials. To create a secret in Secrets Manager, follow the tutorial available in Create an AWS Secrets Manager secret in the AWS Secrets Manager documentation. After creating the secret, keep the secret name, secretName, for the next step.

    • When selecting Key/value pairs, create a pair for the key credentials with the value credentialString.

  5. In the AWS Glue Data Catalog, create a connection by following the steps in https://docs.aws.amazon.com/glue/latest/dg/console-connections.html. After creating the connection, keep the connection name, connectionName, for the next step.

    • When selecting a Connection type, select Google BigQuery.

    • When selecting an AWS Secret, provide secretName.

  6. Grant the IAM role associated with your AWS Glue job permission to read secretName.

  7. In your AWS Glue job configuration, provide connectionName as an Additional network connection.
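
If you prefer to script steps 2 and 3, the following is a minimal Python sketch rather than part of the official procedure. It checks that a downloaded key file contains the fields shown in the sample JSON and then produces the same single-line base64 string as base64 -w 0. The function names and the choice of which fields to require are illustrative assumptions.

```python
import base64
import json

# Fields taken from the sample service account key shown in step 2.
# Treating exactly these as required is an assumption made for this sketch.
EXPECTED_FIELDS = {
    "type",
    "project_id",
    "private_key_id",
    "private_key",
    "client_email",
    "client_id",
    "token_uri",
}


def missing_key_fields(key: dict) -> list:
    """Return the expected fields that are absent from a parsed key file."""
    return sorted(EXPECTED_FIELDS - set(key))


def encode_credentials(raw: bytes) -> str:
    """Validate raw key-file bytes and return them base64-encoded on one line."""
    key = json.loads(raw)
    if not isinstance(key, dict) or key.get("type") != "service_account":
        raise ValueError("file does not look like a service account key")
    missing = missing_key_fields(key)
    if missing:
        raise ValueError(f"key file is missing fields: {missing}")
    # b64encode never inserts line breaks, which matches `base64 -w 0`.
    return base64.b64encode(raw).decode("ascii")
```

Calling encode_credentials with the bytes of credentialsFile.json returns the value referred to as credentialString in step 3.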
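
Step 4 can also be done programmatically. The sketch below assumes boto3 is installed and AWS credentials are configured. It stores credentialString under the key credentials, which is the same JSON shape the console produces for a Key/value pair. The helper names are hypothetical; create_secret is the Secrets Manager API operation.

```python
import json


def build_secret_string(credential_string: str) -> str:
    """Return the JSON body for the secret: a single key named credentials."""
    return json.dumps({"credentials": credential_string})


def create_bigquery_secret(secret_name: str, credential_string: str) -> str:
    """Create the secret and return its ARN.

    Requires boto3 plus AWS credentials that are allowed to create secrets.
    """
    import boto3  # imported lazily so build_secret_string works without boto3

    client = boto3.client("secretsmanager")
    response = client.create_secret(
        Name=secret_name,
        SecretString=build_secret_string(credential_string),
    )
    return response["ARN"]
```

The returned ARN is useful in step 6, where the AWS Glue job role is granted read access to the secret.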
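
For step 6, one way to grant read access is an IAM policy statement that allows secretsmanager:GetSecretValue on the secret. The following sketch builds such a policy document. The ARN is a placeholder, and your job role may need further permissions (for example, kms:Decrypt if the secret is encrypted with a customer managed key), so treat it as a starting point rather than a complete policy.

```python
import json


def secret_read_policy(secret_arn: str) -> dict:
    """Build a least-privilege policy that allows reading one secret."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "secretsmanager:GetSecretValue",
                "Resource": secret_arn,
            }
        ],
    }


# Placeholder ARN for illustration only; substitute the ARN of secretName.
example_arn = "arn:aws:secretsmanager:us-east-1:111122223333:secret:secretName-AbCdEf"
print(json.dumps(secret_read_policy(example_arn), indent=2))
```

Attach the printed JSON to the IAM role used by your AWS Glue job, either as an inline policy or as a managed policy.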