Prerequisites for developing blueprints - AWS Glue

Prerequisites for developing blueprints

To develop blueprints, you should be familiar with using AWS Glue and writing scripts for Apache Spark ETL jobs or Python shell jobs. In addition, you must complete the following setup tasks.

  • Download four AWS Python libraries to use in your blueprint layout scripts.

  • Set up the AWS SDKs.

  • Set up the AWS CLI.

Download the Python libraries

Download the following libraries from GitHub, and install them into your project:

Set up the AWS Java SDK

For the AWS Java SDK, you must add a jar file that includes the API for blueprints.

  1. If you haven't already done so, set up the AWS SDK for Java.

  2. Download the client jar file that has access to the APIs for blueprints.

    • For Java 1.x: s3://awsglue-custom-blueprints-preview-artifacts/awsglue-java-sdk-preview/AWSGlueJavaClient-1.11.x.jar

    • For Java 2.x: s3://awsglue-custom-blueprints-preview-artifacts/awsglue-java-sdk-v2-preview/AwsJavaSdk-Glue-2.0.jar

  3. Add the client jar to the front of the Java classpath to override the AWS Glue client provided by the AWS Java SDK.

    export CLASSPATH=<path-to-preview-client-jar>:$CLASSPATH
  4. (Optional) Test the SDK with the following Java application. The application should output an empty list.

    Replace accessKey and secretKey with your credentials, and replace us-east-1 with your Region.

    import com.amazonaws.auth.AWSCredentials;
    import com.amazonaws.auth.AWSCredentialsProvider;
    import com.amazonaws.auth.AWSStaticCredentialsProvider;
    import com.amazonaws.auth.BasicAWSCredentials;
    import com.amazonaws.services.glue.AWSGlue;
    import com.amazonaws.services.glue.AWSGlueClientBuilder;
    import com.amazonaws.services.glue.model.ListBlueprintsRequest;

    public class App {
        public static void main(String[] args) {
            AWSCredentials credentials = new BasicAWSCredentials("accessKey", "secretKey");
            AWSCredentialsProvider provider = new AWSStaticCredentialsProvider(credentials);
            AWSGlue glue = AWSGlueClientBuilder.standard()
                    .withCredentials(provider)
                    .withRegion("us-east-1")
                    .build();
            ListBlueprintsRequest request = new ListBlueprintsRequest().withMaxResults(2);
            System.out.println(glue.listBlueprints(request));
        }
    }

Set up the AWS Python SDK

The following steps assume that you have Python version 2.7 or later, or version 3.6 or later installed on your computer.

  1. Download the following boto3 wheel file. If prompted to open or save, save the file. s3://awsglue-custom-blueprints-preview-artifacts/aws-python-sdk-preview/boto3-1.17.31-py2.py3-none-any.whl

  2. Download the following botocore wheel file: s3://awsglue-custom-blueprints-preview-artifacts/aws-python-sdk-preview/botocore-1.20.31-py2.py3-none-any.whl

  3. Check your Python version.

    python --version
  4. Depending on your Python version, enter the following commands (for Linux):

    • For Python 2.7 or later:

      python3 -m pip install --user virtualenv
      python3 -m virtualenv env
      source env/bin/activate
    • For Python 3.6 or later:

      python3 -m venv python-sdk-test
      source python-sdk-test/bin/activate
  5. Install the botocore wheel file.

    python3 -m pip install <download-directory>/botocore-1.20.31-py2.py3-none-any.whl
  6. Install the boto3 wheel file.

    python3 -m pip install <download-directory>/boto3-1.17.31-py2.py3-none-any.whl
  7. Configure your credentials and default region in the ~/.aws/credentials and ~/.aws/config files. For more information, see Configuring the AWS CLI in the AWS Command Line Interface User Guide.

  8. (Optional) Test your setup. The following commands should return an empty list.

    Replace us-east-1 with your Region.

    $ python
    >>> import boto3
    >>> glue = boto3.client('glue', 'us-east-1')
    >>> glue.list_blueprints()
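If the test in step 8 fails with a credentials or Region error, it can help to confirm what step 7 actually wrote to disk. The following is a minimal sketch, not part of the AWS tooling: it assumes the standard INI layout, with aws_access_key_id and aws_secret_access_key under [default] in ~/.aws/credentials and region under [default] in ~/.aws/config. The function name check_aws_config is hypothetical, and the sketch uses only the Python 3 standard library.

```python
import configparser
from pathlib import Path


def check_aws_config(aws_dir=None, profile="default"):
    """Return a list of problems found in the AWS credentials and config files.

    check_aws_config is a hypothetical helper for illustration only.
    """
    aws_dir = Path(aws_dir) if aws_dir else Path.home() / ".aws"
    problems = []

    creds = configparser.ConfigParser()
    creds.read(aws_dir / "credentials")  # a missing file is skipped silently
    for key in ("aws_access_key_id", "aws_secret_access_key"):
        if not creds.has_option(profile, key):
            problems.append(f"credentials: [{profile}] is missing {key}")

    config = configparser.ConfigParser()
    config.read(aws_dir / "config")
    # In the config file, non-default profiles are named "profile <name>".
    section = profile if profile == "default" else f"profile {profile}"
    if not config.has_option(section, "region"):
        problems.append(f"config: [{section}] is missing region")

    return problems


if __name__ == "__main__":
    for line in check_aws_config() or ["credentials and region look configured"]:
        print(line)
```

The check only confirms that the expected keys are present. It does not validate the credentials against AWS, and it ignores other credential sources such as environment variables.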

Set up the preview AWS CLI

  1. If you haven't already done so, install or update the AWS Command Line Interface (AWS CLI) on your computer. The easiest way to do this is with pip, the Python package installer:

    pip install awscli --upgrade --user

    You can find complete installation instructions for the AWS CLI here: Installing the AWS Command Line Interface.

  2. Download the AWS CLI wheel file from: s3://awsglue-custom-blueprints-preview-artifacts/awscli-preview-build/awscli-1.19.31-py2.py3-none-any.whl

  3. Install the AWS CLI wheel file.

    python3 -m pip install awscli-1.19.31-py2.py3-none-any.whl
  4. Run the aws configure command. Configure your AWS credentials (access key and secret key) and AWS Region. You can find information on configuring the AWS CLI here: Configuring the AWS CLI.

  5. Test the AWS CLI. The following command should return an empty list.

    Replace us-east-1 with your Region.

    aws glue list-blueprints --region us-east-1
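If a test command fails because the blueprint operations are not recognized, one likely cause is that a non-preview build of a package is still the one being picked up. The following is a minimal sketch for checking this, run inside the environment where you installed the wheels. The expected versions are taken from the wheel file names in this topic; the helper name compare_versions is hypothetical, and importlib.metadata requires Python 3.8 or later, which is an assumption beyond the versions this topic lists.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions taken from the preview wheel file names in this topic.
EXPECTED = {"botocore": "1.20.31", "boto3": "1.17.31", "awscli": "1.19.31"}


def compare_versions(expected=EXPECTED):
    """Map each package name to (installed_version_or_None, matches_expected).

    compare_versions is a hypothetical helper for illustration only.
    """
    report = {}
    for name, wanted in expected.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[name] = (installed, installed == wanted)
    return report


if __name__ == "__main__":
    for name, (installed, ok) in compare_versions().items():
        status = "OK" if ok else f"expected {EXPECTED[name]}"
        print(f"{name}: {installed or 'not installed'} ({status})")
```

A mismatch here does not prove that the setup is broken, but it is a quick first thing to rule out before looking at credentials or Region settings.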