Using the AWS Serverless Application Repository to Deploy a Hive Data Source Connector - Amazon Athena

Using the AWS Serverless Application Repository to Deploy a Hive Data Source Connector

You can also use the AWS Serverless Application Repository to deploy an Athena data source connector for Hive. Choose the connector that you want to use, provide the parameters that the connector requires, and then deploy the connector to your account.

To use the AWS Serverless Application Repository to deploy a data source connector for Hive to your account

  1. Sign in to the AWS Management Console and open the Serverless App Repository.

  2. In the navigation pane, choose Available applications.

  3. Select the option Show apps that create custom IAM roles or resource policies.

  4. In the search box, type the name of one of the following connectors. The two applications have the same functionality and differ only in their implementation. You can use either one to create a Lambda function that connects Athena to your Hive metastore.

    • AthenaHiveMetastoreFunction – Uber Lambda function .jar file.

    • AthenaHiveMetastoreFunctionWithLayer – Lambda layer and thin Lambda function .jar file.

  5. Choose the name of the connector.

    
                    Choose the name of the Athena data source connector for Hive.
  6. Under Application settings, enter the parameters for your Lambda function.

    • LambdaFuncName – Provide a name for the function. For example, myHiveMetastore.

    • SpillLocation – Specify an Amazon S3 location in this account to hold spillover metadata if the Lambda function reponse size exceeds 4MB.

    • HMSUris – Enter the URI of your Hive metastore host that uses the Thrift protocol at port 9083. Use the syntax thrift://<host_name>:9083.

    • LambdaMemory – Specify a value from 128MB to 3008MB. The Lambda function is allocated CPU cycles proportional to the amount of memory that you configure. The default is 1024.

    • LambdaTimeout – Specify the maximum permissible Lambda invocation run time in seconds from 1 to 900 (900 seconds is 15 minutes). The default is 300 seconds (5 minutes).

    • VPCSecurityGroupIds – Enter a comma-separated list of VPC security group IDs for the Hive metastore.

    • VPCSubnetIds – Enter a comma-separated list of VPC subnet IDs for the Hive metastore.

  7. On the bottom right of the Application details page, select I acknowledge that this app creates custom IAM roles, and then choose Deploy.

At this point, you can configure Athena to use your Lambda function to connect to your Hive metastore. For more information, see steps 8-15 of Connecting Athena to an Apache Hive Metastore.