Amazon Elastic MapReduce
Developer Guide (API Version 2009-03-31)
« PreviousNext »
View the PDF for this guide.Go to the AWS Discussion Forum for this product.Go to the Kindle Store to download this guide in Kindle format.Did this page help you?  Yes | No |  Tell us about it...

Use the Hive JDBC Driver

The Hive JDBC driver provides a mechanism to connect to Hive and query data. Hive JDBC driver support also allows you to integrate Hive with business intelligence tools for reporting and analysis. To connect to Hive via JDBC requires you to download the JDBC driver and install a SQL client. The following example demonstrates using SQL Workbench/J to connect to Hive using JDBC.

To download JDBC drivers

Download and extract the drivers appropriate to the versions of Hive that you want to access. The Hive version differs depending on the AMI that you choose when you create an Amazon EMR cluster. For more information, see Choose a Machine Image.

To install and configure SQL Workbench

  1. Download the SQL Workbench/J client for your operating system from http://www.sql-workbench.net/downloads.html.

  2. Go to the Installing and starting SQL Workbench/J page and follow the instructions for installing SQL Workbench/J on your system.

    • Linux, Unix, Mac OS X users: In a terminal session, create an SSH tunnel to the master node of your cluster using one of the following commands. Replace master-public-dns-name with the public DNS name of the master node and path-to-key-file with the location and file name of your Amazon EC2 private key (.pem) file. For more information about retrieving the master public DNS name, see Retrieve the Public DNS Name of the Master Node.

      Hive versionCommand
      0.13.1ssh -o ServerAliveInterval=10 -i path-to-key-file -N -L 10000:localhost:10000 hadoop@master-public-dns-name
      0.11.0ssh -o ServerAliveInterval=10 -i path-to-key-file -N -L 10004:localhost:10004 hadoop@master-public-dns-name
      0.8.1ssh -o ServerAliveInterval=10 -i path-to-key-file -N -L 10003:localhost:10003 hadoop@master-public-dns-name
    • Windows users: In a PuTTY session, create an SSH tunnel to the master node of your cluster (using local port forwarding) with the following settings. Replace master-public-dns-name with the public DNS name of the master node. For more information about creating an SSH tunnel to the master node, see Option 1: Set Up an SSH Tunnel to the Master Node Using Local Port Forwarding.

      Hive versionTunnel settings
      0.13.1Source port: 10000 Destination: master-public-dns-name:10000
      0.11.0Source port: 10004 Destination: master-public-dns-name:10004
      0.8.1Source port: 10003 Destination: master-public-dns-name:10003
  3. Add the JDBC driver to SQL Workbench/J.

    1. In the Select Connection Profile dialog box, click Manage Drivers.

    2. Click the Create a new entry (blank page) icon.

    3. In the Name field, type Hive JDBC.

    4. For Library, click the Select the JAR file(s) icon.

    5. Browse to the location containing the extracted drivers, select the following JAR files and click Open.

      Hiver driver versionJAR files to add
      0.13.1
      hive_metastore.jar
      hive_service.jar
      HiveJDBC3.jar
      libfb303-0.9.0.jar
      libthrift-0.9.0.jar
      log4j-1.2.14.jar
      ql.jar
      slf4j-api-1.5.8.jar
      slf4j-log4j12-1.5.8.jar
      TCLIServiceClient.jar
      0.11.0
      hadoop-core-1.0.3.jar
      hive-exec-0.11.0.jar
      hive-jdbc-0.11.0.jar
      hive-metastore-0.11.0.jar
      hive-service-0.11.0.jar
      libfb303-0.9.0.jar
      commons-logging-1.0.4.jar
      slf4j-api-1.6.1.jar
      0.8.1
      hadoop-core-0.20.205.jar
      hive-exec-0.8.1.jar
      hive-jdbc-0.8.1.jar
      hive-metastore-0.8.1.jar
      hive-service-0.8.1.jar
      libfb303-0.7.0.jar
      libthrift-0.7.0.jar
      log4j-1.2.15.jar    
      slf4j-api-1.6.1.jar
      slf4j-log4j12-1.6.1.jar         
    6. In the Please select one driver dialog box, select one of the following and click OK.

      Hive versionDriver classname
      0.13.1
      com.amazon.hive.jdbc3.HS2Driver
      0.11.0
      org.apache.hadoop.hive.jdbc.HiveDriver.jar
      0.8.1
      org.apache.hadoop.hive.jdbc.HiveDriver.jar        
  4. When you return to the Manage Drivers dialog box, verify that the Classname field is populated and click OK.

  5. When you return to the Select Connection Profile dialog box, verify that the Driver field is set to Hive JDBC, enter one of the following in the URL field, and then click OK.

    Hive versionURL
    0.13.1 jdbc:hive2://localhost:10000/default
    0.11.0 jdbc:hive://localhost:10004/default
    0.8.1 jdbc:hive://localhost:10003/default

After the connection is complete, connection details appear at the top of the SQL Workbench/J window.

For more information about using Hive and the JDBC interface, go to http://wiki.apache.org/hadoop/Hive/HiveClient and http://wiki.apache.org/hadoop/Hive/HiveJDBCInterface.