Menu
Amazon EMR
Amazon EMR Release Guide

Use the Hive JDBC Driver

You can use popular business intelligence tools like Microsoft Excel, MicroStrategy, QlikView, and Tableau with Amazon EMR to explore and visualize your data. Many of these tools require an ODBC (Open Database Connectivity) or JDBC (Java Database Connectivity) driver. Amazon EMR supports both JDBC and ODBC connectivity.

To connect to Hive via JDBC requires you to download the JDBC driver and install a SQL client. The following example demonstrates using SQL Workbench/J to connect to Hive using JDBC.

To download JDBC drivers

To install and configure SQL Workbench

  1. Download the SQL Workbench/J client for your operating system from http://www.sql-workbench.net/downloads.html.

  2. Go to the Installing and starting SQL Workbench/J page and follow the instructions for installing SQL Workbench/J on your system.

    • Linux, Unix, Mac OS X users: In a terminal session, create an SSH tunnel to the master node of your cluster using the following command. Replace master-public-dns-name with the public DNS name of the master node and path-to-key-file with the location and file name of your Amazon EC2 private key (.pem) file.

      Hive version Command
      1.0 ssh -o ServerAliveInterval=10 -i path-to-key-file -N -L 10000:localhost:10000 hadoop@master-public-dns-name
    • Windows users: In a PuTTY session, create an SSH tunnel to the master node of your cluster (using local port forwarding) with the following settings. Replace master-public-dns-name with the public DNS name of the master node. For more information about creating an SSH tunnel to the master node, see Option 1: Set Up an SSH Tunnel to the Master Node Using Local Port Forwarding in the Amazon EMR Management Guide.

      Hive version Tunnel settings
      1.0 Source port: 10000 Destination: master-public-dns-name:10000
  3. Add the JDBC driver to SQL Workbench/J.

    1. In the Select Connection Profile dialog box, click Manage Drivers.

    2. Click the Create a new entry (blank page) icon.

    3. In the Name field, type Hive JDBC.

    4. For Library, click the Select the JAR file(s) icon.

    5. Browse to the location containing the extracted drivers, select the following JAR files and click Open.

      Copy
      hive_metastore.jar hive_service.jar HiveJDBC41.jar libfb303-0.9.0.jar libthrift-0.9.0.jar log4j-1.2.14.jar ql.jar slf4j-api-1.5.11.jar slf4j-log4j12-1.5.11.jar TCLIServiceClient.jar zookeeper-3.4.6.jar
    6. In the Please select one driver dialog box, select the following driver and click OK.

      Copy
      com.amazon.hive.jdbc41.HS2Driver
  4. When you return to the Manage Drivers dialog box, verify that the Classname field is populated and click OK.

  5. When you return to the Select Connection Profile dialog box, verify that the Driver field is set to Hive JDBC and provide the JDBC connection string in the URL field.

    jdbc:hive2://localhost:10000/default

  6. Click OK to connect. After the connection is complete, connection details appear at the top of the SQL Workbench/J window.

For more information about using Hive and the JDBC interface, see HiveClient and HiveJDBCInterface in Apache Hive documentation.