Run JDBC commands in Zeppelin - Teaching Big Data Skills with Amazon EMR

Run JDBC commands in Zeppelin

Since we are using Hive properties for the JDBC interpreter, the notebook command will always start with %jdbc (hive).

Creating databases and tables – these can be stored in /user/hive/warehouse or in a specified directory such as /user/student02/ (using the location parameter) — as long the logged in user has access to those directories. Default is always /user/hive/warehouse. In both location cases, the user that created the databases/tables is the one that will have access as the user is the owner of those files (per HDFS ACLs).

Sample example jdbc notebook note below. This creates a database testdb01 and then creates a table table01 in database testdb01. Additionally, stores the database + table into the /user/student02 location in HDFS.

%jdbc (hive) create database testdb01 location '/user/student02/'; CREATE TABLE testdb01.table01 ( person_id string, gender string, birthdate string, uuid string, fullname string, given_name string, middle_name string, family_name string, family_name2 string, family_name_suffix string );

To load data to this table from a file, you can use the below command:

%jdbc (hive) LOAD DATA INPATH '/user/student02/file_sample.csv' OVERWRITE INTO TABLE testdb01.table01;