Amazon Elastic MapReduce
Developer Guide (API Version 2009-03-31)
« PreviousNext »
View the PDF for this guide.Go to the AWS Discussion Forum for this product.Go to the Kindle Store to download this guide in Kindle format.Did this page help you?  Yes | No |  Tell us about it...

Upgrade to Hive 0.11 (MySQL on the Master Node)

The following procedures explain how to upgrade a Hive metastore stored in a MySQL database hosted on the master node of a cluster (the default behavior).

If your metastore is stored remotely, as described in Create a Metastore Outside the Hadoop Cluster, please use the instructions at Upgrade to Hive 0.11 (MySQL on Amazon RDS) instead.

To upgrade the metastore from 0.8 to 0.11 (MySQL on the master node)

  1. Stop running Hive processes during the upgrade procedure. This ensures that the database is not altered while the upgrade scripts are running.

  2. Use SSH to connect to the master node of the Hive cluster to upgrade. For more information about how to use SSH to connect to the master node, see Connect to the Master Node Using SSH.

  3. Copy the upgrade scripts from Apache to the master node. You can do this by running the wget utility on the master node. (The view=co request variable in the following URLs ensures that you get the plain-text version of the scripts, not the HTML-encoded version.)

     wget -O 010-HIVE-3072.mysql.sql http://svn.apache.org/viewvc/hive/branches/branch-0.11/metastore/scripts/upgrade/mysql/010-HIVE-3072.mysql.sql?view=co
     wget -O 011-HIVE-3649.mysql.sql http://svn.apache.org/viewvc/hive/branches/branch-0.11/metastore/scripts/upgrade/mysql/011-HIVE-3649.mysql.sql?view=co  
     wget -O 012-HIVE-1362.mysql.sql http://svn.apache.org/viewvc/hive/branches/branch-0.11/metastore/scripts/upgrade/mysql/012-HIVE-1362.mysql.sql?view=co
     wget -O upgrade-0.9.0-to-0.10.0.mysql.sql http://svn.apache.org/viewvc/hive/branches/branch-0.11/metastore/scripts/upgrade/mysql/upgrade-0.9.0-to-0.10.0.mysql.sql?view=co
    				
  4. Launch the MySQL Monitor using the following command.

    mysql -u root
    				
  5. From the MySQL Monitor command prompt, find the name of the database that contains your Hive metastore. You can do this by running the following command.

    mysql> show databases;				
    				

    This returns results such as the following. The database that starts with "hive_" is the one that contains your Hive MetaStore. In the example below, this is "hive_081".

    +--------------------+
    | Database           |
    +--------------------+
    | information_schema |
    | InstanceController |
    | hive_081           |
    | mysql              |
    +--------------------+
    4 rows in set (0.00 sec)
    				
  6. Exit MySQL Monitor.

    mysql> exit
    				
  7. Back up your MySQL metastore database. You can do this using the mysqldump utility, as shown in the example below, where hive_081 is the name of the database containing the metastore.

    mysqldump --opt hive_081 > metastore_backup.sql -u root					
    				
  8. Export the current metastore schema to a file. The following mysqldump command extracts only the schema content.

    mysqldump --skip-add-drop-table --no-data \
       hive_081 > my-schema-0.8.1.mysql.sql -u root
    				
  9. Compare your current schema against the official Apache Hive schema listed at http://svn.apache.org/viewvc/hive/branches/branch-0.11/metastore/scripts/upgrade/mysql/. If you are upgrading from 0.8 to 0.11, compare the schema you exported against the version in hive-schema-0.8.0.mysql.sql. The schemas should match. If you have made custom changes to the schema, you may need to roll those back in order for the upgrade scripts to work properly.

    Differences you may find:

    • Missing tables. By default, Hive only creates schema elements when they are used. If you have not created a certain type of Hive catalog object, the corresponding table will not exist in your schema. You must create these missing tables for the upgrade script to succeed. You can do this by hand or by running the official schema DDL script (located at http://svn.apache.org/viewvc/hive/branches/branch-0.11/metastore/scripts/upgrade/mysql/hive-schema-0.8.0.mysql.sql?view=co) against your metastore. This script ignores tables that already exist, and will only create those that are missing.

    • Extra tables. If your schema contains tables named NUCLEUS_TABLES or SEQUENCE_TABLE, these will not affect the upgrade script. You do not need to remove them.

    • Reversed Column Constraint Names. If a table has multiple constraints, the names may be reversed between your schema and the canonical schema. For example, if a table contains PARTITIONS_FK1 and PARTITIONS_FK2 which reference SDS.SD_ID and TBLS.TBL_ID, your schema may instead connect PARTITIONS_FK1 to TBLS.TBL_ID and PARTITIONS_FK2 to SDS.SD_ID. This will not affect the upgrade script and can be ignored.

    • Changes in Column and Constraint Names. If your schema contains tables with unique keys named "UNIQUE<tab_name>" or columns named "IDX" you will need to rename these to "UNIQUE_<tab_name>" and "INTEGER_IDX" before you run the upgrade script. The reason for this is explained in HIVE-1435.

  10. Launch the MySQL Monitor using the following command.

    mysql -u root
    				
  11. Run upgrade-0.9.0-to-0.10.0.mysql.sql script using the source command in the MySQL command line. This is shown in the following example, where hive_081 is the database containing the metastore.

    mysql> use hive_081;
    mysql> source upgrade-0.9.0-to-0.10.0.mysql.sql;
    				

    The script should complete without error.

  12. Exit MySQL Monitor.

    mysql> exit
    				
  13. Export the upgraded metastore schema to a file. The following mysqldump command extracts only the schema content.

    sudo mysqldump --skip-add-drop-table --no-data \
       hive_081 > my-schema-0.11.0.mysql.sql
    				
  14. Compare this schema to the official Apache Hive 0.11 schema: http://svn.apache.org/viewvc/hive/branches/branch-0.11/metastore/scripts/upgrade/mysql/hive-schema-0.8.0.mysql.sql. They should match.

  15. Back up the upgraded metastore.

    mysqldump --opt hive_081 > metastore_upgraded_backup.sql \
       -u root
    				

To move your upgraded configuration files and metastore to a Hive 0.11 cluster (MySQL on the master node)

  1. Create a new cluster on Hadoop 1.0.3 (AMI version 2.2 or later).

  2. Use SSH to connect to the master node. For more information about how to do this, see Connect to the Master Node Using SSH.

  3. Use the scp utility to copy your upgraded configuration files and upgraded metastore backup file to the new cluster.

  4. Copy custom configuration settings from hive-default.xml and hive-site.xml to the hive-site.xml on the new cluster (found on the master node at hive/conf/.) Take care not to overwrite any settings used by Amazon EMR.

  5. Import the upgraded metastore to the MySQL database (whether locally on the master, or on a remote server.)

    mysql -u root
    
    mysql>show databases;
    				

    This returns results such as the following. The database that starts with "hive_" is the one that contains your Hive MetaStore. In the example below, this is "hive_0110".

    +--------------------+
    | Database           |
    +--------------------+
    | information_schema |
    | InstanceController |
    | hive_0110           |
    | mysql              |
    +--------------------+
    4 rows in set (0.00 sec)
    				
  6. Replace this empty metastore, hive_0110 in the preceeding example, with your upgraded metastore. You can use the following commands to move your database to the new location.

    mysql> drop database hive_0110;
    mysql> create database hive_0110;
    mysql> use database hive_0110;
    mysql> source metastore_upgraded_backup.sql;