The Hive metastore stores the metadata for Hive tables and partitions. Amazon EMR uses a MySQL database to contain the metastore.
By default, Amazon EMR creates the MySQL database on the master node. In this scenario, the metastore is deleted when the cluster terminates. For the metastore to persist between clusters, you can specify that the Hive cluster use a remote metastore, such as a MySQL database hosted on Amazon RDS. For more information about how to create a remote metastore, see Create a Metastore Outside the Hadoop Cluster.
If you create a new cluster using Hive 0.11 and let it create a new metastore on the master node (the default behavior), it will have the new schema. No updates are required.
If you want to reuse an existing metastore that was created with Hive 0.8 or earlier, you must update its schema to the Hive 0.11 format. Apache Hive provides scripts that update a metastore schema from one version to the next; their use is explained in the following procedures.
Updating a metastore created with a version of Hive prior to 0.8 may require multiple steps. For example, a metastore created with Hive 0.6 must first be updated to the Hive 0.7 schema, then to the Hive 0.8 schema, before it can be updated to the Hive 0.11 schema. There were no schema changes from version 0.8 to 0.9 or from version 0.10 to 0.11, so if you are upgrading from 0.8 to 0.11, running the upgrade script from 0.9 to 0.10 is sufficient. For a list of the Apache update scripts for previous versions of Hive, go to http://svn.apache.org/viewvc/hive/branches/branch-0.8/metastore/scripts/upgrade/mysql/.
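The upgrade path described above can be sketched as a small Python helper. This is an illustration only, not part of Amazon EMR or Apache Hive: the function name `upgrade_steps`, the version list, and the set of "no schema change" pairs are assumptions drawn from the paragraph above, and the list does not cover releases earlier than Hive 0.6.

```python
# Hypothetical helper: encodes only the upgrade rules stated in the text above.
# Versions earlier than 0.6 are not modeled here.
VERSIONS = ["0.6", "0.7", "0.8", "0.9", "0.10", "0.11"]

# Version hops that, per the text, involved no metastore schema change.
NO_SCHEMA_CHANGE = {("0.8", "0.9"), ("0.10", "0.11")}


def upgrade_steps(current, target="0.11"):
    """Return the (from, to) schema upgrades needed, in the order to run them."""
    start = VERSIONS.index(current)  # raises ValueError for unlisted versions
    end = VERSIONS.index(target)
    if start > end:
        raise ValueError("metastore schemas cannot be downgraded with the scripts")
    # Pair each version with its successor, then drop hops that changed nothing.
    hops = zip(VERSIONS[start:end], VERSIONS[start + 1:end + 1])
    return [hop for hop in hops if hop not in NO_SCHEMA_CHANGE]
```

Under these assumptions, `upgrade_steps("0.8")` yields the single 0.9-to-0.10 step, while `upgrade_steps("0.6")` yields the 0.6-to-0.7, 0.7-to-0.8, and 0.9-to-0.10 steps, which must be run in that order.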
The transformation scripts work in one direction only. After you convert your Hive metastore to the Hive 0.11 format, you cannot use the scripts to convert it back to the Hive 0.8 format. Back up your metastore before you begin the upgrade process.
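One possible way to take that backup is to dump the MySQL database with the standard `mysqldump` client. The Python sketch below is a minimal example under stated assumptions: the function names, host name, user name, database name, and output file are hypothetical placeholders, and `mysqldump` is assumed to be installed on the machine where the script runs.

```python
import subprocess


def build_backup_command(host, user, database, out_file):
    """Build (but do not run) a mysqldump command for the metastore database."""
    return [
        "mysqldump",
        f"--host={host}",
        f"--user={user}",
        "--password",                 # no value given: mysqldump prompts for it
        "--single-transaction",       # consistent snapshot for InnoDB tables
        f"--result-file={out_file}",  # write the dump to this file
        database,
    ]


def backup_metastore(host, user, database, out_file):
    """Run the dump; raises CalledProcessError if mysqldump exits with an error."""
    subprocess.run(build_backup_command(host, user, database, out_file), check=True)


# Example call (all values are placeholders, not defaults from Amazon EMR):
# backup_metastore("metastore.example.com", "hiveuser", "hive", "metastore-backup.sql")
```

Keep the resulting dump file until you have verified that the upgraded metastore works with Hive 0.11, because it is the only way to return to the earlier schema.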