Querying restored Amazon S3 Glacier objects - Amazon Athena

You can use Athena to query restored objects from the S3 Glacier Flexible Retrieval (formerly Glacier) and S3 Glacier Deep Archive Amazon S3 storage classes. You must enable this capability on a per-table basis. If you do not enable the feature on a table before you run a query, Athena skips all of the table's S3 Glacier Flexible Retrieval and S3 Glacier Deep Archive objects during query execution.

Considerations and Limitations

  • Querying restored Amazon S3 Glacier objects is supported only on Athena engine version 3.

  • The feature is supported only for Apache Hive tables.

  • You must restore your objects before you query your data; Athena does not restore objects for you.
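Because Athena does not restore objects for you, the restore has to be requested through Amazon S3 before you query. The following is a minimal sketch of one way to do that with the AWS SDK for Python (boto3), not a prescribed procedure. The function names, the default of 7 days, and the bucket and prefix arguments are illustrative assumptions. The sketch uses the S3 RestoreObject and HeadObject operations as exposed by boto3.

```python
import re


def build_restore_request(days: int = 7, tier: str = "Standard") -> dict:
    """Build the RestoreRequest payload for the S3 RestoreObject operation."""
    if days < 1:
        raise ValueError("days must be at least 1")
    if tier not in ("Standard", "Bulk", "Expedited"):
        raise ValueError(f"unknown retrieval tier: {tier}")
    return {"Days": days, "GlacierJobParameters": {"Tier": tier}}


def is_restore_complete(restore_header) -> bool:
    """Interpret the Restore value returned by S3 HeadObject.

    The value looks like: ongoing-request="false", expiry-date="..."
    It is absent (None) when no restore has been requested.
    """
    if not restore_header:
        return False
    match = re.search(r'ongoing-request="(true|false)"', restore_header)
    return bool(match) and match.group(1) == "false"


def restore_prefix(bucket: str, prefix: str, days: int = 7) -> None:
    """Request a restore for every archived object under a prefix."""
    import boto3  # imported lazily so the helpers above work without it
    from botocore.exceptions import ClientError

    s3 = boto3.client("s3")
    request = build_restore_request(days)
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            # Only archived storage classes need a restore request.
            if obj.get("StorageClass") not in ("GLACIER", "DEEP_ARCHIVE"):
                continue
            try:
                s3.restore_object(
                    Bucket=bucket, Key=obj["Key"], RestoreRequest=request
                )
            except ClientError as err:
                # A restore that is already running is not treated as a failure.
                if err.response["Error"]["Code"] != "RestoreAlreadyInProgress":
                    raise
```

To check progress, pass the Restore field of a head_object response to is_restore_complete. Restores are asynchronous, so an object becomes readable only after its restore has finished.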

Configuring a table to use restored objects

To configure your Athena table to include restored objects in your queries, you must set its read_restored_glacier_objects table property to true. To do this, you can use the Athena query editor or the AWS Glue console. You can also use the AWS Glue CLI, the AWS Glue API, or the AWS Glue SDK.

Using the Athena query editor

In Athena, you can use the ALTER TABLE SET TBLPROPERTIES command to set the table property, as in the following example.

ALTER TABLE table_name SET TBLPROPERTIES ('read_restored_glacier_objects' = 'true')
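If you prefer to submit the statement programmatically rather than through the query editor, the same command can be sent with the Athena StartQueryExecution operation. The following is a hedged sketch using boto3. The function names, the identifier check, and the output_location argument (an S3 location for query results that you supply) are assumptions made for illustration.

```python
import re


def build_enable_statement(table_name: str) -> str:
    """Return the ALTER TABLE statement that enables restored-object reads."""
    # Accept only plain identifiers, because the name is interpolated into DDL.
    if not re.fullmatch(r"[A-Za-z][A-Za-z0-9_]*", table_name):
        raise ValueError(f"unexpected table name: {table_name!r}")
    return (
        f"ALTER TABLE {table_name} "
        "SET TBLPROPERTIES ('read_restored_glacier_objects' = 'true')"
    )


def enable_restored_objects(database: str, table_name: str, output_location: str) -> str:
    """Submit the statement to Athena and return the query execution ID."""
    import boto3  # imported lazily so build_enable_statement works without it

    athena = boto3.client("athena")
    response = athena.start_query_execution(
        QueryString=build_enable_statement(table_name),
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    return response["QueryExecutionId"]
```

The call returns as soon as the statement is accepted. To confirm that it succeeded, poll get_query_execution with the returned ID.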

Using the AWS Glue console

In the AWS Glue console, perform the following steps to add the read_restored_glacier_objects table property.

To configure table properties in the AWS Glue console
  1. Sign in to the AWS Management Console and open the AWS Glue console at https://console.aws.amazon.com/glue/.

  2. Do one of the following:

    • Choose Go to the Data Catalog.

    • In the navigation pane, choose Data Catalog tables.

  3. On the Tables page, in the list of tables, choose the link for the table that you want to edit.

  4. Choose Actions, Edit table.

  5. On the Edit table page, in the Table properties section, add the following key-value pair.

    • For Key, add read_restored_glacier_objects.

    • For Value, enter true.

  6. Choose Save.

Using the AWS CLI

In the AWS CLI, you can use the AWS Glue update-table command and its --table-input argument to redefine the table and, in doing so, add the read_restored_glacier_objects property. In the --table-input argument, use the Parameters structure to set read_restored_glacier_objects to true. The value of --table-input must not contain spaces, and its double quotes must be escaped with backslashes. In the following example, replace my_database and my_table with the names of your database and table.

aws glue update-table \
   --database-name my_database \
   --table-input={\"Name\":\"my_table\",\"Parameters\":{\"read_restored_glacier_objects\":\"true\"}}
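Writing the escaped, space-free value by hand is error-prone. As an illustration only, the following sketch generates it with Python's standard json module. The function name table_input_argument is hypothetical.

```python
import json


def table_input_argument(table_name: str) -> str:
    """Return the table-input JSON with no spaces and escaped double quotes."""
    table_input = {
        "Name": table_name,
        "Parameters": {"read_restored_glacier_objects": "true"},
    }
    # separators=(",", ":") removes the spaces json.dumps adds by default.
    compact = json.dumps(table_input, separators=(",", ":"))
    # Escape each double quote with a backslash, as the CLI example requires.
    return compact.replace('"', '\\"')


print(table_input_argument("my_table"))
```

For my_table, the printed value is the same string that follows --table-input= in the preceding command.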

The AWS Glue update-table command works in overwrite mode: it replaces the existing table definition with the definition that you specify in the --table-input argument. For this reason, when you add the read_restored_glacier_objects property, make sure that --table-input also specifies every field that you want the table to keep.
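One way to avoid losing fields in overwrite mode is to read the current definition, add the property, and write the whole definition back. The following is a sketch under stated assumptions, not the documented procedure. It uses the boto3 Glue get_table and update_table calls. The function names are hypothetical, and TABLE_INPUT_KEYS is an illustrative list of fields that the TableInput structure accepts, so verify it against the current AWS Glue API reference before relying on it.

```python
# Fields copied from the get_table response into TableInput.
# Assumption: illustrative list; get_table also returns read-only fields
# (for example DatabaseName and CreateTime) that update_table does not accept.
TABLE_INPUT_KEYS = (
    "Name", "Description", "Owner", "LastAccessTime", "LastAnalyzedTime",
    "Retention", "StorageDescriptor", "PartitionKeys", "ViewOriginalText",
    "ViewExpandedText", "TableType", "Parameters", "TargetTable",
)


def with_restored_objects_enabled(table: dict) -> dict:
    """Return a TableInput that keeps the existing definition and adds the property."""
    table_input = {k: v for k, v in table.items() if k in TABLE_INPUT_KEYS}
    # Copy the parameters so the caller's dictionary is not modified.
    parameters = dict(table_input.get("Parameters", {}))
    parameters["read_restored_glacier_objects"] = "true"
    table_input["Parameters"] = parameters
    return table_input


def enable_on_table(database: str, table_name: str) -> None:
    """Read the table, add the property, and write the full definition back."""
    import boto3  # imported lazily so the helper above works without it

    glue = boto3.client("glue")
    table = glue.get_table(DatabaseName=database, Name=table_name)["Table"]
    glue.update_table(
        DatabaseName=database,
        TableInput=with_restored_objects_enabled(table),
    )
```

Because the existing columns, storage descriptor, and parameters are carried over, only the new property changes. Note that this read-then-write sequence is not atomic, so a concurrent change to the table between the two calls could be overwritten.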