JSON SerDe libraries

In Athena, you can use SerDe libraries to deserialize JSON data. Deserialization converts the JSON data so that it can be serialized (written out) into a different format such as Parquet or ORC.

Note

The Hive and OpenX libraries expect each JSON record to be on a single line (not pretty-printed), with records separated by a newline character. The Amazon Ion Hive SerDe does not have that requirement and can be used as an alternative because the Ion data format is a superset of JSON.
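If existing data is pretty-printed, one way to meet the single-line requirement is to rewrite it before querying. The following is a minimal Python sketch, not an official tool: the function name `to_json_lines` is hypothetical, and it assumes the input is either a single JSON value or a top-level array of records small enough to load into memory.

```python
import json


def to_json_lines(pretty_text):
    """Convert a pretty-printed JSON document into newline-delimited JSON.

    A top-level array yields one output line per element; any other
    top-level value yields a single line. (Hypothetical helper.)
    """
    data = json.loads(pretty_text)
    records = data if isinstance(data, list) else [data]
    # json.dumps without `indent` never emits a literal newline: newlines
    # inside string values are escaped as \n, so each record stays on one line.
    return "".join(
        json.dumps(record, separators=(",", ":")) + "\n" for record in records
    )
```

Writing the returned text to a file produces one record per line, which is the layout that the Hive and OpenX libraries expect.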

Library names

Use one of the following:

org.apache.hive.hcatalog.data.JsonSerDe

org.openx.data.jsonserde.JsonSerDe

com.amazon.ionhiveserde.IonHiveSerDe
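A library name is supplied in the ROW FORMAT SERDE clause of a CREATE EXTERNAL TABLE statement. The Python sketch below only assembles such a statement as a string for the two JSON libraries. The helper `build_create_table`, the table name, the columns, and the bucket are all hypothetical placeholders. Tables that use the Ion SerDe typically need additional storage format clauses, which this sketch does not cover.

```python
# Library names from the list above, keyed by a short alias (aliases are hypothetical).
SERDE_LIBRARIES = {
    "hive": "org.apache.hive.hcatalog.data.JsonSerDe",
    "openx": "org.openx.data.jsonserde.JsonSerDe",
}


def build_create_table(table, columns, location, serde="openx"):
    """Return a CREATE EXTERNAL TABLE statement that names a JSON SerDe.

    `columns` is a list of (name, type) pairs; `location` is an S3 prefix.
    """
    cols = ",\n".join(f"  `{name}` {type_}" for name, type_ in columns)
    return (
        f"CREATE EXTERNAL TABLE {table} (\n{cols}\n)\n"
        f"ROW FORMAT SERDE '{SERDE_LIBRARIES[serde]}'\n"
        f"LOCATION '{location}'"
    )


# Placeholder table, columns, and bucket for illustration only.
print(build_create_table(
    "events",
    [("id", "int"), ("name", "string")],
    "s3://amzn-s3-demo-bucket/events/",
))
```

Keeping the library names in one mapping makes it easy to switch a table definition between the Hive and OpenX libraries while leaving the rest of the statement unchanged.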

Additional resources

For more information about working with JSON and nested JSON in Athena, see the following resources:

Create tables in Amazon Athena from nested JSON and mappings using JSONSerDe (AWS Big Data Blog)
I get errors when I try to read JSON data in Amazon Athena (AWS Knowledge Center article)
hive-json-schema (GitHub) – Tool written in Java that generates CREATE TABLE statements from example JSON documents. The generated CREATE TABLE statements use the OpenX JSON SerDe.
