Flatten nested arrays

Focus mode

Flatten nested arrays - Amazon Athena

Use the flatten function Use CROSS JOIN and UNNEST

When working with nested arrays, you often need to expand nested array elements into a single array, or expand the array into multiple rows.

Use the flatten function

To flatten a nested array's elements into a single array of values, use the flatten function. This query returns a row for each element in the array.


SELECT flatten(ARRAY[ ARRAY[1,2], ARRAY[3,4] ]) AS items

This query returns:


+-----------+
| items     |
+-----------+
| [1,2,3,4] |
+-----------+

Use CROSS JOIN and UNNEST

To flatten an array into multiple rows, use CROSS JOIN in conjunction with the UNNEST operator, as in this example:


WITH dataset AS (
  SELECT
    'engineering' as department,
    ARRAY['Sharon', 'John', 'Bob', 'Sally'] as users
)
SELECT department, names FROM dataset
CROSS JOIN UNNEST(users) as t(names)

This query returns:


+----------------------+
| department  | names  |
+----------------------+
| engineering | Sharon |
+----------------------|
| engineering | John   |
+----------------------|
| engineering | Bob    |
+----------------------|
| engineering | Sally  |
+----------------------+

To flatten an array of key-value pairs, transpose selected keys into columns, as in this example:


WITH
dataset AS (
  SELECT
    'engineering' as department,
     ARRAY[
      MAP(ARRAY['first', 'last', 'age'],ARRAY['Bob', 'Smith', '40']),
      MAP(ARRAY['first', 'last', 'age'],ARRAY['Jane', 'Doe', '30']),
      MAP(ARRAY['first', 'last', 'age'],ARRAY['Billy', 'Smith', '8'])
     ] AS people
  )
SELECT names['first'] AS
 first_name,
 names['last'] AS last_name,
 department FROM dataset
CROSS JOIN UNNEST(people) AS t(names)

This query returns:


+--------------------------------------+
| first_name | last_name | department  |
+--------------------------------------+
| Bob        | Smith     | engineering |
| Jane       | Doe       | engineering |
| Billy      | Smith     | engineering |
+--------------------------------------+

From a list of employees, select the employee with the highest combined scores. UNNEST can be used in the FROM clause without a preceding CROSS JOIN as it is the default join operator and therefore implied.


WITH
dataset AS (
  SELECT ARRAY[
    CAST(ROW('Sally', 'engineering', ARRAY[1,2,3,4]) AS ROW(name VARCHAR, department VARCHAR, scores ARRAY(INTEGER))),
    CAST(ROW('John', 'finance', ARRAY[7,8,9]) AS ROW(name VARCHAR, department VARCHAR, scores ARRAY(INTEGER))),
    CAST(ROW('Amy', 'devops', ARRAY[12,13,14,15]) AS ROW(name VARCHAR, department VARCHAR, scores ARRAY(INTEGER)))
  ] AS users
),
users AS (
 SELECT person, score
 FROM
   dataset,
   UNNEST(dataset.users) AS t(person),
   UNNEST(person.scores) AS t(score)
)
SELECT person.name, person.department, SUM(score) AS total_score FROM users
GROUP BY (person.name, person.department)
ORDER BY (total_score) DESC
LIMIT 1

This query returns:


+---------------------------------+
| name | department | total_score |
+---------------------------------+
| Amy  | devops     | 54          |
+---------------------------------+

From a list of employees, select the employee with the highest individual score.


WITH
dataset AS (
 SELECT ARRAY[
   CAST(ROW('Sally', 'engineering', ARRAY[1,2,3,4]) AS ROW(name VARCHAR, department VARCHAR, scores ARRAY(INTEGER))),
   CAST(ROW('John', 'finance', ARRAY[7,8,9]) AS ROW(name VARCHAR, department VARCHAR, scores ARRAY(INTEGER))),
   CAST(ROW('Amy', 'devops', ARRAY[12,13,14,15]) AS ROW(name VARCHAR, department VARCHAR, scores ARRAY(INTEGER)))
 ] AS users
),
users AS (
 SELECT person, score
 FROM
   dataset,
   UNNEST(dataset.users) AS t(person),
   UNNEST(person.scores) AS t(score)
)
SELECT person.name, score FROM users
ORDER BY (score) DESC
LIMIT 1

This query returns:


+--------------+
| name | score |
+--------------+
| Amy  | 15    |
+--------------+

Considerations for CROSS JOIN and UNNEST

If UNNEST is used on one or more arrays in the query, and one of the arrays is NULL, the query returns no rows. If UNNEST is used on an array that is an empty string, the empty string is returned.

For example, in the following query, because the second array is null, the query returns no rows.


SELECT 
    col1, 
    col2 
FROM UNNEST (ARRAY ['apples','oranges','lemons']) AS t(col1)
CROSS JOIN UNNEST (ARRAY []) AS t(col2)

In this next example, the second array is modified to contain an empty string. For each row, the query returns the value in col1 and an empty string for the value in col2. The empty string in the second array is required in order for the values in the first array to be returned.


SELECT 
    col1, 
    col2 
FROM UNNEST (ARRAY ['apples','oranges','lemons']) AS t(col1)
CROSS JOIN UNNEST (ARRAY ['']) AS t(col2)

Warning Javascript is disabled or is unavailable in your browser.

To use the Amazon Web Services Documentation, Javascript must be enabled. Please refer to your browser's Help pages for instructions.

Document Conventions

Access array elements

Create arrays from subqueries

Select your cookie preferences

Customize cookie preferences

Essential

Performance

Functional

Advertising

Unable to save cookie preferences

Flatten nested arrays

Use the flatten function

Use CROSS JOIN and UNNEST

Considerations for CROSS JOIN and UNNEST

On this page

Did this page help you?

Next topic:

Previous topic:

Need help?