Flattening nested arrays
When working with nested arrays, you often need to expand nested array elements into a single array, or expand the array into multiple rows.
Examples
To flatten a nested array's elements into a single array of values, use the
flatten
function. This query returns a row for each element in the
array.
SELECT flatten(ARRAY[ ARRAY[1,2], ARRAY[3,4] ]) AS items
This query returns:
+-----------+
| items |
+-----------+
| [1,2,3,4] |
+-----------+
To flatten an array into multiple rows, use CROSS JOIN
in conjunction
with the UNNEST
operator, as in this example:
WITH dataset AS ( SELECT 'engineering' as department, ARRAY['Sharon', 'John', 'Bob', 'Sally'] as users ) SELECT department, names FROM dataset CROSS JOIN UNNEST(users) as t(names)
This query returns:
+----------------------+
| department | names |
+----------------------+
| engineering | Sharon |
+----------------------|
| engineering | John |
+----------------------|
| engineering | Bob |
+----------------------|
| engineering | Sally |
+----------------------+
To flatten an array of key-value pairs, transpose selected keys into columns, as in this example:
WITH dataset AS ( SELECT 'engineering' as department, ARRAY[ MAP(ARRAY['first', 'last', 'age'],ARRAY['Bob', 'Smith', '40']), MAP(ARRAY['first', 'last', 'age'],ARRAY['Jane', 'Doe', '30']), MAP(ARRAY['first', 'last', 'age'],ARRAY['Billy', 'Smith', '8']) ] AS people ) SELECT names['first'] AS first_name, names['last'] AS last_name, department FROM dataset CROSS JOIN UNNEST(people) AS t(names)
This query returns:
+--------------------------------------+
| first_name | last_name | department |
+--------------------------------------+
| Bob | Smith | engineering |
| Jane | Doe | engineering |
| Billy | Smith | engineering |
+--------------------------------------+
From a list of employees, select the employee with the highest combined scores.
UNNEST
can be used in the FROM
clause without a preceding
CROSS JOIN
as it is the default join operator and therefore
implied.
WITH dataset AS ( SELECT ARRAY[ CAST(ROW('Sally', 'engineering', ARRAY[1,2,3,4]) AS ROW(name VARCHAR, department VARCHAR, scores ARRAY(INTEGER))), CAST(ROW('John', 'finance', ARRAY[7,8,9]) AS ROW(name VARCHAR, department VARCHAR, scores ARRAY(INTEGER))), CAST(ROW('Amy', 'devops', ARRAY[12,13,14,15]) AS ROW(name VARCHAR, department VARCHAR, scores ARRAY(INTEGER))) ] AS users ), users AS ( SELECT person, score FROM dataset, UNNEST(dataset.users) AS t(person), UNNEST(person.scores) AS t(score) ) SELECT person.name, person.department, SUM(score) AS total_score FROM users GROUP BY (person.name, person.department) ORDER BY (total_score) DESC LIMIT 1
This query returns:
+---------------------------------+
| name | department | total_score |
+---------------------------------+
| Amy | devops | 54 |
+---------------------------------+
From a list of employees, select the employee with the highest individual score.
WITH dataset AS ( SELECT ARRAY[ CAST(ROW('Sally', 'engineering', ARRAY[1,2,3,4]) AS ROW(name VARCHAR, department VARCHAR, scores ARRAY(INTEGER))), CAST(ROW('John', 'finance', ARRAY[7,8,9]) AS ROW(name VARCHAR, department VARCHAR, scores ARRAY(INTEGER))), CAST(ROW('Amy', 'devops', ARRAY[12,13,14,15]) AS ROW(name VARCHAR, department VARCHAR, scores ARRAY(INTEGER))) ] AS users ), users AS ( SELECT person, score FROM dataset, UNNEST(dataset.users) AS t(person), UNNEST(person.scores) AS t(score) ) SELECT person.name, score FROM users ORDER BY (score) DESC LIMIT 1
This query returns:
+--------------+
| name | score |
+--------------+
| Amy | 15 |
+--------------+
Considerations and limitations
-
CROSS JOIN
does not generate a Cartesian product if the main (FROM
) table has an empty array orNULL
on the specified column. -
Currently,
UNNEST
can be used only withCROSS JOIN
and does not support otherJOIN
types (for example,LEFT JOIN UNNEST
).