Flatten nested arrays
When working with nested arrays, you often need to expand nested array elements into a single array, or expand the array into multiple rows.
Use the flatten function
To flatten a nested array's elements into a single array of values, use the
flatten
function. This query returns a row for each element in the
array.
SELECT flatten(ARRAY[ ARRAY[1,2], ARRAY[3,4] ]) AS items
This query returns:
+-----------+
| items |
+-----------+
| [1,2,3,4] |
+-----------+
Use CROSS JOIN and UNNEST
To flatten an array into multiple rows, use CROSS JOIN
in conjunction
with the UNNEST
operator, as in this example:
WITH dataset AS ( SELECT 'engineering' as department, ARRAY['Sharon', 'John', 'Bob', 'Sally'] as users ) SELECT department, names FROM dataset CROSS JOIN UNNEST(users) as t(names)
This query returns:
+----------------------+
| department | names |
+----------------------+
| engineering | Sharon |
+----------------------|
| engineering | John |
+----------------------|
| engineering | Bob |
+----------------------|
| engineering | Sally |
+----------------------+
To flatten an array of key-value pairs, transpose selected keys into columns, as in this example:
WITH dataset AS ( SELECT 'engineering' as department, ARRAY[ MAP(ARRAY['first', 'last', 'age'],ARRAY['Bob', 'Smith', '40']), MAP(ARRAY['first', 'last', 'age'],ARRAY['Jane', 'Doe', '30']), MAP(ARRAY['first', 'last', 'age'],ARRAY['Billy', 'Smith', '8']) ] AS people ) SELECT names['first'] AS first_name, names['last'] AS last_name, department FROM dataset CROSS JOIN UNNEST(people) AS t(names)
This query returns:
+--------------------------------------+
| first_name | last_name | department |
+--------------------------------------+
| Bob | Smith | engineering |
| Jane | Doe | engineering |
| Billy | Smith | engineering |
+--------------------------------------+
From a list of employees, select the employee with the highest combined scores.
UNNEST
can be used in the FROM
clause without a preceding
CROSS JOIN
as it is the default join operator and therefore
implied.
WITH dataset AS ( SELECT ARRAY[ CAST(ROW('Sally', 'engineering', ARRAY[1,2,3,4]) AS ROW(name VARCHAR, department VARCHAR, scores ARRAY(INTEGER))), CAST(ROW('John', 'finance', ARRAY[7,8,9]) AS ROW(name VARCHAR, department VARCHAR, scores ARRAY(INTEGER))), CAST(ROW('Amy', 'devops', ARRAY[12,13,14,15]) AS ROW(name VARCHAR, department VARCHAR, scores ARRAY(INTEGER))) ] AS users ), users AS ( SELECT person, score FROM dataset, UNNEST(dataset.users) AS t(person), UNNEST(person.scores) AS t(score) ) SELECT person.name, person.department, SUM(score) AS total_score FROM users GROUP BY (person.name, person.department) ORDER BY (total_score) DESC LIMIT 1
This query returns:
+---------------------------------+
| name | department | total_score |
+---------------------------------+
| Amy | devops | 54 |
+---------------------------------+
From a list of employees, select the employee with the highest individual score.
WITH dataset AS ( SELECT ARRAY[ CAST(ROW('Sally', 'engineering', ARRAY[1,2,3,4]) AS ROW(name VARCHAR, department VARCHAR, scores ARRAY(INTEGER))), CAST(ROW('John', 'finance', ARRAY[7,8,9]) AS ROW(name VARCHAR, department VARCHAR, scores ARRAY(INTEGER))), CAST(ROW('Amy', 'devops', ARRAY[12,13,14,15]) AS ROW(name VARCHAR, department VARCHAR, scores ARRAY(INTEGER))) ] AS users ), users AS ( SELECT person, score FROM dataset, UNNEST(dataset.users) AS t(person), UNNEST(person.scores) AS t(score) ) SELECT person.name, score FROM users ORDER BY (score) DESC LIMIT 1
This query returns:
+--------------+
| name | score |
+--------------+
| Amy | 15 |
+--------------+
Considerations for CROSS JOIN and UNNEST
If UNNEST
is used on one or more arrays in the query, and one of the
arrays is NULL
, the query returns no rows. If UNNEST
is
used on an array that is an empty string, the empty string is returned.
For example, in the following query, because the second array is null, the query returns no rows.
SELECT col1, col2 FROM UNNEST (ARRAY ['apples','oranges','lemons']) AS t(col1) CROSS JOIN UNNEST (ARRAY []) AS t(col2)
In this next example, the second array is modified to contain an empty string. For
each row, the query returns the value in col1
and an empty string for
the value in col2
. The empty string in the second array is required in
order for the values in the first array to be returned.
SELECT col1, col2 FROM UNNEST (ARRAY ['apples','oranges','lemons']) AS t(col1) CROSS JOIN UNNEST (ARRAY ['']) AS t(col2)