Using aggregation functions with arrays - Amazon Athena

Using aggregation functions with arrays

  • To add values within an array, use SUM, as in the following example.

  • To aggregate multiple rows within an array, use array_agg. For information, see Creating arrays from subqueries.

Note

ORDER BY is supported for aggregation functions starting in Athena engine version 2.

WITH dataset AS ( SELECT ARRAY [ ARRAY[1,2,3,4], ARRAY[5,6,7,8], ARRAY[9,0] ] AS items ), item AS ( SELECT i AS array_items FROM dataset, UNNEST(items) AS t(i) ) SELECT array_items, sum(val) AS total FROM item, UNNEST(array_items) AS t(val) GROUP BY array_items;

In the last SELECT statement, instead of using sum() and UNNEST, you can use reduce() to decrease processing time and data transfer, as in the following example.

WITH dataset AS ( SELECT ARRAY [ ARRAY[1,2,3,4], ARRAY[5,6,7,8], ARRAY[9,0] ] AS items ), item AS ( SELECT i AS array_items FROM dataset, UNNEST(items) AS t(i) ) SELECT array_items, reduce(array_items, 0 , (s, x) -> s + x, s -> s) AS total FROM item;

Either query returns the following results. The order of returned results is not guaranteed.

+----------------------+ | array_items | total | +----------------------+ | [1, 2, 3, 4] | 10 | | [5, 6, 7, 8] | 26 | | [9, 0] | 9 | +----------------------+