FIRST_VALUE window function - Amazon Redshift

FIRST_VALUE window function

Given an ordered set of rows, FIRST_VALUE returns the value of the specified expression with respect to the first row in the window frame.

For information about selecting the last row in the frame, see LAST_VALUE window function .

Syntax

FIRST_VALUE( expression )[ IGNORE NULLS | RESPECT NULLS ] OVER ( [ PARTITION BY expr_list ] [ ORDER BY order_list frame_clause ] )

Arguments

expression

The target column or expression that the function operates on.

IGNORE NULLS

When this option is used with FIRST_VALUE, the function returns the first value in the frame that is not NULL (or NULL if all values are NULL).

RESPECT NULLS

Indicates that Amazon Redshift should include null values in the determination of which row to use. RESPECT NULLS is supported by default if you do not specify IGNORE NULLS.

OVER

Introduces the window clauses for the function.

PARTITION BY expr_list

Defines the window for the function in terms of one or more expressions.

ORDER BY order_list

Sorts the rows within each partition. If no PARTITION BY clause is specified, ORDER BY sorts the entire table. If you specify an ORDER BY clause, you must also specify a frame_clause.

The results of the FIRST_VALUE function depends on the ordering of the data. The results are nondeterministic in the following cases:

  • When no ORDER BY clause is specified and a partition contains two different values for an expression

  • When the expression evaluates to different values that correspond to the same value in the ORDER BY list.

frame_clause

If an ORDER BY clause is used for an aggregate function, an explicit frame clause is required. The frame clause refines the set of rows in a function's window, including or excluding sets of rows in the ordered result. The frame clause consists of the ROWS keyword and associated specifiers. See Window function syntax summary.

Return type

These functions support expressions that use primitive Amazon Redshift data types. The return type is the same as the data type of the expression.

Examples

The following examples use the VENUE table from the sample TICKIT data. For more information, see Sample database.

The following example returns the seating capacity for each venue in the VENUE table, with the results ordered by capacity (high to low). The FIRST_VALUE function is used to select the name of the venue that corresponds to the first row in the frame: in this case, the row with the highest number of seats. The results are partitioned by state, so when the VENUESTATE value changes, a new first value is selected. The window frame is unbounded so the same first value is selected for each row in each partition.

For California, Qualcomm Stadium has the highest number of seats (70561), so this name is the first value for all of the rows in the CA partition.

select venuestate, venueseats, venuename, first_value(venuename) over(partition by venuestate order by venueseats desc rows between unbounded preceding and unbounded following) from (select * from venue where venueseats >0) order by venuestate; venuestate | venueseats | venuename | first_value -----------+------------+--------------------------------+------------------------------ CA | 70561 | Qualcomm Stadium | Qualcomm Stadium CA | 69843 | Monster Park | Qualcomm Stadium CA | 63026 | McAfee Coliseum | Qualcomm Stadium CA | 56000 | Dodger Stadium | Qualcomm Stadium CA | 45050 | Angel Stadium of Anaheim | Qualcomm Stadium CA | 42445 | PETCO Park | Qualcomm Stadium CA | 41503 | AT&T Park | Qualcomm Stadium CA | 22000 | Shoreline Amphitheatre | Qualcomm Stadium CO | 76125 | INVESCO Field | INVESCO Field CO | 50445 | Coors Field | INVESCO Field DC | 41888 | Nationals Park | Nationals Park FL | 74916 | Dolphin Stadium | Dolphin Stadium FL | 73800 | Jacksonville Municipal Stadium | Dolphin Stadium FL | 65647 | Raymond James Stadium | Dolphin Stadium FL | 36048 | Tropicana Field | Dolphin Stadium ...

The following example shows the use of the IGNORE NULLS option and relies on the addition of a new row to the VENUE table:

insert into venue values(2000,null,'Stanford','CA',90000);

This new row contains a NULL value for the VENUENAME column. Now repeat the FIRST_VALUE query that was shown earlier in this section:

select venuestate, venueseats, venuename, first_value(venuename) over(partition by venuestate order by venueseats desc rows between unbounded preceding and unbounded following) from (select * from venue where venueseats >0) order by venuestate; venuestate | venueseats | venuename | first_value -----------+------------+----------------------------+------------- CA | 90000 | NULL | NULL CA | 70561 | Qualcomm Stadium | NULL CA | 69843 | Monster Park | NULL ...

Because the new row contains the highest VENUESEATS value (90000) and its VENUENAME is NULL, the FIRST_VALUE function returns NULL for the CA partition. To ignore rows like this in the function evaluation, add the IGNORE NULLS option to the function argument:

select venuestate, venueseats, venuename, first_value(venuename) ignore nulls over(partition by venuestate order by venueseats desc rows between unbounded preceding and unbounded following) from (select * from venue where venuestate='CA') order by venuestate; venuestate | venueseats | venuename | first_value ------------+------------+----------------------------+------------------ CA | 90000 | NULL | Qualcomm Stadium CA | 70561 | Qualcomm Stadium | Qualcomm Stadium CA | 69843 | Monster Park | Qualcomm Stadium ...