is a verbose language, and one of the most verbose features are window functions.

In a stack overflow question that I’ve encountered recently, someone asked to calculate the difference between the first and the last value in a time series for any given day:

Input

```
value   timestamp
---------------------------
29011   2012-12-28 09:00:00
28701   2012-12-28 10:00:00
28830   2012-12-28 11:00:00
28353   2012-12-28 12:00:00
28642   2012-12-28 13:00:00
28583   2012-12-28 14:00:00
28800   2012-12-29 09:00:00
28751   2012-12-29 10:00:00
28670   2012-12-29 11:00:00
28621   2012-12-29 12:00:00
28599   2012-12-29 13:00:00
28278   2012-12-29 14:00:00
```

Desired output

```
first  last   difference  date
------------------------------------
29011  28583  428         2012-12-28
28800  28278  522         2012-12-29
```

How to write the query?

Notice that the value and timestamp progression do not correlate as it may appear. So, there is not a rule that if `Timestamp2 > Timestamp1` then `Value2 < Value1`. Otherwise, this simple query would work (using PostgreSQL syntax):

```
SELECT
max(value)              AS first,
min(value)              AS last,
max(value) - min(value) AS difference,
CAST(timestamp AS DATE) AS date
FROM t
GROUP BY CAST(timestamp AS DATE);
```

There are several ways to find the first and last values within a group that do not involve functions. For example:

• In Oracle, you can use the FIRST and LAST functions, which for some arcane reason are not written `FIRST(...) WITHIN GROUP (ORDER BY ...)` or `LAST(...) WITHIN GROUP (ORDER BY ...)`, like other sorted set aggregate functions, but `some_aggregate_function(...) KEEP (DENSE_RANK FIRST ORDER BY ...)`. Go figure
• In PostgreSQL, you could use the `DISTINCT ON` syntax along with `ORDER BY` and `LIMIT`

More details about the various approaches can be found here:
https://blog.jooq.org/2017/09/22/how-to-write-efficient-top-n-queries-in-sql

The best performing approach would be to use an aggregate function like Oracle’s, but few databases have this function. So, we’ll resort to using the `FIRST_VALUE` and `LAST_VALUE` window functions:

```
SELECT DISTINCT
first_value(volume) OVER (
PARTITION BY CAST(tstamp AS DATE)
ORDER BY tstamp
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
) AS first,
last_value(volume) OVER (
PARTITION BY CAST(tstamp AS DATE)
ORDER BY tstamp
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
) AS last,
first_value(volume) OVER (
PARTITION BY CAST(tstamp AS DATE)
ORDER BY tstamp
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
)
- last_value(volume) OVER (
PARTITION BY CAST(tstamp AS DATE)
ORDER BY tstamp
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
) AS diff,
CAST(tstamp AS DATE) AS date
FROM t
ORDER BY CAST(tstamp AS DATE)
```

Oops 🤔

That doesn’t look too readable. But it will yield the correct result. Granted, we could wrap the definition for the columns `FIRST` and `LAST` in a derived table, but that would still leave us with two repetitions of the window definition:

```
PARTITION BY CAST(tstamp AS DATE)
ORDER BY tstamp
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
```

WINDOW clause to the rescue

Luckily, at least 3 databases have implemented the SQL standard `WINDOW` clause:

• MySQL
• PostgreSQL
• Sybase SQL Anywhere

The above query can be refactored to this one:

```
SELECT DISTINCT
first_value(volume) OVER w AS first,
last_value(volume) OVER w AS last,
first_value(volume) OVER w
- last_value(volume) OVER w AS diff,
CAST(tstamp AS DATE) AS date
FROM t
WINDOW w AS (
PARTITION BY CAST(tstamp AS DATE)
ORDER BY tstamp
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
)
ORDER BY CAST(tstamp AS DATE)
```

Notice how I can specify a window name with a window specification in a similar way as I can define a common table expression (`WITH` clause):

```
WINDOW
<window-name> AS (<window-specification>)
{  ,<window-name> AS (<window-specification>)... }
```

Not only can I reuse entire specifications, I could also build a specification from a partial specification, and reuse only parts. My previous query could have been rewritten as such:

```
SELECT DISTINCT
first_value(volume) OVER w3 AS first,
last_value(volume) OVER w3 AS last,
first_value(volume) OVER w3
- last_value(volume) OVER w3 AS diff,
CAST(tstamp AS DATE) AS date
FROM t
WINDOW
w1 AS (PARTITION BY CAST(tstamp AS DATE)),
w2 AS (w1 ORDER BY tstamp),
w3 AS (w2 ROWS BETWEEN UNBOUNDED PRECEDING
AND UNBOUNDED FOLLOWING)
ORDER BY CAST(tstamp AS DATE)
```

Each window specification can be created from scratch, or be based on a previously defined window specification. Note this is also true when referencing the window definition. If I wanted to reuse the `PARTITION BY` clause and the `ORDER BY` clause, but change the `FRAME` clause (`ROWS ...`), then I could have written this:

```
SELECT DISTINCT
first_value(volume) OVER (
w2 ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) AS first,
last_value(volume) OVER (
w2 ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
) AS last,
first_value(volume) OVER (
w2 ROWS BETWEEN UNBOUNDED PRECEDING
) - last_value(volume) OVER (
w2 ROWS BETWEEN 1 PRECEDING AND UNBOUNDED FOLLOWING
) AS diff,
CAST(tstamp AS DATE) AS date
FROM t
WINDOW
w1 AS (PARTITION BY CAST(tstamp AS DATE)),
w2 AS (w1 ORDER BY tstamp)
ORDER BY CAST(tstamp AS DATE)
```

What if my database doesn’t support the WINDOW clause?

In that case, you have to either manually write the window specification on each window function, or you use a SQL builder like , which can emulate the window clause:

You can try this translation online on our website: https://www.jooq.org/translate

Source link https://blog.jooq.org/2018/09/20/how-to----using-the-sql-window-clause/