Window Functions in SQL: A Working Cheatsheet

Window functions let you compute a value for each row while still seeing other rows in the dataset. ROW_NUMBER, running totals, period-over-period deltas, top-N per group: all of these collapse into a single SELECT with OVER(). Without window functions you reach for self-joins or Python. With them, the same logic fits on two lines. Running totals, rank within group, period-over-period deltas, top-N per category — these all collapse into a single SELECT with OVER(...). Without window functions you reach for self-joins, correlated subqueries, or a Python loop. With them, the same logic fits on two lines and the planner can use an index.

In This Article

Anatomy of OVER(...)
The three families
A worked example: customer cohort revenue
Common production gotchas
Takeaway

This is the reference we’ve sent to every analyst we’ve onboarded since 2020, expanded for 2026.

Anatomy of `OVER(...)`

Every window function follows the same shape:

function_name(args) OVER (
  PARTITION BY <columns>   -- optional: split rows into independent windows
  ORDER BY    <columns>    -- optional: sequence rows inside each window
  <frame_clause>           -- optional: which rows count as "the window"
)

The three optional pieces interact:

PARTITION BY splits the dataset into independent groups. The function reruns inside each group. Without it, the whole table is one group.
ORDER BY sorts rows inside each partition. Without it, the order is engine-dependent and any function that cares about order (ROW_NUMBER, LAG, running totals) gives nondeterministic results.
The frame clause picks which rows in the partition contribute to the current row’s value. The default depends on whether you wrote ORDER BY — and the default is rarely what you want for running calculations. We’ll come back to this.

The three families

There are dozens of window functions but only three categories matter in practice.

1. Ranking functions

Assign each row a position within its partition.

Function	What it returns
`ROW_NUMBER()`	1, 2, 3 — strictly increasing, no ties
`RANK()`	1, 2, 2, 4 — ties get the same rank, next rank skips
`DENSE_RANK()`	1, 2, 2, 3 — ties get the same rank, next rank does NOT skip
`NTILE(n)`	Splits the partition into `n` equal-sized buckets, returns 1..n
`PERCENT_RANK()`	Relative rank in [0, 1]

Most common use: pick top-N per group.

SELECT *
FROM (
  SELECT
    customer_id,
    order_date,
    order_value,
    ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY order_value DESC) AS rn
  FROM orders
) ranked
WHERE rn <= 3;

This returns the three largest orders per customer. Without window functions you’d be writing a correlated subquery and watching the query planner cry.

RANK vs DENSE_RANK vs ROW_NUMBER: use ROW_NUMBER when you need exactly N results (ties are broken arbitrarily but the count is predictable). Use RANK when ties should share a rank and the next rank should skip (1, 2, 2, 4). Use DENSE_RANK when ties share a rank and the next rank should not skip (1, 2, 2, 3). In practice: top-N filtering uses ROW_NUMBER, leaderboards use DENSE_RANK.

2. Aggregate functions over a window

Every familiar aggregate — SUM, AVG, COUNT, MIN, MAX — can also be a window function. Instead of collapsing rows to one, it returns the aggregate for each row’s window.

The cleanest use is the running total:

SELECT
  order_date,
  revenue,
  SUM(revenue) OVER (ORDER BY order_date) AS running_revenue
FROM daily_revenue;

This returns a running cumulative sum, one row per date. Same shape works for averages, counts, anything aggregate.

Two ways this goes wrong:

Forgetting ORDER BY. SUM(revenue) OVER () returns the grand total on every row, not a running total. Useful for “what fraction of revenue does this row represent” but not for time-series accumulation.
Forgetting that the default frame includes all earlier rows when ORDER BY is present. Specifically, with ORDER BY the default is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW — i.e. running total. Without ORDER BY it’s ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING — i.e. the entire partition. These give wildly different answers and people regularly mix them up.

3. Offset functions

Reach into the row before or after the current one inside the same partition.

Function	What it returns
`LAG(col, n)`	Value of `col` from the row `n` positions earlier (default `n=1`)
`LEAD(col, n)`	Value of `col` from the row `n` positions later
`FIRST_VALUE(col)`	Value of `col` from the first row in the window
`LAST_VALUE(col)`	Value of `col` from the last row in the window (gotcha: frame-dependent)
`NTH_VALUE(col, n)`	Value from the n-th row

The bread-and-butter use is period-over-period delta:

SELECT
  order_date,
  revenue,
  revenue - LAG(revenue) OVER (ORDER BY order_date) AS day_over_day_delta
FROM daily_revenue;

Or month-over-month growth, with a PARTITION BY year:

SELECT
  year,
  month,
  revenue,
  revenue / NULLIF(LAG(revenue) OVER (PARTITION BY year ORDER BY month), 0) - 1 AS mom_growth
FROM monthly_revenue;

LAST_VALUE is the trap. By default, the frame ends at the current row, so LAST_VALUE(col) OVER (ORDER BY x) returns the current row’s value, not the last row of the partition. To actually get the last row, you have to extend the frame:

LAST_VALUE(col) OVER (
  ORDER BY x
  ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
)

In practice: use FIRST_VALUE for the first row, MAX(col) OVER (PARTITION BY ...) for the last. Both avoid the LAST_VALUE frame trap and are readable by anyone who hasn’t memorized the frame-clause default.

A worked example: customer cohort revenue

Suppose you have an orders table with customer_id, signup_date, order_date, order_value. You want:

Each customer’s first order
Each customer’s running total spend
The customer’s rank within their signup cohort by lifetime spend

One query:

SELECT
  customer_id,
  signup_date,
  order_date,
  order_value,
  FIRST_VALUE(order_date) OVER (
    PARTITION BY customer_id ORDER BY order_date
  ) AS first_order_date,
  SUM(order_value) OVER (
    PARTITION BY customer_id ORDER BY order_date
  ) AS running_spend,
  DENSE_RANK() OVER (
    PARTITION BY DATE_TRUNC('month', signup_date)
    ORDER BY SUM(order_value) OVER (PARTITION BY customer_id) DESC
  ) AS cohort_spend_rank
FROM orders;

Three window functions, three different partitions, one pass through the data. The same logic done with joins is 40 lines and 10 times slower.

Common production gotchas

A few we’ve hit in client engagements:

High-cardinality PARTITION BY. Partitioning by user_id on a 500M-row events table can blow up memory if the planner doesn’t choose a hash partitioning strategy. Test on a sample before assuming it’ll fit.
Window functions can’t appear in a WHERE clause. You need a subquery or CTE to filter on the result (see the top-N pattern above). They also can’t appear in GROUP BY directly — they execute after GROUP BY.
ORDER BY ties. If your partition has 2,000 rows per date and you ORDER BY date, the order of rows within a date is engine-dependent. ROW_NUMBER will assign different numbers on different runs. Always add a tiebreaker: ORDER BY date, id.
Frame clauses default differently per engine. PostgreSQL, ClickHouse, BigQuery, and Snowflake all behave the same on the standard, but legacy MS SQL Server and older MySQL have quirks. When porting queries between engines, spot-check.
Window functions on ClickHouse have engine-specific gotchas. The groupArrayMovingSum + ARRAY JOIN pattern is sometimes faster than SUM() OVER ORDER BY and sometimes slower; benchmark on your data. See Cumulative Sum in SQL for the comparison.

Takeaway

Three things to remember when you reach for window functions:

The three families are ranking, aggregate-over-window, and offset. Once you can name which family fits the question, the syntax is mechanical.
ORDER BY inside the OVER clause is almost always required and almost always needs an explicit tiebreaker. Without it, results are nondeterministic.
Frame clauses default to “everything up to current row” with ORDER BY and “the entire partition” without. Knowing which one you’re getting is the difference between a running total and a grand total.

If you find yourself writing a self-join or correlated subquery in 2026, take 5 minutes and try expressing it as a window function first. About 4 times out of 5, it’s cleaner and faster.

Window Functions in SQL: The Cheatsheet That Actually Made Sense

Anatomy of `OVER(...)`

The three families

1. Ranking functions

2. Aggregate functions over a window

3. Offset functions

A worked example: customer cohort revenue

Common production gotchas

Takeaway

Keep reading

Let's Discuss Your Project

Window Functions in SQL: The Cheatsheet That Actually Made Sense

Anatomy of OVER(...)

The three families

1. Ranking functions

2. Aggregate functions over a window

3. Offset functions

A worked example: customer cohort revenue

Common production gotchas

Takeaway

Keep reading

Enjoyed this article?

Let's Discuss Your Project

Anatomy of `OVER(...)`