Re: Row pattern recognition

From: Tatsuo Ishii <ishii(at)postgresql(dot)org>
To: assam258(at)gmail(dot)com
Cc: jian(dot)universality(at)gmail(dot)com, zsolt(dot)parragi(at)percona(dot)com, sjjang112233(at)gmail(dot)com, vik(at)postgresfriends(dot)org, er(at)xs4all(dot)nl, jacob(dot)champion(at)enterprisedb(dot)com, david(dot)g(dot)johnston(at)gmail(dot)com, peter(at)eisentraut(dot)org, li(dot)evan(dot)chao(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Row pattern recognition
Date: 2026-06-19 08:57:28
Message-ID: 20260619.175728.667662038194755204.ishii@postgresql.org
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Henson,

> Hi hackers,
>
> When a WINDOW clause is not referenced by any window function, the pattern
> is never matched and the DEFINE expressions are never evaluated.
>
> One consequence is that a DEFINE which would raise a run-time error raises
> nothing when the window is unused. The same window definition behaves
> differently depending only on whether a window function consumes it:
>
> CREATE TABLE t (id int, v int);
> INSERT INTO t VALUES (1, 10), (2, 20), (3, 15);
>
> -- (1) the window IS used (count(*) OVER w): the DEFINE is evaluated
>
> SELECT count(*) OVER w AS cnt
> FROM t
> WINDOW w AS (
> ORDER BY id
> ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
> INITIAL
> PATTERN (A+)
> DEFINE A AS (1 / (v - v)) > 0
> );
> -- ERROR: division by zero
>
> -- (2) the window is NOT used (no window function): same definition
>
> SELECT v
> FROM t
> WINDOW w AS (
> ORDER BY id
> ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
> INITIAL
> PATTERN (A+)
> DEFINE A AS (1 / (v - v)) > 0
> );
> -- v
> -- ----
> -- 10
> -- 20
> -- 15
> -- (no error)
>
> EXPLAIN (VERBOSE, COSTS OFF) of (2) confirms the window, and with it the
> DEFINE expression, are gone entirely:
>
> Seq Scan on public.t
> Output: v, id
>
> (A column-dependent division by zero is used so that the expression does not
> constant-fold and can only error at run time.)
>
> This follows from existing, general planner behavior. A window definition
> that no window function references is removed from the plan by
> select_active_windows(); the planner does this for every window, because a
> window with no consumer produces no output, so sorting or partitioning it
> would be wasted work. Before RPR that reasoning was complete: an ordinary
> window has nothing but its output, so dropping an unused one changes nothing
> observable.
>
> RPR is what turns this into a question. A DEFINE clause is a per-row
> predicate that can have a run-time effect of its own -- here, raising an
> error -- independent of any output the window produces. So the assumption
> behind the optimization, that an unused window has no observable behavior,
> no
> longer holds automatically once a window carries a DEFINE. Whether the
> existing "drop it" optimization should still extend to an RPR window is a
> decision RPR's addition forces, not something the prior behavior settles.
>
> The question is whether the current behavior is what we want:
>
> (a) Keep it. Skipping the pattern matching for a window that produces
> nothing is the natural optimization, and an expression that is never
> evaluated raising no error is normal behavior.
>
> (b) Run the matching anyway, even though there is no output to produce, so
> that DEFINE errors are raised regardless of whether a window function
> consumes the window.
>
> Option (a) is cheaper and consistent with how unevaluated expressions
> behave, but I lean towards (b): a faulty DEFINE should fail consistently
> rather than pass silently just because no window function happens to consume
> the window. I would like to hear what you think.

I think we should follow (a).

From ISO/IEC 9075-2:2016 7.15 <window clause> General Rules:
------------------------------------
1) Let SL be the <select list> of the <query specification> or <select
statement: single row> that immediately contains TE [1].

Case:

a) If SL does not simply contain a <window function>, then the <window
clause> is disregarded, and the result of TE is the result of the last
<from clause>, <where clause>, <group by clause> or <having clause> of
TE.
------------------------------------
[1] TE: Table expression

So I think the standard requires a window clause to be disregarded if
window function is not included in the select list. As DEFINE is a
part of a window clause, it should be disregarded if there's no window
function in the window clause too.

> behave, but I lean towards (b): a faulty DEFINE should fail consistently
> rather than pass silently just because no window function happens to consume
> the window. I would like to hear what you think.

But we already pass faulty window clauses. Example:

-- If window function exists, faulty window clause (invalid frame
-- ending offset) is detected.
PREPARE prep AS
SELECT count(*) OVER w
FROM generate_series(1,5) g(i)
WINDOW w AS (
ROWS BETWEEN CURRENT ROW AND $1 FOLLOWING
);
PREPARE
EXECUTE prep(-1);
psql:prepare.sql:7: ERROR: frame ending offset must not be negative

DEALLOCATE prep;
DEALLOCATE

-- But if window function does not exist, the faulty window clause
-- (invalid frame ending offset) is not detected.
PREPARE prep AS
SELECT i
FROM generate_series(1,5) g(i)
WINDOW w AS (
ROWS BETWEEN CURRENT ROW AND $1 FOLLOWING
);
PREPARE
EXECUTE prep(-1);
i
---
1
2
3
4
5
(5 rows)

I think if we detect faulty DEFINE in the last case , it's not only
against the standard but against our existing behavior.

Regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Jeyaprakash Rajamani 2026-06-19 09:34:00 Re: Performance Degradation (Table becomes bloat) During Repeated Bulk UPDATE Operations
Previous Message Matthias van de Meent 2026-06-19 08:26:31 Re: Multi-Entry Indexing for GiST & SP-GiST