Re: Row pattern recognition

From: Henson Choi <assam258(at)gmail(dot)com>
To: ishii(at)postgresql(dot)org, jian(dot)universality(at)gmail(dot)com
Cc: pgsql-hackers(at)postgresql(dot)org, zsolt(dot)parragi(at)percona(dot)com, sjjang112233(at)gmail(dot)com, vik(at)postgresfriends(dot)org, er(at)xs4all(dot)nl, jacob(dot)champion(at)enterprisedb(dot)com, david(dot)g(dot)johnston(at)gmail(dot)com, peter(at)eisentraut(dot)org, li(dot)evan(dot)chao(at)gmail(dot)com
Subject: Re: Row pattern recognition
Date: 2026-07-03 22:22:10
Message-ID: CAAAe_zBMvzn6ZwmTPhsi0mP6FPVv3hZ2cb1Xe3KD8brT5EPTFg@mail.gmail.com
Lists: pgsql-hackers

Hi Tatsuo,

> In v50-0006-tidy-plumbing.patch, the planner cost model seems slightly
> changed. Before the cost was charged according to the number of
> pattern variables in PATTERN clause. But now it is charged according
> to the number of pattern variables in DEFINE clause. Maybe I missed
> the discussion on the changing. Can you please explain the reason of
> the change?

The change is a simplification that came out of Jian's v48 review, and it
does not change the cost the model produces.

It follows Jian's suggestion from 2026-06-15 [1]:

> collectPatternVariables is not needed.
> The parser already ensures every DEFINE variable appears in PATTERN,
> so there is nothing to filter.
> Also, we don't really do anything special (like make a dummy Const)
> regarding PATTERN variables that not appearing in the DEFINE clause.

The two loops charge exactly the same set, for the following reasons:

- The old loop walked the unique PATTERN variables (collectPatternVariables
deduplicates, returning each name once) and, for each, looked up the
matching DEFINE entry by resname and charged that DEFINE's cost.
- Every DEFINE variable is guaranteed to appear in PATTERN -- the parser
rejects a DEFINE variable that is not used in PATTERN (errmsg "DEFINE
variable \"%s\" is not used in PATTERN" in parse_rpr.c). So the set of
DEFINE variables is always a subset of the unique PATTERN variables. Each
DEFINE resname is also unique.
- A PATTERN variable that has no DEFINE contributes nothing to the old
loop, because the inner resname lookup finds no match.

So the old loop already charged each DEFINE expression exactly once, and
nothing else. Iterating defineClause directly, as v50-0006 does, visits
precisely that same set once each. The estimate is unchanged; only the
redundant outer walk over PATTERN and the per-variable resname lookup into
the DEFINE clause are removed.
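
As a toy model of the argument (not the planner code; the function names,
variable names, and costs below are made up for illustration), the two loops
can be sketched in Python. It assumes the two invariants above: every DEFINE
resname appears in PATTERN, and DEFINE resnames are unique.

```python
# Toy model only: hypothetical names and costs, not PostgreSQL source.

def old_cost(pattern_vars, define_clause):
    """Old shape: walk unique PATTERN variables, look up DEFINE by resname."""
    unique_vars = []
    for name in pattern_vars:            # deduplicate, keeping first occurrence
        if name not in unique_vars:
            unique_vars.append(name)

    total = 0.0
    for name in unique_vars:
        for resname, cost in define_clause:
            if resname == name:          # a variable with no DEFINE never matches
                total += cost
                break
    return total


def new_cost(define_clause):
    """New shape: iterate the DEFINE entries directly."""
    total = 0.0
    for _resname, cost in define_clause:
        total += cost
    return total


# PATTERN (A B A C): A appears twice, C has no DEFINE entry.
pattern_vars = ["A", "B", "A", "C"]
define_clause = [("A", 0.25), ("B", 0.5)]

assert old_cost(pattern_vars, define_clause) == new_cost(define_clause)
```

The duplicate A is charged once and the undefined C is charged nothing in
both versions, so the totals agree.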

This also matches the premise of the cost model we settled on back in
February: the NFA executor evaluates every DEFINE expression once per row,
so the natural unit for the per-tuple charge is the DEFINE variable.

[1] https://www.postgresql.org/message-id/CACJufxFAQhbOD9EVCTAy-VwDbG4446N10GsxCcgdpFnjHO1Efw%40mail.gmail.com

Best regards,
Henson
