Re: Row pattern recognition

From: Tatsuo Ishii <ishii(at)postgresql(dot)org>
To: assam258(at)gmail(dot)com
Cc: jian(dot)universality(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org, zsolt(dot)parragi(at)percona(dot)com, sjjang112233(at)gmail(dot)com, vik(at)postgresfriends(dot)org, er(at)xs4all(dot)nl, jacob(dot)champion(at)enterprisedb(dot)com, david(dot)g(dot)johnston(at)gmail(dot)com, peter(at)eisentraut(dot)org, li(dot)evan(dot)chao(at)gmail(dot)com
Subject: Re: Row pattern recognition
Date: 2026-07-04 07:28:41
Message-ID: 20260704.162841.39857602849942465.ishii@postgresql.org
Lists: pgsql-hackers

> The change is a simplification that came out of Jian's v48 review, and it
> does not change the cost the model produces.
>
> It follows Jian's suggestion from 2026-06-15 [1]:
>
>> collectPatternVariables is not needed.
>> The parser already ensures every DEFINE variable appears in PATTERN,
>> so there is nothing to filter.
>> Also, we don't really do anything special (like make a dummy Const)
>> regarding PATTERN variables that not appearing in the DEFINE clause.
>
> The two loops charge exactly the same set, for the following reason:
>
> - The old loop walked the unique PATTERN variables (collectPatternVariables
> deduplicates, returning each name once) and, for each, looked up the
> matching DEFINE entry by resname and charged that DEFINE's cost.
> - Every DEFINE variable is guaranteed to appear in PATTERN -- the parser
> rejects a DEFINE variable that is not used in PATTERN (errmsg "DEFINE
> variable \"%s\" is not used in PATTERN" in parse_rpr.c). So the DEFINE
> clause is always a subset of the unique PATTERN variables, and each
> DEFINE resname is unique.
> - A PATTERN variable that has no DEFINE contributes nothing to the old
> loop, because the inner resname lookup finds no match.
>
> So the old loop already charged each DEFINE expression exactly once, and
> nothing else. Iterating defineClause directly, as v50-0006 does, visits
> precisely that same set once each. The estimate is unchanged; only the
> redundant outer walk over PATTERN and the per-variable resname lookup into
> the DEFINE clause are removed.

Ok, thanks for the explanation.
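
To convince myself, here is a toy model of the two loops in Python. It is
only a sketch, not the planner code: the names cost_old and cost_new, the
list-of-pairs representation of the DEFINE clause, and the cost numbers are
all made up for illustration. It assumes what the parser guarantees, namely
that DEFINE names are unique and each one appears in PATTERN.

```python
def cost_old(pattern_vars, define_clause):
    """Old shape: walk the unique PATTERN variables, look each one up in DEFINE."""
    total = 0.0
    seen = set()
    for name in pattern_vars:
        if name in seen:          # deduplicate, as collectPatternVariables did
            continue
        seen.add(name)
        for resname, cost in define_clause:
            if resname == name:   # matching DEFINE entry found: charge it
                total += cost
                break
        # a PATTERN variable without a DEFINE falls through and adds nothing
    return total


def cost_new(define_clause):
    """New shape: iterate the DEFINE clause directly, charging each entry once."""
    return sum(cost for _, cost in define_clause)


# Hypothetical input: A appears twice in PATTERN, D has no DEFINE entry.
pattern_vars = ["A", "B", "A", "C", "D"]
define_clause = [("A", 1.0), ("B", 2.5), ("C", 4.0)]

print(cost_old(pattern_vars, define_clause), cost_new(define_clause))
```

Both functions charge A, B and C once each and ignore D, so the totals agree
for any input that satisfies the parser's guarantee.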

> This also matches the premise of the cost model we settled on back in
> February: the NFA executor evaluates every DEFINE expression once per row,
> so the natural unit for the per-tuple charge is the DEFINE variable.

BTW, I was thinking about cases where the same DEFINE variable appears
two or more times in PATTERN for the same row, for example PATTERN
(A|A). In this case the pattern would be optimized to (A), so we don't
need to worry about A appearing twice. Our cost model is therefore
correct in this case.
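
As a rough illustration of the kind of rewrite I mean, here is a toy sketch in
Python. It is not the actual pattern optimizer: dedupe_alternation and the
tuple representation of an alternation branch are hypothetical, and the sketch
only handles identical branches at a single alternation level.

```python
def dedupe_alternation(branches):
    """Drop repeated branches of an alternation, keeping first occurrences."""
    seen = set()
    result = []
    for branch in branches:
        if branch not in seen:
            seen.add(branch)
            result.append(branch)
    return result


# (A|A): each branch is a tuple of variable names.
print(dedupe_alternation([("A",), ("A",)]))
```

With the duplicate branch gone, A is referenced only once in the alternation,
which is the situation the per-DEFINE-variable charge assumes.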

Regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp
