| From: | Tatsuo Ishii <ishii(at)postgresql(dot)org> |
|---|---|
| To: | assam258(at)gmail(dot)com |
| Cc: | jian(dot)universality(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org, zsolt(dot)parragi(at)percona(dot)com, sjjang112233(at)gmail(dot)com, vik(at)postgresfriends(dot)org, er(at)xs4all(dot)nl, jacob(dot)champion(at)enterprisedb(dot)com, david(dot)g(dot)johnston(at)gmail(dot)com, peter(at)eisentraut(dot)org, li(dot)evan(dot)chao(at)gmail(dot)com |
| Subject: | Re: Row pattern recognition |
| Date: | 2026-07-04 07:28:41 |
| Message-ID: | 20260704.162841.39857602849942465.ishii@postgresql.org |
| Lists: | pgsql-hackers |
> The change is a simplification that came out of Jian's v48 review, and it
> does not change the cost the model produces.
>
> It follows Jian's suggestion from 2026-06-15 [1]:
>
>> collectPatternVariables is not needed.
>> The parser already ensures every DEFINE variable appears in PATTERN,
>> so there is nothing to filter.
>> Also, we don't really do anything special (like make a dummy Const)
>> regarding PATTERN variables that do not appear in the DEFINE clause.
>
> The two loops charge exactly the same set, for the following reason:
>
> - The old loop walked the unique PATTERN variables (collectPatternVariables
> deduplicates, returning each name once) and, for each, looked up the
> matching DEFINE entry by resname and charged that DEFINE's cost.
> - Every DEFINE variable is guaranteed to appear in PATTERN -- the parser
> rejects a DEFINE variable that is not used in PATTERN (errmsg "DEFINE
> variable \"%s\" is not used in PATTERN" in parse_rpr.c). So the DEFINE
> clause is always a subset of the unique PATTERN variables, and each
> DEFINE resname is unique.
> - A PATTERN variable that has no DEFINE contributes nothing to the old
> loop, because the inner resname lookup finds no match.
>
> So the old loop already charged each DEFINE expression exactly once, and
> nothing else. Iterating defineClause directly, as v50-0006 does, visits
> precisely that same set once each. The estimate is unchanged; only the
> redundant outer walk over PATTERN and the per-variable resname lookup into
> the DEFINE clause are removed.

Ok, thanks for the explanation.

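To spell out the equivalence argued above, here is a toy Python model of the two loop shapes. It is only a sketch, not the planner code: the function names, the (resname, cost) tuples, and the integer "cost units" are all assumptions made for illustration.

```python
def check_parser_invariant(pattern_vars, define_clause):
    """Mirror the two parser guarantees the argument relies on (toy check)."""
    names = [resname for resname, _ in define_clause]
    assert len(names) == len(set(names)), "DEFINE resnames must be unique"
    assert set(names) <= set(pattern_vars), "DEFINE variable not used in PATTERN"


def old_loop_cost(pattern_vars, define_clause):
    """Old shape: walk unique PATTERN variables, look up DEFINE by resname."""
    total = 0
    seen = set()
    for name in pattern_vars:
        if name in seen:  # deduplicate, as collectPatternVariables is described to do
            continue
        seen.add(name)
        for resname, cost in define_clause:
            if resname == name:
                total += cost
                break  # a variable without a DEFINE entry never gets here
    return total


def new_loop_cost(define_clause):
    """New shape: iterate the DEFINE clause directly."""
    return sum(cost for _, cost in define_clause)


if __name__ == "__main__":
    # START has no DEFINE entry; UP appears twice in PATTERN.
    pattern_vars = ["START", "UP", "UP", "DOWN"]
    define_clause = [("UP", 3), ("DOWN", 5)]  # hypothetical per-expression costs
    check_parser_invariant(pattern_vars, define_clause)
    print(old_loop_cost(pattern_vars, define_clause))  # 8
    print(new_loop_cost(define_clause))                # 8
```

Under the two parser guarantees, both functions add up the same DEFINE costs exactly once each, so the totals agree for any input that passes the check.
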
> This also matches the premise of the cost model we settled on back in
> February: the NFA executor evaluates every DEFINE expression once per row,
> so the natural unit for the per-tuple charge is the DEFINE variable.

BTW, I was thinking about cases where the same DEFINE variable appears
twice or more in PATTERN for the same row, for example PATTERN
(A|A). In this case the pattern would be optimized to (A), so we don't
need to worry about A appearing twice, and our cost model is correct
here.

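As a rough illustration of the (A|A) case, the following Python sketch collapses duplicate alternation branches and counts the distinct DEFINE variables left to evaluate for a row. The function names and the list-of-names representation are hypothetical, and this is not how the patch's pattern optimizer is implemented; it only shows why a repeated variable does not add to the per-row charge.

```python
def collapse_duplicate_branches(branches):
    """Drop repeated alternation branches, keeping first occurrences in order."""
    seen = set()
    kept = []
    for branch in branches:
        if branch not in seen:
            seen.add(branch)
            kept.append(branch)
    return kept


def per_row_define_evaluations(branches, defined):
    """Count distinct DEFINE variables referenced after collapsing duplicates."""
    return len({b for b in collapse_duplicate_branches(branches) if b in defined})


if __name__ == "__main__":
    print(collapse_duplicate_branches(["A", "A"]))        # ['A']
    print(per_row_define_evaluations(["A", "A"], {"A"}))  # 1
```
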
Regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp