| From: | Tatsuo Ishii <ishii(at)postgresql(dot)org> |
|---|---|
| To: | assam258(at)gmail(dot)com |
| Cc: | vik(at)postgresfriends(dot)org, er(at)xs4all(dot)nl, jacob(dot)champion(at)enterprisedb(dot)com, david(dot)g(dot)johnston(at)gmail(dot)com, peter(at)eisentraut(dot)org, pgsql-hackers(at)postgresql(dot)org |
| Subject: | Re: Row pattern recognition |
| Date: | 2026-02-27 13:54:56 |
| Message-ID: | 20260227.225456.33226875991025537.ishii@postgresql.org |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
Hi Henson,
> Hi Tatsuo,
>
> Currently we do not account the cost of RPR while planning. Attached
>> is the first attempt to try to estimate the RPR costs. The cost model
>> is very simple:
>>
>> expression cost per PATTERN variable * number of input tuples
>>
>> Any idea to make this estimation better?
>>
>
> > foreach(lc, windowFuncs)
>> {
>> ...
>> + /* also add DEFINE clause expressions' cost to per-input-row costs */
>> + if (winclause->rpPattern)
>> + {
>> + List *pattern_vars; /* list of pattern variable names */
>> + ListCell *lc2;
>> +
>> + pattern_vars = collectPatternVariables(winclause->rpPattern);
>> +
>> + /* iterate according to the pattern variable */
>> + foreach(lc2, pattern_vars)
>> + {
>> + char *ptname = strVal((char *) lfirst(lc2));
>
> `collectPatternVariables` returns a list of String nodes (via
> `makeString()`), so `strVal(lfirst(lc2))` is the idiomatic form.
> The `(char *)` cast is misleading.
Ok.
> There is also a correctness issue: DEFINE expressions belong to the
> window clause, not to individual window functions, so their cost
> should not be multiplied by the number of window functions sharing
> the clause.
You are right.
> The fix is to compute the DEFINE cost once outside the loop and add
> it to `startup_cost` and `total_cost` directly, after the
> `foreach(lc, windowFuncs)` block.
Looks good.
> Regarding the cost model: the NFA executor evaluates all DEFINE
> expressions once per row into a shared `nfaVarMatched[]` array that
> all active contexts read from, and contexts advance strictly forward
> so no prior row is ever re-evaluated. The one-evaluation-per-row
> cost model is therefore accurate. NFA-aware cost
> modeling could be built on top of this foundation in a separate patch
> down the road, once the NFA implementation has matured.
>
> For now, the DEFINE expression costs themselves already serve as a
> natural penalty ― a window clause with RPR will consistently appear
> more expensive than a comparable plain window function. This gives
> the surrounding plan a reasonable cost signal for decisions such as
> join ordering and materialization of RPR subqueries. So the current
> approach is reasonable as a first step.
>
> Other than that, the approach looks good to me. Would it be okay if
> I revise the patch along those lines?
Yes, no problem. Thanks!
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp
| From | Date | Subject | |
|---|---|---|---|
| Next Message | John Naylor | 2026-02-27 14:00:37 | Re: Reduce timing overhead of EXPLAIN ANALYZE using rdtsc? |
| Previous Message | Daniil Davydov | 2026-02-27 13:49:15 | Re: POC: Parallel processing of indexes in autovacuum |