Re: EXPLAIN: showing ReadStream / prefetch stats

From: Tomas Vondra <tomas(at)vondra(dot)me>
To: Melanie Plageman <melanieplageman(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Lukas Fittl <lukas(at)fittl(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: EXPLAIN: showing ReadStream / prefetch stats
Date: 2026-04-06 20:39:08
Message-ID: 2aaa701f-76f9-4e83-bb2c-ec4520b5a89b@vondra.me
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 4/6/26 18:50, Melanie Plageman wrote:
> On Mon, Apr 6, 2026 at 10:01 AM Melanie Plageman
> <melanieplageman(at)gmail(dot)com> wrote:
>>
>> On Mon, Apr 6, 2026 at 5:15 AM Tomas Vondra <tomas(at)vondra(dot)me> wrote:
>>>
>>> Anyway, that doesn't matter much, because the more I look at the
>>> approach with having a separate chunk of shared memory, the more I like
>>> it. It seems much simpler, more elegant, etc. I really disliked how
>>> unreadable the code got with the parallel_aware/instrument checks in
>>> multiple places, and this just gets rid of that. I like that.
>>>
>> <--snip-->
>>>
>>> Is execParallel.h the right place to define the offset? It means the
>>> various nodes (like nodeBitmapHeapScan) now have to include this header,
>>> and it seems a bit suspicious. I can't think of a better .h file, and
>>> maybe I'm wrong and it's perfectly fine.
>>
>> It could go in src/include/executor is instrument_node.h. It's where
>> the structs for the shared instrumentation go. It's probably cheaper
>> to include also because it doesn't include anything.
>
> I cleaned up the first patch in the set that refactors index-only and
> index scan to use this pattern and realized that I wasn't sure what to
> do about the duplication between index-only and index scans for these
> functions.
>

Thanks!

> ExecIndexScanInstrumentEstimate() and
> ExecIndexOnlyScanInstrumentEstimate() are the same code except they
> take different node types to do the check for intsrumentation
> if (!node->ss.ps.instrument || pcxt->nworkers == 0)
> return;
> The size estimation for SharedIndexScanIntrumentation could be common
> to the two but it is only three lines and doesn't seem worth another
> function call. ExecIndex[Only]ScanEstimate() use a common helper
> index_parallelscan_estimate() but that is more code.
Seems like a reasonable trade off to me. It's way more readable without
the conditions, even if there's a bit more code than before.

To me this seems fine.

> We could just have a common helper and only call it if
> node->ss.ps.instrument && pcxt->nworkers > 0 but I don't see anybody
> else in ExecParallelEstimate() checking for instrumentation being
> enabled.
>

Not sure I understand what helper you mean? In execParallel.c?

AFAICS now we have 3 scans that check this (index, index-only, BHS).
With the other patches there will be SeqScan and TidRangeScan too.

> Besides that, the function names are very long already but don't
> include the word parallel (though ExecIndexScanEstimate() is for
> parallel scans only and doesn't include "parallel").
>
> ExecIndex[Only]ScanInstrumentInitDSM() is the same between the two
> except the node type it takes but three different members of node are
> used here, so I'm not sure how much sense a common helper makes here.
>
> I have a feeling the above stuff isn't okay how it is now, but I'm not
> sure which direction to go.
>
> I think ExecIndex[Only]ScanInstrumentInitWorker() is fine the way it
> is because it is only 1 line and requires a different node for
> index/index-only.
>

I think it's OK. Maybe there is a scheme that makes this work both for
plain index scans and index-only scans, but I doubt it's much simpler in
the end. Doesn't seem worth it to me.

I think this is ready to go, with a tiny amount of polishing.

1) "amount of DSM" sounds a bit strange to me. The wording "amount of
space in ..." from the other nodes seems better to me. Or maybe I'm just
used to it, not sure.

2) I wonder if maybe PARALLEL_KEY_SCAN_INSTRUMENT_OFFSET should be
placed in plannodes.h, because that's where Plan->plan_node_id is
defined. instrument_node.h works too, but the places accessing the
plan_node_id already have to include plannodes.h.

Other than this, I think it's ready to go. I went over it multiple
times, but I don't have any new comments.

We need to decide whether to push this into PG19. This was primarily
motivated by the index prefetching work, but we now know that won't
happen in PG19 :-( But the instrumentation is useful even for the three
scans using read streams, so I think it's a meaningful improvement.

If you think you can get this pushed, I'll do my best to finalize the
instrumentation and SeqScan/TidRangeScan parts.

regards

--
Tomas Vondra

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2026-04-06 21:11:30 Re: Adding REPACK [concurrently]
Previous Message Melanie Plageman 2026-04-06 20:33:16 Re: EXPLAIN: showing ReadStream / prefetch stats