Re: EXPLAIN: showing ReadStream / prefetch stats

From: Tomas Vondra <tomas(at)vondra(dot)me>
To: Melanie Plageman <melanieplageman(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Lukas Fittl <lukas(at)fittl(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: EXPLAIN: showing ReadStream / prefetch stats
Date: 2026-04-07 01:14:49
Message-ID: 8d6187c9-62ac-43a3-88d1-7503c4206657@vondra.me
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 4/7/26 02:19, Melanie Plageman wrote:
> On Mon, Apr 6, 2026 at 4:39 PM Tomas Vondra <tomas(at)vondra(dot)me> wrote:
>>
>> On 4/6/26 18:50, Melanie Plageman wrote:
>>
>>> I cleaned up the first patch in the set that refactors index-only and
>>> index scan to use this pattern and realized that I wasn't sure what to
>>> do about the duplication between index-only and index scans for these
>>> functions.
>>
> <--snip-->
>>
>> I think this is ready to go, with a tiny amount of polishing.
>>
>> 1) "amount of DSM" sounds a bit strange to me. The wording "amount of
>> space in ..." from the other nodes seems better to me. Or maybe I'm just
>> used to it, not sure.
>
> Fixed that before pushing.
>
>> 2) I wonder if maybe PARALLEL_KEY_SCAN_INSTRUMENT_OFFSET should be
>> placed in plannodes.h, because that's where Plan->plan_node_id is
>> defined. instrument_node.h works too, but the places accessing the
>> plan_node_id already have to include plannodes.h.
>
> I don't think it really belongs there. It is very specific to
> execution. We use the plan node id as the key for convenience, but it
> isn't the purpose of plan_node_id.
> PARALLEL_KEY_SCAN_INSTRUMENT_OFFSET's only purpose is to be used
> during execution for instrumentation, so it feels like it belongs in
> node_instrument.h. And we don't need a separate include for it either.
> It goes with the other stuff being defined in instrument_node.h (like
> SharedIndexScanInstrumentation) and being used by those callers. I
> admit the comment above it is a bit odd, but I think it is ultimately
> okay.
>

WFM

>> We need to decide whether to push this into PG19. This was primarily
>> motivated by the index prefetching work, but we now know that won't
>> happen in PG19 :-( But the instrumentation is useful even for the three
>> scans using read streams, so I think it's a meaningful improvement.
>
> I think it is a meaningful improvement too. I think we should do it.

OK

>> If you think you can get this pushed, I'll do my best to finalize the
>> instrumentation and SeqScan/TidRangeScan parts.
>
> I've pushed the first patch for index/index-only scans. Attached is
> the BHS fix that uses the new pattern. It needs at least one review
> before pushing. While I was polishing it, I realized I neglected to
> use add_size()/mul_size() in the index-only/index scan patches. So,
> 0002 is just a fix commit to do that. Feel free to push these if you
> think they're ready. Otherwise, I'll do so pending your review in my
> morning.
>

Thanks!

I'll take a look in my morning, and will consider pushing the changes so
that I can start pushing the parts adding the new instrumentation.

I've spent a fair amount of time looking at those parts today. Attached
is v12 (including the two parts you posted as v11), with two or three
bigger changes compared to earlier versions:

1) default OFF

After thinking about it a bit more, I decided to change the default to
OFF. On the one hand I agree it's somewhat similar to BUFFERS, and that
option is ON by default now. But on the other hand, we must not clutter
EXPLAIN output with too much information, and I'm not convinced IO is
worth it. So I changed to OFF by default. That also makes the patches
smaller, due to not having to adjust that many tests.

2) auto_explain

It also occurred to me we should add a matching option to auto_explain,
similarly to log_buffers. 0005 does that.

3) INSTRUMENT_IO

The auto_explain bit also implies we need a new INSTRUMENT_IO constant,
to handle this just like BUFFERS. Which also means we can actually
collect the stats only when the IO option is specified (instead of for
all ANALYZE runs). Which is nice.

I plan to do some more testing in my morning, add missing comments,
polish the commit messages etc. And then eventually push it.

regards

--
Tomas Vondra

Attachment Content-Type Size
v12-0001-Fix-BitmapHeapScan-non-parallel-aware-EXPLAIN-AN.patch text/x-patch 8.8 KB
v12-0002-Use-add_size-mul_size-for-index-instrumentation-.patch text/x-patch 3.1 KB
v12-0003-switch-explain-to-unaligned-for-json-xml-yaml.patch text/x-patch 12.9 KB
v12-0004-Add-EXPLAIN-IO-infrastructure-and-BitmapHeapScan.patch text/x-patch 19.8 KB
v12-0005-auto_explain.patch text/x-patch 3.3 KB
v12-0006-Add-EXPLAIN-IO-instrumentation-for-SeqScan.patch text/x-patch 14.0 KB
v12-0007-Add-EXPLAIN-IO-instrumentation-for-TidRangeScan.patch text/x-patch 11.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Josh Kupershmidt 2026-04-07 01:42:17 Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements
Previous Message Noah Misch 2026-04-07 01:10:56 Re: Adding REPACK [concurrently]