Re: EXPLAIN: showing ReadStream / prefetch stats

From: Lukas Fittl <lukas(at)fittl(dot)com>
To: Tomas Vondra <tomas(at)vondra(dot)me>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: EXPLAIN: showing ReadStream / prefetch stats
Date: 2026-03-19 18:11:16
Message-ID: CAP53Pkzd9P=k_DsxU4rGoCEO-26rr9NQcS8co2y4HXB7TyiMbA@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Mar 18, 2026 at 3:41 PM Tomas Vondra <tomas(at)vondra(dot)me> wrote:
> The 0003 also changes the EXPLAIN to enable IO by default, just like we
> do for BUFFERS. It seems like a reasonable precedent to me.

One side effect of that is that the tests now fail for me locally,
because the specific values are system-dependent. Attached a patch
(nocfbot-0002) that fixed that for me.

There is one detail maybe calling out specifically on JSON output:
Currently Postgres always emits all fields in JSON output, even if
they are zero. The code that you have in v3 skips the "I/O" group when
the value is zero, which doesn't work well with how current regression
tests are written. I'm definitely not a fan of the unnecessary
verbosity of JSON EXPLAIN output, but I'd suggest we don't break with
the tradition here, and instead always output the "I/O" group in
non-text formats. Also attached a patch for that (nocfbot-0001).

Overall I think the abstraction here seems reasonable if we're
primarily focused on getting the per-node instrumentation taken care
of.

That said, two thoughts on an example EXPLAIN output I just ran:

1) I do wonder if its a bit confusing that we propagate I/O timings up
the EXPLAIN tree, but not the "I/O" information - I realize fixing
that would be a bit involved though, e.g. we'd have to invent
accumulation logic in explain.c. It'd also maybe make people thing
this covers things like temporary file reads/etc.

2) The ordering of "I/O Timings" in relation to "I/O" feels off to me
(since they're not next to each other) - maybe we should re-order I/O
Timings to come before Buffers in show_buffer_usage to address that?

EXPLAIN (ANALYZE) SELECT COUNT(*) FROM t;

QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=218331.00..218331.01 rows=1 width=8) (actual
time=563.437..563.437 rows=1.00 loops=1)
Buffers: shared hit=15806 read=41274
I/O Timings: shared read=1.180
-> Seq Scan on t (cost=0.00..186080.80 rows=12900080 width=0)
(actual time=0.335..306.737 rows=12900005.00 loops=1)
Prefetch: avg=61.517 max=91 capacity=94
I/O: stalls=7 size=14.825 inprogress=5.321
Buffers: shared hit=15806 read=41274
I/O Timings: shared read=1.180
Planning Time: 0.101 ms
Execution Time: 563.471 ms
(10 rows)

Thanks,
Lukas

--
Lukas Fittl

Attachment Content-Type Size
nocfbot-0001-Emit-I-O-group-always-for-non-text-formats.patch application/octet-stream 1.6 KB
nocfbot-0002-Fix-regression-test-failures-due-to-defaulting-to-IO.patch application/octet-stream 108.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2026-03-19 18:26:28 Re: [PATCH] Fix fd leak in pg_dump compression backends when dup()+fdopen() fails
Previous Message Jianghua Yang 2026-03-19 18:06:17 [PATCH] Fix build failure on macOS 26.2 SDK due to missing nl_langinfo_l declaration