Re: RFC: Allow EXPLAIN to Output Page Fault Information

From: Atsushi Torikoshi <torikoshia(dot)tech(at)gmail(dot)com>
To: Ilmar Y <tanswis42(at)gmail(dot)com>, Lukas Fittl <lukas(at)fittl(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org, Atsushi Torikoshi <torikoshia(at)oss(dot)nttdata(dot)com>
Subject: Re: RFC: Allow EXPLAIN to Output Page Fault Information
Date: 2026-06-01 13:10:41
Message-ID: CAM6-o=CATBQpDQAa3akuKqw9gWfpN-N7nVr9=8qSejKV+QyHXQ@mail.gmail.com
Lists: pgsql-hackers

On Tue, May 26, 2026 at 3:44 AM Lukas Fittl <lukas(at)fittl(dot)com> wrote:

Thanks for your comment!

> Maybe we should try to figure out what would be needed to do better
> I/O tracking on the Linux side in a way that is compatible with I/O
> workers?

That may well be true.
Unfortunately, I do not currently have a good idea of how to achieve
this in a way that works with I/O workers.
I'll keep thinking about whether there might be another approach.

> e.g. I assume rusage is too expensive to run on individual I/Os that
> the workers process (so its not just a communication problem) -- but
> would be good to benchmark.

Yes, I suspect that would be the case.
Although this is somewhat different from the current discussion, I
once experimented with an implementation that called getrusage() for
each plan node, and the overhead was high enough that it was not
practically usable. [1]
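
For anyone who wants to benchmark this, here is a minimal sketch of the
two measurements involved: taking a getrusage() delta around one unit of
work, and estimating the cost of a single call. It is written in Python
for brevity and is not code from the patch; the function names
rusage_delta and getrusage_cost_ns are hypothetical, and the interpreter
adds its own overhead, so the absolute numbers are only a rough upper
bound for what a C caller would see.

```python
import resource
import time


def rusage_delta(work):
    """Run work() and return how selected rusage counters changed across it."""
    before = resource.getrusage(resource.RUSAGE_SELF)
    work()
    after = resource.getrusage(resource.RUSAGE_SELF)
    return {
        "minor_faults": after.ru_minflt - before.ru_minflt,
        "major_faults": after.ru_majflt - before.ru_majflt,
        "blocks_in": after.ru_inblock - before.ru_inblock,
        "blocks_out": after.ru_oublock - before.ru_oublock,
    }


def getrusage_cost_ns(calls=100_000):
    """Average wall-clock cost of one getrusage() call, in nanoseconds."""
    start = time.perf_counter_ns()
    for _ in range(calls):
        resource.getrusage(resource.RUSAGE_SELF)
    return (time.perf_counter_ns() - start) / calls


if __name__ == "__main__":
    # Hypothetical workload: allocate and zero a 64 MiB buffer.
    print(rusage_delta(lambda: bytearray(64 * 1024 * 1024)))
    print(f"{getrusage_cost_ns():.0f} ns per getrusage() call")
```

Note that RUSAGE_SELF only covers the calling process, so I/O performed
on its behalf by a separate worker process would not appear in these
counters.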

I agree that many users will choose, or be required to choose,
io_method = worker. However, other users will continue to use sync
or will choose io_uring, so I think providing I/O statistics for
those users would still be valuable.

In particular, from a performance perspective, io_uring seems
likely to be the preferred option in the long run.
In high-performance on-premises environments, for example, it would
not be surprising to see users selecting io_uring.
Those users would also be among those most interested in
understanding how much I/O is being generated against storage,
which is the motivation behind this proposal.

On Sat, May 30, 2026 at 6:45 PM Ilmar Y <tanswis42(at)gmail(dot)com> wrote:

Thanks for your review!

> The patch applies cleanly on current master at
> db5ed03217b9c238703df8b4b286115d6e940488, but git am warned about trailing
> whitespace. git diff --check origin/master...HEAD reports:

> src/test/modules/test_misc/t/011_explain_storage_io.pl:47: trailing whitespace
> src/test/modules/test_misc/t/011_explain_storage_io.pl:55: trailing whitespace
> src/test/modules/test_misc/t/011_explain_storage_io.pl:61: trailing whitespace

I'll fix it.

> A second thing I noticed is that, with io_method=sync, structured EXPLAIN
> output can show an Execution Storage I/O section even when ANALYZE is not
> used.

We are currently considering moving this functionality to a new IO
option that, unlike BUFFERS, would require ANALYZE to be specified. [2]

[1] https://www.postgresql.org/message-id/CAGXjcj%3DhnEZCCDDMRv06EQnDcv0mRe6%2BPh1gv8%3DHb7NEc51y5A%40mail.gmail.com
[2] https://www.postgresql.org/message-id/CAM6-o%3DBL59LgruKXm3DCjzFzvr7TTJPtryEPj51YeGMFrWO0EQ%40mail.gmail.com

---
Regards,

Atsushi Torikoshi
