| From: | Tomas Vondra <tomas(at)vondra(dot)me> |
|---|---|
| To: | Melanie Plageman <melanieplageman(at)gmail(dot)com>, Lukas Fittl <lukas(at)fittl(dot)com> |
| Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de> |
| Subject: | Re: EXPLAIN: showing ReadStream / prefetch stats |
| Date: | 2026-03-17 00:15:23 |
| Message-ID: | 41d87d75-d126-438b-a02e-6671142d39b7@vondra.me |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On 3/16/26 19:26, Tomas Vondra wrote:
> On 3/16/26 18:29, Melanie Plageman wrote:
>> On Mon, Mar 16, 2026 at 3:08 AM Lukas Fittl <lukas(at)fittl(dot)com> wrote:
> ...
>
>>> I feel like something is off about the complexity of having each node
>>> type ferry back the information. e.g. when you're implementing the
>>> support for index prefetching, it'll require a bunch more changes. In
>>> my mind, there is a reason we have a related problem that we solved
>>> with the current pgBufferUsage, instead of dealing with that on a
>>> per-node basis. I really feel we should have a more generic way of
>>> dealing with this.
>> <--snip-->
>>> I've attached a prototype of how that could look like (apply the other
>>> patch set first, v8, see commit fest entry [1] - also attached a
>>> preparatory refactoring of using "Instrumentation" for parallel query
>>> reporting, which avoids having individual structs there).
>>
It looks reasonable, and it's nice to not have to worry about passing
the data through the TAM interface etc. Having to do stuff like this:
- if (instr->instr.need_bufusage || instr->instr.need_walusage)
+ if (instr->instr.need_bufusage || instr->instr.need_walusage ||
instr->instr.need_iousage)
in so many places is not great. It'd really benefit from some macro that
says "if any of the pieces is needed ...".
>> The patch footprint is _much_ nicer with your stack-based
>> instrumentation. Very cool. I'll leave it to Tomas whether he wants to
>> create a dependency on your big project a few weeks before feature
>> freeze, though.
>>
>
> I don't know. I have not looked at the stack-based instrumentation yet,
> so I have no idea how complex or how close to committable it is.
>
I've started looking at the stack tracking patch only today, and it
looks in pretty good shape. I have no clear idea how close to being
committed it is, so I'm hesitant to make this patch depend on it.
AFAICS the stack-based tracking is meant for resources with just one
instance. E.g. there's only one WAL, or one buffer pool, so when a node
says "update WAL usage" or "increment buffer hit/read", there's one
counter to update. But will that necessarily apply for read streams?
AFAIK existing scans use only a single read stream, but that's only
because none of the built-in index AMs uses a read stream. If we get
index prefetching in PG19, or if any index AM (e.g. pgvector) chooses to
use a read stream internally, how will that work? Will we "merge" the
stats from all read streams used by the node, or what will happen?
I realize this is a similar issue the 0007 part of the stack-based
tracing patch deals with - tracking buffers for index/table separately.
And it does that by making nodeIndexscan.c quite a bit more complex by
expanding index_getnext_slot(). Doesn't seem great, but not sure. I
suppose nodeIndexonlyscan.c will have to do something similar.
regards
--
Tomas Vondra
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Jacob Champion | 2026-03-17 00:24:58 | Re: Improve OAuth discovery logging |
| Previous Message | Peter Smith | 2026-03-16 23:58:33 | Re: Skipping schema changes in publication |