Re: Non-deterministic buffer counts reported in execution with EXPLAIN ANALYZE BUFFERS

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Tomas Vondra <tomas(at)vondra(dot)me>
Cc: Radim Marek <radim(at)boringsql(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Non-deterministic buffer counts reported in execution with EXPLAIN ANALYZE BUFFERS
Date: 2026-02-05 00:15:45
Message-ID: CAApHDvp2n2ZWCpnQVTRSkheqfB_M48nZPzizfYNVtChy8z3KQQ@mail.gmail.com
Lists: pgsql-hackers

On Thu, 5 Feb 2026 at 04:59, Tomas Vondra <tomas(at)vondra(dot)me> wrote:
>
> On 2/3/26 22:53, David Rowley wrote:
> > I think offloading more work into the planner for looking up things
> > that are always the same for every invocation of a plan is generally
> > good. For anything that moves things more in that direction,
> > you'd need to ensure that the plan is correctly invalidated when
> > something changes about the extra thing you looked up in planning.
> > I've not looked into the specifics of this instance, other than the
> > function being called which causes the buffer usage is
> > get_opfamily_proc(). I don't see that called during planning for this
> > query, so your statement about "already resolved opfamily from the
> > planner" doesn't seem true.
> >
>
> Yes, that's essentially what I explained yesterday.
>
> I'm not sure about moving all these lookups to the planner (from the
> executor). Could this easily hurt some cases? We'd need to do this for
> all fields the final plan might need, i.e. we'd need to initialize all
> fields for all paths we construct - not just the metadata needed by the
> final cheapest one. That could be a lot of additional lookups, right?
> That would make the first execution more expensive. Later executions of
> a cached query would reuse that work, but we'd still be paying the
> (admittedly cheap) cost of the lookup.

We do look some things up at createplan time, e.g. calling
get_opfamily_member_for_cmptype() from prepare_sort_from_pathkeys().
However, that could still be an additional lookup, as the existing
lookup for the case in question happens in ExecSort() rather than in
ExecInitSort(), so there could be a surplus lookup if the sort node is
never executed. I doubt that's worth worrying about too much. It's
normally large memory allocations that we want to defer until fetching
the first row from the node.

> Of course, there's a lot of such fields - this amproc is just an
> example. I don't think it's practical to rework all of that to
> initialize everything early.
>
> > However, I imagine this is just one of many things and if you're
> > aiming to stabilise your tests doing this, then you'll likely be
> > fixing things for a long time to come. I imagine a better way if
> > you're just doing this for running test suites would be to invent some
> > way to prepopulate the sys and cat caches. Maybe that could be a
> > contrib module. I don't think there's a way to get a relcache miss
> > during execution, but I guess it could change with prepared statements
> > if we ever decided to start evicting long unused relcache entries one
> > day.
> >
>
> I'm not sure there's a good way to warmup the sys/cat caches, short of
> loading everything. Because how would you know what will be needed by
> the execution plan? It seems very complex.

I imagined that if it's just for machines running tests, then you
could just load everything. If it were coded in such a way that the
tuples fetched by a Seq Scan on the catalogue table were what went
into the cache, rather than having the Seq Scan drive the normal cache
lookup code (which would result in a subsequent Index Scan on the
catalogue's index), then it could be done with fairly low overhead. I
imagine in the order of <10ms from a fresh initdb. That doesn't seem
excessively long for machines running tests in the background.

> One option would be to run the queries twice - the first one would warm
> up the caches, and the second execution would be the measured one. But
> that makes the tool 2x as expensive.

Maybe. I don't know the tool or how people use it.

> But what if we had a way to *invalidate* all the caches? That'd also
> make the behavior deterministic - there would be no hits initially. And
> it seems much simpler to implement.

There is debug_discard_caches, but that requires an assert-enabled build.

David
