Re: Should we update the random_page_cost default value?

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tomas Vondra <tomas(at)vondra(dot)me>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Greg Sabino Mullane <htamfids(at)gmail(dot)com>, Robert Treat <rob(at)xzilla(dot)net>, Bruce Momjian <bruce(at)momjian(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Should we update the random_page_cost default value?
Date: 2025-10-08 20:52:17
Message-ID: sxs47dhg3z2viu5stxx3twjb4ui4xygd6vmnprrqlf52u34awr@47sgcanfl5ah
Lists: pgsql-hackers

Hi,

On 2025-10-08 22:20:31 +0200, Tomas Vondra wrote:
> On 10/8/25 21:37, Andres Freund wrote:
> > On 2025-10-08 21:25:53 +0200, Tomas Vondra wrote:
> >> On 10/8/25 19:23, Robert Haas wrote:
> >>>> I think in the past we mostly assumed we can't track cache size per
> >>>> table, because we have no visibility into page cache. But maybe direct
> >>>> I/O would change this?
> >>>
> >>> I think it's probably going to work out really poorly to try to use
> >>> cache contents for planning. The plan may easily last much longer than
> >>> the cache contents.
> >>>
> >>
> >> Why wouldn't that trigger invalidations / replanning just like other
> >> types of stats? I imagine we'd regularly collect stats about what's
> >> cached, etc. and we'd invalidate stale plans just like after ANALYZE.
> >
> > You can't just handle it like other such stats - the contents of
> > shared_buffers can differ between primary and standby, and other stats that
> > trigger replanning are all in system tables that can't differ between primary
> > and hot standby instances.
> >
> > We IIRC don't currently use shared memory stats for planning and thus have no
> > way to trigger invalidation for relevant changes. While it seems plausible to
> > drive this via shared memory stats, the current cumulative counters aren't
> > really suitable, we'd either need something that removes the influence of
> > older hits/misses or a new field tracking the current number of buffers for a
> > relation [fork].
> >
>
> I don't think I mentioned pgstat (i.e. the shmem stats) anywhere, and I
> mentioned ANALYZE, which has nothing to do with pgstats either. So I'm a
> bit confused why you argue we can't use pgstat.

I'm mentioning pgstats because we can't store these stats the way ANALYZE
stores its results, since those live in catalog tables (which can't differ
between primary and standby). Given that, why wouldn't we store the cache hit
ratio in pgstats?

It'd be pretty weird to overload this into ANALYZE imo, given that this would
be the only stat that we'd populate on standbys in ANALYZE. We'd also need to
start running AV on standbys for it.
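
For illustration, a minimal self-contained sketch of the kind of per-relation
entry that could live in pgstats, with a "current number of cached buffers"
field as suggested above, and one possible way a planner could turn it into a
cached fraction. The struct and function names (and the cached_page_cost knob)
are hypothetical, not existing PostgreSQL code:

#include <stdint.h>

/*
 * Hypothetical per-relation shared-memory stats entry; not real PostgreSQL
 * types, just an illustration of what planning-relevant cache stats could
 * carry.
 */
typedef struct RelCacheStatsEntry
{
    uint32_t    relnumber;        /* relation identifier (illustrative) */
    uint32_t    buffers_cached;   /* buffers currently in shared_buffers */
    uint32_t    total_blocks;     /* relation size in blocks */
} RelCacheStatsEntry;

/* Fraction of the relation currently resident in shared_buffers. */
static double
cached_fraction(const RelCacheStatsEntry *entry)
{
    if (entry->total_blocks == 0)
        return 0.0;
    return (double) entry->buffers_cached / (double) entry->total_blocks;
}

/*
 * One conceivable plan-time use: blend the page costs by the cached
 * fraction. random_page_cost is the existing GUC; cached_page_cost is a
 * hypothetical parameter for the cost of a cached page.
 */
static double
effective_random_page_cost(const RelCacheStatsEntry *entry,
                           double random_page_cost,
                           double cached_page_cost)
{
    double      frac = cached_fraction(entry);

    return frac * cached_page_cost + (1.0 - frac) * random_page_cost;
}

On a standby, such an entry would be populated from that node's own
shared_buffers, which is what makes pgstats a better fit here than
catalog-backed ANALYZE stats.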

> What I imagined is more like a process that regularly walks shared
> buffers, counts buffers per relation (or relfilenode), stores the
> aggregated info into some shared memory (so that a standby can have its
> own concept of cache contents).

That shared memory data structure basically describes pgstats, no?

> And then invalidates plans the same way ANALYZE does.

I'm not sure the invalidation machinery actually fully works in HS (due to
doing things like incrementing the command counter). It would probably be
doable to change that though.
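
For concreteness, a minimal self-contained sketch of the "walk shared buffers
and count buffers per relation" pass described above. The types below are
stand-ins rather than PostgreSQL's actual buffer descriptors, and a real
implementation would have to read the descriptors under appropriate locking;
this only shows the aggregation step:

#include <stdbool.h>
#include <stdint.h>

/* Stand-in for a buffer descriptor: which relation, and is the page valid. */
typedef struct BufferEntry
{
    uint32_t    relnumber;
    bool        valid;
} BufferEntry;

/* Aggregated result: current number of cached buffers per relation. */
typedef struct RelBufferCount
{
    uint32_t    relnumber;
    uint32_t    nbuffers;
} RelBufferCount;

/*
 * Scan all buffers and accumulate the number of valid buffers per relation.
 * Returns the number of distinct relations seen (bounded by max_rels).
 */
static int
count_buffers_per_relation(const BufferEntry *buffers, int nbuffers,
                           RelBufferCount *counts, int max_rels)
{
    int         nrels = 0;

    for (int i = 0; i < nbuffers; i++)
    {
        int         j;

        if (!buffers[i].valid)
            continue;

        for (j = 0; j < nrels; j++)
        {
            if (counts[j].relnumber == buffers[i].relnumber)
            {
                counts[j].nbuffers++;
                break;
            }
        }

        if (j == nrels && nrels < max_rels)
        {
            counts[nrels].relnumber = buffers[i].relnumber;
            counts[nrels].nbuffers = 1;
            nrels++;
        }
    }

    return nrels;
}

The output of such a pass is what a background process could publish into
shared memory (e.g. via pgstats) and compare against the previous snapshot to
decide whether cached plans referencing a relation should be invalidated.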

Greetings,

Andres Freund
