Re: Add os_page_num to pg_buffercache

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>
Cc: Mircea Cadariu <cadariu(dot)mircea(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Add os_page_num to pg_buffercache
Date: 2025-11-18 05:39:36
Message-ID: aRwGmO8JSS2v1TYU@paquier.xyz
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Aug 22, 2025 at 08:48:57AM +0000, Bertrand Drouvot wrote:
> Yeah, I think so. Thanks for the ping, done in attached.

The patch has been marked as ready for committer, and I see the value
in providing a view where it is possible to know to which OS page a
given shared buffer is linked to, based on the OS page size and our
shared buffer size. No problem with that.

Now, I am not really a fan of the duplication created here, where most
of the code pg_buffercache_os_pages() is a plain copy-paste of
pg_buffercache_numa_pages(), with the difference being that we want
only os_page_status via a call to pg_numa_query_pages(), something
that we can rely on when pg_numa_init() is able to work. The
copy-paste is not complete, actually, we surely care about the
Assert() done after calling pg_get_shmem_pagesize(), that acts as a
sanity check to make sure that the buffer size and the OS page size
are divisible pieces (one being divisible by the other), and there is
more that would be nice to not duplicate, like the start pointer
location, the comment on top of pg_get_shmem_pagesize(), etc.

It seems to me that it would be simpler to make the allocations and
information of os_page_ptrs and os_page_status conditional depending
on the result of pg_numa_init(), and that we could just fill numa_node
with NULLs if numa is not available in the environment where the query
is run. This includes making pg_numa_touch_mem_if_required()
conditional, of course, not called when pg_numa_init() fails. Or is
there a strong reason where it would be better to rely on an
elog(ERROR) if numa(3) fails, based on numa_available()? The purpose
of these views being monitoring, it is usually easier in my experience
to rely on NULLs rather than facing periodic errors when we don't know
something. That makes JOINs more predictible, for one.

Please note that I don't mind the extra view pg_buffercache_os_pages
that can provide some information that's transparent cross-platform,
but let's make it something that calls pg_buffercache_numa_pages()
instead.

Note 1: the patch failed to compile as we don't need a buffer state
anymore when unlocking a buffer.

Note 2: Nice catch about the description of os_page_num, applied this
one separately.
--
Michael

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Vaibhav Dalvi 2025-11-18 05:40:37 Re: [PATCH] Add pg_get_subscription_ddl() function
Previous Message Michael Paquier 2025-11-18 05:39:04 Re: Extended Statistics set/restore/clear functions.