Re: Extension Enhancement: Buffer Invalidation in pg_buffercache

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Palak Chaturvedi <chaturvedipalak1911(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Extension Enhancement: Buffer Invalidation in pg_buffercache
Date: 2023-06-30 22:09:12
Message-ID: CA+hUKG+i3Dnsv_M+Opk6tTv96w8ZfWFPQAYScj_LRtvw=qx0+A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Jun 30, 2023 at 10:47 PM Palak Chaturvedi
<chaturvedipalak1911(at)gmail(dot)com> wrote:
> pgbench=# select count(pg_buffercache_invalidate(bufferid)) from
> pg_buffercache where relfilenode =
> pg_relation_filenode('pgbench_accounts'::regclass);

Hi Palak,

Thanks for working on this! I think this will be very useful for
testing existing workloads but also for testing future work on
prefetching with AIO (and DIO), work on putting SLRUs (or anything
else) into the buffer pool, nearby proposals for caching buffer
mapping information, etc etc.

Palak and I talked about this idea a bit last week (stimulated by a
recent thread[1], but the topic has certainly come up before), and we
discussed some different ways one could specify which pages are
dropped. For example, perhaps the pg_prewarm extension could have an
'unwarm' option instead. I personally thought the buffer ID-based
approach was quite good because it's extremely simple, while giving
the user the full power of SQL to say which buffers. Half a table?
Visibility map? Everything? Root page of an index? I think that's
probably better than something that requires more code and
complication but is less flexible in the end. It feels like the right
level of rawness for something primarily of interest to hackers and
advanced users. I don't think it matters that there is a window
between selecting a buffer ID and invalidating it, for the intended
use cases. That's my vote, anyway, let's see if others have other
ideas...

We also talked a bit about how one might control the kernel page cache
in more fine-grained ways for testing purposes, but it seems like the
pgfincore project has that covered with its pgfadvise_willneed() and
pgfadvise_dontneed(). IMHO that project could use more page-oriented
operations (instead of just counts and coarse grains operations) but
that's something that could be material for patches to send to the
extension maintainers. This work, in contrast, is more tangled up
with bufmgr.c internals, so it feels like this feature belongs in a
core contrib module.

Some initial thoughts on the patch:

I wonder if we should include a simple exercise in
contrib/pg_buffercache/sql/pg_buffercache.sql. One problem is that
it's not guaranteed to succeed in general. It doesn't wait for pins
to go away, and it doesn't retry cleaning dirty buffers after one
attempt, it just returns false, which I think is probably the right
approach, but it makes the behaviour too non-deterministic for simple
tests. Perhaps it's enough to include an exercise where we call it a
few times to hit a couple of cases, but not verify what effect it has.

It should be restricted by role, but I wonder which role it should be.
Testing for superuser is now out of fashion.

Where the Makefile mentions 1.4--1.5.sql, the meson.build file needs
to do the same. That's because PostgreSQL is currently in transition
from autoconf/gmake to meson/ninja[2], so for now we have to maintain
both build systems. That's why it fails to build in some CI tasks[3].
You can enable CI in your own GitHub account if you want to run test
builds on several operating systems, see [4] for info.

[1] https://www.postgresql.org/message-id/flat/CAFSGpE3y_oMK1uHhcHxGxBxs%2BKrjMMdGrE%2B6HHOu0vttVET0UQ%40mail.gmail.com
[2] https://wiki.postgresql.org/wiki/Meson
[3] http://cfbot.cputube.org/palak-chaturvedi.html
[4] https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob_plain;f=src/tools/ci/README;hb=HEAD

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Magnus Hagander 2023-06-30 22:13:26 Re: Should we remove db_user_namespace?
Previous Message Andres Freund 2023-06-30 22:05:18 Re: Initdb-time block size specification