Re: Extension Enhancement: Buffer Invalidation in pg_buffercache

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Maxim Orlov <orlovmg(at)gmail(dot)com>
Cc: vignesh C <vignesh21(at)gmail(dot)com>, Palak Chaturvedi <chaturvedipalak1911(at)gmail(dot)com>, Jim Nasby <jim(dot)nasby(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org, Nitin Jadhav <nitinjadhavpostgres(at)gmail(dot)com>
Subject: Re: Extension Enhancement: Buffer Invalidation in pg_buffercache
Date: 2024-04-04 03:22:17
Message-ID: CA+hUKGKGyVjqoaSC7AZrgYKHAk9Nid9G4Ao3ZtL_OtLZDb+_cA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Mar 8, 2024 at 6:20 AM Maxim Orlov <orlovmg(at)gmail(dot)com> wrote:
> Quite an interesting patch, in my opinion. I've decided to work on it a bit, did some refactoring (sorry) and add
> basic tests. Also, I try to take into account as much as possible notes on the patch, mentioned by Cédric Villemain.

Thanks! Unfortunately I don't think it's possible to include a
regression test that looks at the output, because it'd be
non-deterministic. Any other backend could pin or dirty the buffer
you try to evict, changing the behaviour.

> > and maybe better to go with FlushOneBuffer() ?
> It's a good idea, but I'm not sure at the moment. I'll try to dig some deeper into it. At least, FlushOneBuffer does
> not work for a local buffers. So, we have to decide whatever pg_buffercache_invalidate should or should not
> work for local buffers. For now, I don't see why it should not. Maybe I miss something?

I think it's OK to ignore local buffers for now. pg_buffercache
generally doesn't support/show them so I don't feel inclined to
support them for this. I removed a few traces of local support.

It didn't seem appropriate to use the pg_monitor role for this, so I
made it superuser-only. I don't think it makes much sense to use this
on any kind of production system so I don't think we need a new role
for it, and existing roles don't seem too appropriate. pageinspect et
al use the same approach.

I added a VOLATILE qualifier to the function.

I added some documentation.

I changed the name to pg_buffercache_evict().

I got rid of the 'force' flag which was used to say 'I only want to
evict this buffer it is clean'. I don't really see the point in that,
we might as well keep it simple. You could filter buffers on
"isdirty" if you want.

I added comments to scare anyone off using EvictBuffer() for anything
much, and marking it as something for developer convenience. (I am
aware of an experimental patch that uses this same function as part of
a buffer pool resizing operation, but that has other infrastructure to
make that safe and would adjust those remarks accordingly.)

I wondered whether it should really be testing for BM_TAG_VALID
rather than BM_VALID. Arguably, but it doesn't seem important for
now. The distinction would arise if someone had tried to read in a
buffer, got an I/O error and abandoned ship, leaving a buffer with a
valid tag but not valid contents. Anyone who tries to ReadBuffer() it
will then try to read it again, but in the meantime this function
won't be able to evict it (it'll just return false). Doesn't seem
that obvious to me that this obscure case needs to be handled. That
doesn't happen *during* a non-error case, because then it's pinned and
we already return false in this code for pins.

I contemplated whether InvalidateBuffer() or InvalidateVictimBuffer()
would be better here and realised that Andres's intuition was probably
right when he suggested the latter up-thread. It is designed with the
right sort of arbitrary concurrent activity in mind, where the former
assumes things about locking and dropping, which could get us into
trouble if not now maybe in the future.

I ran the following diabolical buffer blaster loop while repeatedly
running installcheck:

do
$$
begin
loop
perform pg_buffercache_evict(bufferid)
from pg_buffercache
where random() <= 0.25;
end loop;
End;
$$;

The only ill-effect was a hot laptop.

Thoughts, objections, etc?

Very simple example of use:

create or replace function uncache_relation(name text)
returns boolean
begin atomic;
select bool_and(pg_buffercache_evict(bufferid))
from pg_buffercache
where reldatabase = (select oid
from pg_database
where datname = current_database())
and relfilenode = pg_relation_filenode(name);
end;

More interesting for those of us hacking on streaming I/O stuff was
the ability to evict just parts of things and see how the I/O merging
and I/O depth react.

Attachment Content-Type Size
v5-0001-Add-pg_buffercache_evict-function-for-testing.patch application/octet-stream 9.4 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2024-04-04 03:28:58 Re: Popcount optimization using AVX512
Previous Message Michał Kłeczek 2024-04-04 03:20:59 Re: Is it safe to cache data by GiST consistent function