Re: [Patch] Optimize dropping of relation buffers using dlist

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, "k(dot)jamison(at)fujitsu(dot)com" <k(dot)jamison(at)fujitsu(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [Patch] Optimize dropping of relation buffers using dlist
Date: 2020-08-07 16:26:49
Message-ID: CA+TgmobrvXOq1E96igua-zzw1TraqHbHOfkb+469ynYtsH__pQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Aug 7, 2020 at 12:09 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> As for (1), maybe we don't need to keep the info in shmem. I'll just
> point out that the checkpointer has *already got* a complete list of all
> recently-dropped relations, cf pendingUnlinks in sync.c. So you could
> imagine looking aside at that to discover that a dirty buffer belongs to a
> recently-dropped relation. pendingUnlinks would need to be converted to a
> hashtable to make searches cheap, and it's not very clear what to do in
> backends that haven't got access to that table, but maybe we could just
> accept that backends that are forced to flush dirty buffers might do some
> useless writes in such cases.

I don't see how that can work. It's not that the writes are useless;
it's that they will fail outright because the file doesn't exist.

> As for (2), the reason why we have that list is that the physical unlink
> doesn't happen till after the next checkpoint. So with some messing
> around here, we could probably guarantee that every buffer belonging
> to the relation has been scanned and deleted before the file unlink
> happens --- and then, even if the OID counter has wrapped around, the
> OID won't be reassigned to a new relation before that happens.

This seems right to me, though.

> In short, it seems like maybe we could shove the responsibility for
> cleaning up dropped relations' buffers onto the checkpointer without
> too much added cost. A possible problem with this is that recycling
> of those buffers will happen much more slowly than it does today,
> but maybe that's okay?

I suspect it's going to be advantageous to try to make cleaning up
dropped buffers quick in normal cases and allow it to fall behind only
when someone is dropping a lot of relations in quick succession, so
that buffer eviction remains cheap in normal cases. I hadn't thought
about the possible negative performance consequences of failing to put
buffers on the free list, but that's another reason to try to make it
fast.

My viewpoint on this is - I have yet to see anybody really get hosed
because they drop one relation and that causes a full scan of
shared_buffers. I mean, it's slightly expensive, but computers are
fast. Whatever. What hoses people is dropping a lot of relations in
quick succession, either by spamming DROP TABLE commands or by running
something like DROP SCHEMA, and then suddenly they're scanning
shared_buffers over and over again, and their standby is doing the
same thing, and now it hurts. The problem on the standby is actually
worse than the problem on the primary, because the primary can do
other things while one process sits there and thinks about
shared_buffers for a long time, but the standby can't, because the
startup process is all there is.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2020-08-07 16:27:01 Re: WIP: BRIN multi-range indexes
Previous Message Robert Haas 2020-08-07 16:13:55 Re: Parallel worker hangs while handling errors.