Re: xid wraparound danger due to INDEX_CLEANUP false

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: xid wraparound danger due to INDEX_CLEANUP false
Date: 2020-06-26 00:18:22
Message-ID: CAH2-Wzm2WM9K6eZ3g+ccVMgWyd8VhiOtwLYJVQkKp+BX4Wn8dA@mail.gmail.com
Lists: pgsql-hackers

On Thu, Jun 25, 2020 at 6:59 AM Masahiko Sawada
<masahiko(dot)sawada(at)2ndquadrant(dot)com> wrote:
> I think that with the approach implemented in my patch, it could be a
> problem for the user that the user cannot easily know in advance
> whether vacuum with INDEX_CLEANUP false will perform index cleanup,
> even if page deletion doesn’t happen in most cases.

I was unclear. I agree that the VACUUM command with "INDEX_CLEANUP =
off" is an emergency mechanism that should be fully respected, even
when that means that we'll leak deleted pages.

Perhaps it would make sense to behave differently when the index is on
a table that has "vacuum_index_cleanup = off" set, and the vacuum is
started by autovacuum, and is not an anti-wraparound vacuum. That
doesn't seem all that appealing now that I write it down, though,
because it's a non-obvious behavioral difference among cases that
users probably expect to behave similarly. On the other hand, what
user knows that there is something called an aggressive vacuum, which
isn't exactly the same thing as an anti-wraparound vacuum?
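
The conditional behavior floated above can be written out as a small predicate. This is only a toy model in Python, not PostgreSQL code; the class and every field and function name are hypothetical, invented for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VacuumContext:
    # All field names are hypothetical and exist only for this sketch.
    explicit_index_cleanup_off: bool   # VACUUM command given with INDEX_CLEANUP off
    reloption_index_cleanup_off: bool  # table has vacuum_index_cleanup = off set
    is_autovacuum: bool                # vacuum was started by autovacuum
    is_anti_wraparound: bool           # vacuum is an anti-wraparound vacuum


def may_override_skip(ctx: VacuumContext) -> bool:
    """Return True if index cleanup might still run despite the 'off' setting."""
    if ctx.explicit_index_cleanup_off:
        # The explicit command is the emergency mechanism: always respected,
        # even if that means deleted pages are leaked.
        return False
    # The case discussed above: reloption set, autovacuum, not anti-wraparound.
    return (
        ctx.reloption_index_cleanup_off
        and ctx.is_autovacuum
        and not ctx.is_anti_wraparound
    )


routine_autovacuum = VacuumContext(
    explicit_index_cleanup_off=False,
    reloption_index_cleanup_off=True,
    is_autovacuum=True,
    is_anti_wraparound=False,
)
emergency_manual = VacuumContext(
    explicit_index_cleanup_off=True,
    reloption_index_cleanup_off=False,
    is_autovacuum=False,
    is_anti_wraparound=False,
)
```

Written this way, the non-obvious part is easy to see: two vacuums of the same table can differ only in is_anti_wraparound and still behave differently.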

I find it hard to decide what the least-worst thing is for the
backbranches. What do you think?

> I haven’t come up with a good solution to keep us safe against XID
> wraparound yet, but it seems to me that it’s better to have an option
> that forces index cleanup not to happen.

I don't think that there is a good solution that is suitable for
backpatching. The real solution is to redesign the recycling along the
lines I described.

I don't think that it's terrible that we can leak deleted pages,
especially considering the way that users are expected to use the
INDEX_CLEANUP feature. I would like to be sure that the problem is
well understood, though -- we should at least have a plan for Postgres
v14.

> I thought that btbulkdelete and/or btvacuumcleanup can register an
> autovacuum work item to recycle the page that gets deleted but it
> might not be able to recycle those pages enough because the autovacuum
> work items could be taken just after vacuum. And if page deletion is
> a relatively rare case in practice, we might be able to take an
> optimistic approach that vacuum registers deleted pages to FSM on the
> deletion and a process that takes a free page checks if the page is
> really recyclable. Anyway, I’ll try to think more about this.
> really recyclable. Anyway, I’ll try to think more about this.

Right -- just putting the pages in the FSM immediately, and making it
a problem that we deal with within _bt_getbuf() is an alternative
approach that might be better.
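
A rough way to picture that optimistic approach is a toy model like the one below. It is plain Python, not nbtree code: ToyIndex, its methods, and the oldest_xmin parameter are hypothetical names, and the model assumes each deleted page is stamped with an XID that must be older than every running transaction's horizon before the page can be reused. The comparison is circular over 32 bits, which is also why a stamp left in place for too long eventually stops comparing correctly.

```python
def xid_precedes(a: int, b: int) -> bool:
    """Circular 32-bit comparison: True if XID a is logically older than b."""
    return ((a - b) & 0xFFFFFFFF) >= 0x80000000


class ToyIndex:
    """Hypothetical model: deleted pages go into the FSM right away."""

    def __init__(self, npages: int):
        self.npages = npages   # current size of the index in pages
        self.deleted_xid = {}  # page number -> XID stamped at deletion
        self.fsm = []          # free space map: candidate page numbers

    def delete_page(self, page: int, xid: int) -> None:
        self.deleted_xid[page] = xid
        self.fsm.append(page)  # optimistic: register immediately

    def get_free_page(self, oldest_xmin: int) -> int:
        """Return a reusable page, checking recyclability at allocation time."""
        not_yet_safe = []
        result = None
        while self.fsm:
            page = self.fsm.pop(0)
            if xid_precedes(self.deleted_xid[page], oldest_xmin):
                del self.deleted_xid[page]
                result = page
                break
            not_yet_safe.append(page)  # some scan might still reach it
        # Pages that failed the check stay registered for a later attempt.
        self.fsm = not_yet_safe + self.fsm
        if result is None:
            result = self.npages       # nothing safe: extend the index
            self.npages += 1
        return result


idx = ToyIndex(npages=10)
idx.delete_page(3, xid=100)
idx.delete_page(7, xid=50)
first = idx.get_free_page(oldest_xmin=80)    # page 3 is too new, page 7 is safe
second = idx.get_free_page(oldest_xmin=80)   # only page 3 left, still unsafe
third = idx.get_free_page(oldest_xmin=101)   # page 3 is finally recyclable
```

In this model the cost of the check moves from vacuum to whoever asks for a free page, and a page that is not yet safe simply stays in the FSM until a later request finds it recyclable.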

--
Peter Geoghegan
