Re: 64-bit XIDs in deleted nbtree pages

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Subject: Re: 64-bit XIDs in deleted nbtree pages
Date: 2021-03-24 03:09:59
Message-ID: CAH2-WzkbDYsbmb+nu3PSQXTByqTZ=89cw0sKj3a799NYZ=szzQ@mail.gmail.com
Lists: pgsql-hackers

On Tue, Mar 23, 2021 at 8:14 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> With this patch series, btree indexes have become like hash indexes
> in terms of amvacuumcleanup: we only do an index scan in
> btvacuumcleanup() in two cases, a metapage upgrade, or more than 5%
> of pages deleted but not yet recycled. Both cases seem rare. So do
> we want to disable parallel index cleanup for btree indexes, as we
> do for hash indexes? That is, remove
> VACUUM_OPTION_PARALLEL_COND_CLEANUP from amparallelvacuumoptions.

My recent "Recycle nbtree pages deleted during same VACUUM" commit
improved the efficiency of recycling, but I still think that it was a
bit of a hack. Or at least it didn't go far enough in fixing the old
design, which is itself a bit of a hack.

As I said back on February 15, a truly good design for nbtree page
deletion + recycling would have crash safety built in. If page
deletion itself is crash safe, it really makes sense to make
everything crash safe (especially because we're managing large chunks
of equisized free space, unlike in heapam). And as I also said back
then, a 100% crash-safe design could naturally shift the problem of
nbtree page recycle safety from the producer/VACUUM side to the
consumer/_bt_getbuf() side. Recycle safety should be completely
decoupled from when VACUUM runs, and from whatever VACUUM happens to
discover about recycle safety in passing at the end of its scan.
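
To make that concrete: most of the consumer-side machinery already
exists in _bt_getbuf()'s P_NEW loop. Here is a rough sketch of that
loop (paraphrased from memory, so names and details are approximate,
not the exact code). The design shift would be that VACUUM could put
deleted pages in the FSM immediately, leaving this check as the sole
recycle-safety gate:

    /* Sketch only -- approximates _bt_getbuf()'s existing P_NEW loop */
    for (;;)
    {
        blkno = GetFreeIndexPage(rel);
        if (blkno == InvalidBlockNumber)
            break;                  /* FSM is empty; extend the relation */

        buf = ReadBuffer(rel, blkno);
        if (_bt_conditionallockbuf(rel, buf))
        {
            page = BufferGetPage(buf);

            /*
             * The deleted page's full 64-bit XID tells us whether any
             * transaction that might still hold a stale link to the
             * page could be running.  With XID wraparound out of the
             * picture, this test is valid whenever it happens to run
             * -- no dependence on when VACUUM last saw the page.
             */
            if (BTPageIsRecyclable(page))
            {
                /* Safe to recycle this page now */
                _bt_pageinit(page, BufferGetPageSize(buf));
                return buf;
            }
            _bt_relbuf(rel, buf);   /* not yet safe to reuse; skip it */
        }
        else
            ReleaseBuffer(buf);     /* couldn't get lock; just drop pin */
    }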

That approach would completely eliminate the need to do any work in
btvacuumcleanup(), which would make it natural to remove
VACUUM_OPTION_PARALLEL_COND_CLEANUP from nbtree. The implementation
of btvacuumcleanup() would then look just like hashvacuumcleanup()
does now: it could do practically nothing, making this 100% okay.
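
For reference, hashvacuumcleanup() is currently just this (lightly
paraphrased from hash.c, so treat it as a sketch rather than the
exact code):

    IndexBulkDeleteResult *
    hashvacuumcleanup(IndexVacuumInfo *info, IndexBulkDeleteResult *stats)
    {
        Relation    rel = info->index;

        /*
         * If hashbulkdelete wasn't called, return NULL signifying no
         * change (this also covers the analyze_only case)
         */
        if (stats == NULL)
            return NULL;

        /* Update statistics -- the only real work done here */
        stats->num_pages = RelationGetNumberOfBlocks(rel);

        return stats;
    }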

For now, though, I doubt that it is appropriate to make this change.
The question of whether or not VACUUM_OPTION_PARALLEL_COND_CLEANUP
should be used seems to be basically the same question as "does the
vacuumcleanup() callback for this index AM look exactly like
hashvacuumcleanup()?".
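
To spell out what the proposed change amounts to (I'm paraphrasing
the two handler functions from memory, but this is the whole diff):

    /* nbtree.c, bthandler() -- today */
    amroutine->amparallelvacuumoptions =
        VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP;

    /* hash.c, hashhandler() -- what nbtree would match under the proposal */
    amroutine->amparallelvacuumoptions =
        VACUUM_OPTION_PARALLEL_BULKDEL;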

> IMO we can live with the current configuration, just in case the
> user runs into one of these rare situations (especially the latter).
> In most cases parallel vacuum workers for index cleanup will exit as
> a no-op, and the side effects (wasted resources, overhead, etc.)
> would not be big. If we wanted to enable it only in particular
> cases, we would need another way for the index AM to tell lazy
> vacuum whether or not to allow a parallel worker to process the
> index at that time. What do you think?

I am concerned about unintended consequences, like never noticing
that we really should recycle known-deleted pages not yet placed in
the FSM (it's hard to think through very rare cases like this with
confidence). Is it really so bad if a parallel VACUUM launches
workers that it doesn't strictly need?

--
Peter Geoghegan
