Re: 64-bit XIDs in deleted nbtree pages

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Subject: Re: 64-bit XIDs in deleted nbtree pages
Date: 2021-03-25 07:03:27
Message-ID: CAD21AoABxdfUs4dJhOmj1JwwEviX7UMaJqijYpxV1G5KJUr45g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Mar 24, 2021 at 12:10 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
>
> On Tue, Mar 23, 2021 at 8:14 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > By this patch series, btree indexes became like hash indexes in terms
> > of amvacuumcleanup. We do an index scan at btvacuumcleanup() in the
> > two cases: metapage upgrading and more than 5%
> > deleted-but-not-yet-recycled pages. Both cases seem rare cases. So do
> > we want to disable parallel index cleanup for btree indexes like hash
> > indexes? That is, remove VACUUM_OPTION_PARALLEL_COND_CLEANUP from
> > amparallelvacuumoptions.
>
> My recent "Recycle nbtree pages deleted during same VACUUM" commit
> improved the efficiency of recycling, but I still think that it was a
> bit of a hack. Or at least it didn't go far enough in fixing the old
> design, which is itself a bit of a hack.
>
> As I said back on February 15, a truly good design for nbtree page
> deletion + recycling would have crash safety built in. If page
> deletion itself is crash safe, it really makes sense to make
> everything crash safe (especially because we're managing large chunks
> of equisized free space, unlike in heapam). And as I also said back
> then, a 100% crash-safe design could naturally shift the problem of
> nbtree page recycle safety from the producer/VACUUM side, to the
> consumer/_bt_getbuf() side. It should be completely separated from
> when VACUUM runs, and what VACUUM can discover about recycle safety in
> passing, at the end.
>
> That approach would completely eliminate the need to do any work in
> btvacuumcleanup(), which would make it natural to remove
> VACUUM_OPTION_PARALLEL_COND_CLEANUP from nbtree -- the implementation
> of btvacuumcleanup() would just look like hashvacuumcleanup() does now
> -- it could do practically nothing, making this 100% okay.
>
> For now I have my doubts that it is appropriate to make this change.
> It seems as if the question of whether or not
> VACUUM_OPTION_PARALLEL_COND_CLEANUP should be used is basically the
> same question as "Does the vacuumcleanup() callback for this index AM
> look exactly like hashvacuumcleanup()?".
>
> > IMO we can live with the current
> > configuration just in case where the user runs into such rare
> > situations (especially for the latter case). In most cases, parallel
> > vacuum workers for index cleanup might exit with no-op but the
> > side-effect (wasting resources and overhead etc) would not be big. If
> > we want to enable it only in particular cases, we would need to have
> > another way for index AM to tell lazy vacuum whether or not to allow a
> > parallel worker to process the index at that time. What do you think?
>
> I am concerned about unintended consequences, like never noticing that
> we should really recycle known deleted pages not yet placed in the FSM
> (it's hard to think through very rare cases like this with
> confidence). Is it really so bad if we launch parallel workers that we
> don't really need for a parallel VACUUM?

I don't think it's too bad even if we launch parallel workers for
indexes that don’t really need to be processed by parallel workers.
Parallel workers exit immediately after all indexes are vacuumed so it
would not affect other parallel operations. There is nothing change in
terms of in terms of DSM usage since btree indexes support parallel
bulkdelete.

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Smith 2021-03-25 07:09:43 Re: [HACKERS] logical decoding of two-phase transactions
Previous Message Kyotaro Horiguchi 2021-03-25 06:40:34 Re: [CLOBBER_CACHE]Server crashed with segfault 11 while executing clusterdb