Re: New IndexAM API controlling index vacuum strategies

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Noah Misch <noah(at)leadboat(dot)com>
Subject: Re: New IndexAM API controlling index vacuum strategies
Date: 2021-03-16 00:00:33
Message-ID: CAH2-WzkRYCXscEkfzO6gqf4fu-SXy-OkiJ2LyJxaE2_UYsY2+Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Mar 15, 2021 at 4:11 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > > I'm not comfortable with this change without adding more safety
> > > checks. If there's ever a case in which the HEAPTUPLE_DEAD case is hit
> > > and the xid needs to be frozen, we'll either cause errors or
> > > corruption. Yes, that's already the case with params->index_cleanup ==
> > > DISABLED, but that's not that widely used.
> >
> > I noticed that Noah's similar 2013 patch [1] added a defensive
> > heap_tuple_needs_freeze() + elog(ERROR) to the HEAPTUPLE_DEAD case. I
> > suppose that that's roughly what you have in mind here?
>
> I'm not sure that's sufficient. If the case is legitimately reachable
> (I'm maybe 60% is not, after staring at it for a long time, but ...),
> then we can't just error out when we didn't so far.

If you're only 60% sure that the heap_tuple_needs_freeze() error thing
doesn't break anything that should work by now then it seems unlikely
that you'll ever get past 90% sure. I think that we should make a
conservative assumption that the defensive elog(ERROR) won't be
sufficient, and proceed on that basis.

> I kinda wonder whether this case should just be handled by just gotoing
> back to the start of the blkno loop, and redoing the pruning. The only
> thing that makes that a bit more complicatd is that we've already
> incremented vacrelstats->{scanned_pages,vacrelstats->tupcount_pages}.

That seems like a good solution to me -- this is a very seldom hit
path, so we can be a bit inefficient without it mattering.

It might make sense to *also* check some things (maybe using
heap_tuple_needs_freeze()) in passing, just for debugging purposes.

> We really should put the per-page work (i.e. the blkno loop body) of
> lazy_scan_heap() into a separate function, same with the
> too-many-dead-tuples branch.

+1.

BTW I've noticed that the code (and code like it) tends to confuse
things that the current VACUUM performed versus things by *some*
VACUUM (that may or may not be current one). This refactoring might be
a good opportunity to think about that as well.

> > * It is assumed that the caller has checked the tuple with
> > * HeapTupleSatisfiesVacuum() and determined that it is not HEAPTUPLE_DEAD
> > * (else we should be removing the tuple, not freezing it).
> >
> > Does that need work too?
>
> I'm pretty scared of the index-cleanup-disabled path, for that reason. I
> think the hot path is more likely to be unproblematic, but I'd not bet
> my (nonexistant) farm on it.

Well if we can solve the problem by simply doing pruning once again in
the event of a HEAPTUPLE_DEAD return value from the lazy_scan_heap()
HTSV call, then the comment becomes 100% true (which is not the case
even today).

--
Peter Geoghegan

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2021-03-16 00:30:18 Re: [PATCH]: Allow errors in parameter values to be reported during the BIND phase itself..
Previous Message Masahiro Ikeda 2021-03-15 23:50:25 make the stats collector shutdown without writing the statsfiles if the immediate shutdown is requested.