Re: xid wraparound danger due to INDEX_CLEANUP false

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: xid wraparound danger due to INDEX_CLEANUP false
Date: 2020-11-20 21:21:42
Message-ID: CAH2-Wz=Tyx51UmmPWD=rUDdfsXCRXKgBO83nfmuy5GL6AjOeFA@mail.gmail.com
Lists: pgsql-hackers

On Fri, Nov 20, 2020 at 12:04 PM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> That's an interesting idea. We should think about the needs of brin
> indexes when designing something better than the current system. They
> have the interesting property that the heap deciding to change LP_DEAD
> to LP_UNUSED doesn't break anything even if nothing's been done to the
> index, because they don't store TIDs anyway. So that's an example of
> an index AM that might want to do some work to keep performance up,
> but it's not actually required. This might be orthogonal to the
> 0.0-1.0 scale you were thinking about, but it might be good to factor
> it into the thinking somehow.

I actually made this exact suggestion about BRIN myself, several years ago.

As I've said, it seems like it would be a good idea to ask the exact
same generic question of each index in turn (which is answered using
local smarts added to the index AM). Again, the question is: How
important is it that you get vacuumed now, from your own
narrow/selfish point of view? The way that BRIN answers this question
is not the novel thing about BRIN among other index access methods,
though. (Not that you claimed otherwise -- just framing the discussion
carefully.)
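
To make the "same generic question" idea concrete, here is a minimal
sketch in Python (not PostgreSQL code). Everything in it is
hypothetical: the IndexStats fields, the per-AM functions, and the
dead-entry ratio used as a score are assumptions for illustration
only. The only thing taken from the discussion is the shape: one
question per index, answered by AM-local logic on a 0.0-1.0 scale.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class IndexStats:
    """Hypothetical per-index facts an AM might consult."""
    am: str                  # access method name, e.g. "btree"
    dead_tids_estimate: int  # estimated entries pointing at dead items
    index_tuples: int        # total entries in the index


def btree_urgency(stats: IndexStats) -> float:
    # Assumption: urgency grows with the fraction of dead entries.
    if stats.index_tuples == 0:
        return 0.0
    return stats.dead_tids_estimate / stats.index_tuples


def brin_urgency(stats: IndexStats) -> float:
    # Placeholder: BRIN stores no TIDs, so this sketch has it report
    # no urgency. A real AM could still ask for maintenance work.
    return 0.0


URGENCY_CALLBACKS: Dict[str, Callable[[IndexStats], float]] = {
    "btree": btree_urgency,
    "brin": brin_urgency,
}


def vacuum_urgency(stats: IndexStats) -> float:
    """Ask the index's AM the generic question; clamp to [0.0, 1.0]."""
    raw = URGENCY_CALLBACKS[stats.am](stats)
    return max(0.0, min(1.0, raw))
```

The caller never needs to know which AM it is talking to; it only
sees a number, which is the point of asking the same question of
each index in turn.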

BRIN has no selfish reason to care if the table never gets to have its
LP_DEAD line pointers set to LP_UNUSED -- that's just not something
that it can be expected to understand directly. But all index access
methods should be thought of as not caring about this, because it's
just not their problem. (Especially with bottom-up index deletion, but
even without it.)

The interesting and novel thing about BRIN here is this: vacuumlazy.c
can be taught that a BRIN index alone is no reason to have to do a
second pass over the heap (to make the LP_DEAD/pruned-by-VACUUM line
pointers LP_UNUSED). A BRIN index never gets affected by the usual
considerations about the heapam invariant (the usual thing about TIDs
in an index not pointing to a line pointer that is at risk of being
recycled), which presents us with a unique-to-BRIN opportunity. Which
is exactly what you said.

(***Thinks some more***)

Actually, now I think that BRIN shouldn't be special to vacuumlazy.c
in any way. It doesn't make sense as part of this future world in
which index vacuuming can be skipped for individual indexes (which is
what I talked to Sawada-san about a little earlier in this thread).
Why should it be useful to exploit the "no-real-TIDs" property of BRIN
in this future world? It can only solve a problem that the main
enhancement is itself expected to solve without any special help from
BRIN (just the generic am callback that asks the same generic question
about index vacuuming urgency).

The only reason we press ahead with a second scan (the
LP_DEAD-to-LP_UNUSED thing) in this ideal world is a heap/table
problem. The bloat eventually gets out of hand *in the table*. We have
now conceptually decoupled the problems experienced in the table/heap
from the problems for each index (mostly), so this actually makes
sense. The theory behind AV scheduling becomes much closer to reality
-- by changing the reality! (The need to "prune the table to VACUUM
any one index" notwithstanding -- that's still necessary, of course,
but we still basically decouple table bloat from index bloat at the
conceptual level.)
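
A rough sketch of that decoupling, in Python rather than anything
resembling vacuumlazy.c. The function name, both thresholds, and the
"fraction of LP_DEAD line pointers" input are assumptions made up for
illustration. The sketch also assumes that once the table itself
demands the second pass, every index has to be vacuumed first, so
that no index TID points at a line pointer about to be recycled.

```python
from typing import Dict, List, Tuple


def plan_vacuum(
    index_urgency: Dict[str, float],
    table_dead_lp_fraction: float,
    index_threshold: float = 0.5,   # hypothetical cutoff per index
    table_threshold: float = 0.2,   # hypothetical cutoff for the heap
) -> Tuple[List[str], bool]:
    """Return (indexes to vacuum, whether to do the second heap pass)."""
    # The second pass (LP_DEAD -> LP_UNUSED) is driven by the table alone.
    second_pass = table_dead_lp_fraction >= table_threshold

    if second_pass:
        # Assumed invariant: all indexes are vacuumed before any
        # LP_DEAD line pointer may become LP_UNUSED.
        to_vacuum = sorted(index_urgency)
    else:
        # Otherwise each index is vacuumed only for its own reasons.
        to_vacuum = sorted(
            name for name, score in index_urgency.items()
            if score >= index_threshold
        )
    return to_vacuum, second_pass
```

With {"idx_a": 0.9, "idx_b": 0.1} and little table bloat, only idx_a
is vacuumed and the second pass is skipped; once table bloat crosses
the table threshold, both indexes are vacuumed and the second pass
runs.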

Does that make sense?
--
Peter Geoghegan
