Re: snapshot too old issues, first around wraparound and then more.

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Stephen Frost <sfrost(at)snowman(dot)net>, Greg Stark <stark(at)mit(dot)edu>, Noah Misch <noah(at)leadboat(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Kevin Grittner <kgrittn(at)gmail(dot)com>
Subject: Re: snapshot too old issues, first around wraparound and then more.
Date: 2021-06-16 19:08:18
Message-ID: CAH2-WzkwDC9UXiMr7YCHvMzw9Xq92r4cVuK2irrAeGWA621Cfg@mail.gmail.com
Lists: pgsql-hackers

On Wed, Jun 16, 2021 at 11:27 AM Andres Freund <andres(at)anarazel(dot)de> wrote:
> 2) Modeling when it is safe to remove row versions. It is easy to remove
> a tuple that was inserted and deleted within one "not needed" xid
> range, but it's far less obvious when it is safe to remove row
> versions where prior/later row versions are outside of such a gap.
>
> Consider e.g. an update chain where the oldest snapshot can see one
> row version, then there is a chain of rows that could be vacuumed
> except for the old snapshot, and then there's a live version. If the
> old session updates the row version that is visible to it, it needs
> to be able to follow the xid chain.
>
> This seems hard to solve in general.

As I've said to you before, I think that it would make sense to solve
the problem inside heap_index_delete_tuples() first (for index tuple
deletion) -- implement an advanced version for heap pruning later.
That gives users a significant benefit without requiring that you
solve this hard problem with xmin/xmax and update chains.

I don't think that it matters that index AMs still only have LP_DEAD
bits set when tuples are dead to all snapshots including the oldest.
Now that we can batch TIDs within each call to
heap_index_delete_tuples() to pick up "extra" deletable TIDs from the
same heap blocks, we'll often be able to delete a significant number
of extra index tuples whose TIDs are in a "not needed" range. Whereas
today, without the "not needed" range mechanism in place, we just
delete the index tuples that are LP_DEAD-set already, plus maybe a few
others ("extra index tuples" that are not even needed by the oldest
snapshot) -- but that's it.

We might lose our chance to delete the nearby index tuples forever,
just because we missed the opportunity once. Recall that the
LP_DEAD bit being set for an index tuple isn't just information about
the index tuple in Postgres 14+ -- it also suggests that the *heap
block* has many more index tuples that we can delete that aren't
LP_DEAD set in the index. And so nbtree will check those extra nearby
TIDs out in passing within heap_index_delete_tuples(). We currently
lose this valuable hint about the heap block forever if we delete the
LP_DEAD-set index tuples, unless we get lucky and somebody sets the
LP_DEAD bit on a few more index tuples for the same heap blocks before
the next time the leaf page fills up (and heap_index_delete_tuples()
must be called).
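
To make the "lost hint" concrete, here is a toy model in Python. It is
only a sketch of the idea, not the actual heap_index_delete_tuples()
code: IndexTuple, pick_deletable() and is_dead() are hypothetical names
made up for illustration, and the heap-side visibility check is reduced
to a set lookup.

```python
from collections import namedtuple

# Hypothetical toy type; this is not a PostgreSQL structure.
IndexTuple = namedtuple("IndexTuple", ["heap_block", "heap_offset", "lp_dead"])


def pick_deletable(leaf_page, heap_is_dead):
    """Return the index tuples on leaf_page that one batched pass deletes.

    leaf_page    -- list of IndexTuple on a single leaf page
    heap_is_dead -- callable (block, offset) -> bool standing in for the
                    heap-side check (an assumption of this sketch)
    """
    # Heap blocks that at least one LP_DEAD-set index tuple points to.
    # These are the only blocks this pass visits.
    hinted_blocks = {t.heap_block for t in leaf_page if t.lp_dead}

    deletable = []
    for t in leaf_page:
        if t.lp_dead:
            # Already marked dead in the index.
            deletable.append(t)
        elif t.heap_block in hinted_blocks and heap_is_dead(t.heap_block, t.heap_offset):
            # "Extra" TID: not LP_DEAD-set, but on a block we visit anyway.
            deletable.append(t)
    return deletable


# Heap TIDs that are dead in this made-up scenario.
dead = {(7, 1), (7, 2), (9, 1)}


def is_dead(block, offset):
    return (block, offset) in dead


page = [
    IndexTuple(7, 1, True),   # LP_DEAD-set: this is the hint about block 7
    IndexTuple(7, 2, False),  # dead in the heap, picked up in passing
    IndexTuple(7, 3, False),  # still needed at the time of the first pass
    IndexTuple(9, 1, False),  # dead, but nothing points us at block 9
]

first = pick_deletable(page, is_dead)
remaining = [t for t in page if t not in first]

# Later, (7, 3) becomes dead in the heap, but no LP_DEAD bit is set for it.
dead.add((7, 3))
second = pick_deletable(remaining, is_dead)
```

The first pass deletes (7, 1) and (7, 2). The second pass deletes
nothing, even though both remaining tuples are dead by then: once the
LP_DEAD-set tuple is gone, nothing on the page points at block 7 any
more.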

--
Peter Geoghegan
