Re: pgsql: Fix freezing of a dead HOT-updated tuple

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, "Wood, Dan" <hexpert(at)amazon(dot)com>, pgsql-committers <pgsql-committers(at)postgresql(dot)org>, "Wong, Yi Wen" <yiwong(at)amazon(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pgsql: Fix freezing of a dead HOT-updated tuple
Date: 2017-11-02 16:47:26
Message-ID: CAH2-Wzn42P7Vo7tb9iWJDBDmzmboPXRGeP1AabFQ_9VYyL3isQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-committers pgsql-hackers

On Thu, Nov 2, 2017 at 4:20 AM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> I think the problem is on the pruning, rather than the freezing side. We
> can't freeze a tuple if it has an alive predecessor - rather than
> weakining this, we should be fixing the pruning to not have the alive
> predecessor.

Excellent catch.

> If the update xmin is actually below the cutoff we can remove the tuple
> even if there's live lockers - the lockers will also be present in the
> newer version of the tuple. I verified that for me that fixes the
> problem. Obviously that'd require some comment work and more careful
> diagnosis.

I didn't even know that that was safe.

> I think a5736bf754c82d8b86674e199e232096c679201d might be dangerous in
> the face of previously corrupted tuple chains and pg_upgraded clusters -
> it can lead to tuples being considered related, even though they they're
> from entirely independent hot chains. Especially when upgrading 9.3 post
> your fix, to current releases.

Frankly, I'm relieved that you got to this. I was highly suspicious of
a5736bf754c82d8b86674e199e232096c679201d, even beyond my specific,
actionable concern about how it failed to handle the
9.3/FrozenTransactionId xmin case as special. As I went into in the
"heap/SLRU verification, relfrozenxid cut-off, and freeze-the-dead
bug" thread, these commits left us with a situation where there didn't
seem to be a reliable way of knowing whether or not it is safe to
interrogate clog for a given heap tuple using a tool like amcheck.
And, it wasn't obvious that you couldn't have a codepath that failed
to account for pre-cutoff non-frozen tuples -- codepaths that call
TransactionIdDidCommit() despite it actually being unsafe.

If I'm not mistaken, your proposed fix restores sanity there.

--
Peter Geoghegan

In response to

Browse pgsql-committers by date

  From Date Subject
Next Message Tom Lane 2017-11-02 16:54:49 pgsql: Fix corner-case errors in brin_doupdate().
Previous Message Peter Eisentraut 2017-11-02 16:45:42 pgsql: Remove wal_keep_segments from default configuration in PostgresN

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2017-11-02 16:48:11 Re: Removing wal_keep_segments as default configuration in PostgresNode.pm
Previous Message Peter Eisentraut 2017-11-02 16:47:05 Re: Removing wal_keep_segments as default configuration in PostgresNode.pm