Re: [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: Peter Geoghegan <pg(at)bowt(dot)ie>, "Wong, Yi Wen" <yiwong(at)amazon(dot)com>, "Wood, Dan" <hexpert(at)amazon(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple
Date: 2017-10-06 07:12:47
Message-ID: CAB7nPqRrCA=394EQ4YhJ1bEZBy-R0-_ZL7ofqL95WeHZv_GrTw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-committers pgsql-hackers

On Fri, Oct 6, 2017 at 1:24 AM, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> wrote:
> I think this is the patch for 9.3. I ran the test a few hundred times
> (with some additional changes such as randomly having an update inside a
> savepoint that's randomly aborted, randomly aborting the transaction,
> randomly skipping the for key share lock, randomly sleeping at a few
> points; and also adding filler columns, reducing fillfactor and using
> 5, 50, 180, 250, 500 sessions after verifying that it causes the tuples
> to stay in the same page or migrate to later pages). The final REINDEX
> has never complained again about failing to find the root tuple. I hope
> it's good now.

I have looked and played with your patch as well for a couple of
hours, and did not notice any failures again. The structure of the
pages looked sane as well, I could see also with pageinspect a correct
HOT chain using the first set of tests provided on this thread.

> The attached patch needs a few small tweaks, such as improving
> commentary in the new function, maybe turn it into a macro (otherwise I
> think it could be bad for performance; I'd like a static func but not
> sure those are readily available in 9.3), change the XID comparison to
> use the appropriate macro rather than ==, and such.

Yeah that would be better.

> Regarding changes of xmin/xmax comparison, I also checked manually the
> spots I thought should be modified and later double-checked against the
> list that Michael posted.

Thanks. I can see see that your patch is filling the holes.

> It's a match, except for rewriteheap.c which
> I cannot make heads or tails about. (I think it's rather unfortunate
> that it sticks a tuple's Xmax into a field that's called Xmin, but let's
> put that aside). Maybe there's a problem here, maybe there isn't.

rewrite_heap_tuple is used only by CLUSTER, and the oldest new tuples
are frozen before being rewritten. So this looks safe to me at the
end.

> I'm now going to forward-port this to 9.4.

+ /*
+ * If the xmax of the old tuple is identical to the xmin of the new one,
+ * it's a match.
+ */
+ if (xmax == xmin)
+ return true;
I would use TransactionIdEquals() here, to remember once you switch
that to a macro.

+/*
+ * Given a tuple, verify whether the given Xmax matches the tuple's Xmin,
+ * taking into account that the Xmin might have been frozen.
+ */
[...]
+ /*
+ * We actually don't know if there's a match, but if the previous tuple
+ * was frozen, we cannot really rely on a perfect match.
+ */
--
Michael

In response to

Responses

Browse pgsql-committers by date

  From Date Subject
Next Message Noah Misch 2017-10-06 07:30:55 Re: pgsql: Remove ICU tests from default run
Previous Message Tom Lane 2017-10-06 04:07:27 Re: pgsql: Use OpenSSL EVP API for symmetric encryption in pgcrypto.

Browse pgsql-hackers by date

  From Date Subject
Next Message Noah Misch 2017-10-06 07:30:55 Re: pgsql: Remove ICU tests from default run
Previous Message Etsuro Fujita 2017-10-06 06:49:10 Re: Comment typo