Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple

From: Andres Freund <andres(at)anarazel(dot)de>
To: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: "Wood, Dan" <hexpert(at)amazon(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, pgsql-committers(at)postgresql(dot)org, "Wong, Yi Wen" <yiwong(at)amazon(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple
Date: 2017-11-14 03:03:41
Message-ID: 20171114030341.movhteyakqeqx5pm@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-committers pgsql-hackers

Hi,

On 2017-11-03 07:53:30 -0700, Andres Freund wrote:
> Here's that patch. I've stared at this some, and Robert did too. Robert
> mentioned that the commit message might need some polish and I'm not
> 100% sure about the error message texts yet.
>
> I'm not yet convinced that the new elog in vacuumlazy can never trigger
> - but I also don't think we want to actually freeze the tuple in that
> case.

I'm fairly sure it could be triggered, therefore I've rewritten that.

I've played around quite some with the attached patch. So far, after
applying the second patch, neither VACUUM nor VACUUM FULL / CLUSTER make
the situation worse for already existing corruption. HOT pruning can
change the exact appearance of existing corruption a bit, but I don't
think it can make the corruption meaningfully worse. It's a bit
annoying and scary to add so many checks to backbranches but it kinda
seems required. The error message texts aren't perfect, but these are
"should never be hit" type elog()s so I'm not too worried about that.

Please review!

One thing I'm wondering about is whether we have any way for users to
diagnose whether they need to dump & restore - I can't really think of
anything actually meaningful. Reindexing will find some instances, but
far from all. VACUUM (disable_page_skipping, freeze) pg_class will
also find a bunch. Not a perfect story.

> Staring at the vacuumlazy hunk I think I might have found a related bug:
> heap_update_tuple() just copies the old xmax to the new tuple's xmax if
> a multixact and still running. It does so without verifying liveliness
> of members. Isn't that buggy? Consider what happens if we have three
> blocks: 1 has free space, two is being vacuumed and is locked, three is
> full and has a tuple that's key share locked by a live tuple and is
> updated by a dead xmax from before the xmin horizon. In that case afaict
> the multi will be copied from the third page to the first one. Which is
> quite bad, because vacuum already processed it, and we'll set
> relfrozenxid accordingly. I hope I'm missing something here?

Trying to write a testcase for that now.

Greetings,

Andres Freund

Attachment Content-Type Size
0001-Fix-pruning-of-locked-and-updated-tuples.patch text/x-diff 11.4 KB
0002-Perform-a-lot-more-sanity-checks-when-freezing-tuple.patch text/x-diff 10.7 KB

In response to

Responses

Browse pgsql-committers by date

  From Date Subject
Next Message Andres Freund 2017-11-14 04:07:01 Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple
Previous Message Michael Paquier 2017-11-14 00:32:51 Re: [COMMITTERS] pgsql: Add hash partitioning.

Browse pgsql-hackers by date

  From Date Subject
Next Message Connor Wolf 2017-11-14 03:08:28 Re: [HACKERS] How to implement a SP-GiST index as a extension module?
Previous Message Masahiko Sawada 2017-11-14 02:56:00 Re: [HACKERS] ginInsertCleanup called from vacuum could still miss tuples to be deleted