Re: [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple

From: "Wood, Dan" <hexpert(at)amazon(dot)com>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: Peter Geoghegan <pg(at)bowt(dot)ie>, "Wong, Yi Wen" <yiwong(at)amazon(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple
Date: 2017-10-11 02:31:13
Message-ID: 2BD5E9A0-4166-4CD6-8A66-2293AFE533AE@amazon.com
Lists: pgsql-committers pgsql-hackers

I found one glitch in our merge of the original duplicate-row fix. With that corrected and Alvaro’s Friday fix applied, things are solid.
No dups. No index corruption.

Thanks so much.

On 10/10/17, 7:25 PM, "Michael Paquier" <michael(dot)paquier(at)gmail(dot)com> wrote:

On Tue, Oct 10, 2017 at 11:14 PM, Alvaro Herrera
<alvherre(at)alvh(dot)no-ip(dot)org> wrote:
> I was seeing just the reindex problem. I don't see any more dups.
>
> But I've tried to reproduce it afresh now, and let it run for a long
> time and nothing happened. Maybe I made a mistake last week and
> ran an unfixed version. I don't see any more problems now.

Okay, so that's one more person following this trend, making three with
Peter and me.

>> If you are getting the dup rows consider the code in the block in
>> heapam.c that starts with the comment “replace multi by update xid”.
>>
>> When I repro this I find that MultiXactIdGetUpdateXid() returns 0.
>> There is an updater in the multixact array; however, its status is
>> MultiXactStatusForNoKeyUpdate and not MultiXactStatusNoKeyUpdate. I
>> assume this is a preliminary status before the following row in the
>> HOT chain has its multixact set to NoKeyUpdate.
>
> Yes, the "For" version is the locker version rather than the actual
> update. That lock is acquired by EvalPlanQual locking the row just
> before doing the update. I think GetUpdateXid has no reason to return
> such an Xid, since it's not an update.
>
>> Since 0 is returned, it does precede cutoff_xid, and
>> TransactionIdDidCommit(0) will return false. This ends up aborting
>> the multixact on the row even though the real xid is committed. This
>> sets XMAX to 0 and that row becomes visible as one of the dups.
>> Interestingly the real xid of the updater is 122944 and the cutoff_xid
>> is 122945.
>
> I haven't seen this effect. Please keep us updated if you're able to
> verify corruption this way.

Me neither. It would be nice to not live long with such a sword of Damocles.
--
Michael
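
For readers following the quoted analysis, the suspected failure mode can be sketched as a small toy model. This is not PostgreSQL code: every type and function below (Member, toy_get_update_xid, toy_precedes, toy_did_commit, toy_freeze_xmax) is a hypothetical stand-in, the plain "<" comparison ignores xid wraparound, and the commit lookup is hard-wired to the xid quoted above. It only illustrates how a lock-only multixact member could yield an update xid of 0, which then looks both older than the cutoff and not committed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t Xid;                 /* toy stand-in for a transaction id */
#define INVALID_XID ((Xid) 0)

/* Toy member statuses: a lock-only "For" status versus a real update. */
typedef enum { FOR_NO_KEY_UPDATE, NO_KEY_UPDATE } MemberStatus;

typedef struct {
    Xid          xid;
    MemberStatus status;
} Member;

/* Toy analogue of an update-xid lookup: lockers are skipped, so a
 * multixact holding only a locker yields INVALID_XID (0). */
Xid toy_get_update_xid(const Member *members, int n)
{
    assert(members != NULL || n == 0);
    for (int i = 0; i < n; i++)
        if (members[i].status == NO_KEY_UPDATE)
            return members[i].xid;
    return INVALID_XID;
}

/* Assumption: plain integer ordering, no wraparound handling. */
bool toy_precedes(Xid a, Xid b) { return a < b; }

/* Assumption: pretend only xid 122944 (the one quoted above) committed. */
bool toy_did_commit(Xid xid) { return xid == 122944; }

/* Returns the xmax this toy freeze logic would leave on the tuple. */
Xid toy_freeze_xmax(const Member *members, int n, Xid cutoff)
{
    Xid update_xid = toy_get_update_xid(members, n);

    /* With update_xid == 0 both tests pass, so the updater is treated
     * as aborted and xmax is cleared: the row becomes visible again. */
    if (toy_precedes(update_xid, cutoff) && !toy_did_commit(update_xid))
        return INVALID_XID;
    return update_xid;
}
```

Calling toy_freeze_xmax() with a single member {122944, FOR_NO_KEY_UPDATE} and a cutoff of 122945 clears xmax in this model, while the same member with NO_KEY_UPDATE keeps it. Whether the real code path behaves this way is exactly what was still unverified in the thread.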
