Re: [HACKERS] pgsql: Fix freezing of a dead HOT-updated tuple

From: Andres Freund <andres(at)anarazel(dot)de>
To: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>,Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: "Wood, Dan" <hexpert(at)amazon(dot)com>,pgsql-committers(at)postgresql(dot)org,"Wong, Yi Wen" <yiwong(at)amazon(dot)com>,Michael Paquier <michael(dot)paquier(at)gmail(dot)com>,Robert Haas <robertmhaas(at)gmail(dot)com>,Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>,pgsql-hackers(at)postgresql(dot)org
Subject: Re: [HACKERS] pgsql: Fix freezing of a dead HOT-updated tuple
Date: 2017-11-03 20:33:45
Message-ID: 698409B0-5E70-4F88-A07B-7460C674874B@anarazel.de
Lists: pgsql-committers pgsql-hackers

On November 4, 2017 1:22:04 AM GMT+05:30, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> wrote:
>Peter Geoghegan wrote:
>> Andres Freund <andres(at)anarazel(dot)de> wrote:
>
>> > Staring at the vacuumlazy hunk I think I might have found a related bug:
>> > heap_update_tuple() just copies the old xmax to the new tuple's xmax if
>> > a multixact and still running. It does so without verifying liveliness
>> > of members. Isn't that buggy? Consider what happens if we have three
>> > blocks: 1 has free space, two is being vacuumed and is locked, three is
>> > full and has a tuple that's key share locked by a live tuple and is
>> > updated by a dead xmax from before the xmin horizon. In that case afaict
>> > the multi will be copied from the third page to the first one. Which is
>> > quite bad, because vacuum already processed it, and we'll set
>> > relfrozenxid accordingly. I hope I'm missing something here?
>>
>> Can you be more specific about what you mean here? I think that I
>> understand where you're going with this, but I'm not sure.
>
>He means that the tuple that heap_update moves to page 1 (which will no
>longer be processed by vacuum) will contain a multixact that's older
>than relminmxid -- because it is copied unchanged by heap_update instead
>of properly checking against age limit.

Right. That, or a member xid below relfrozenxid. I think both scenarios are possible.
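
To make the two scenarios concrete, here is a toy sketch of the kind of check the copy path appears to be missing. It is not PostgreSQL code: ToyXid, ToyMultiXid, ToyMulti, toy_precedes() and toy_multi_safe_to_copy() are hypothetical names, the multixact is modeled as a plain array of member xids, and the comparison is a simplified wraparound-aware test that ignores special xids. It assumes the relevant limits are the table's multixact horizon (relminmxid) and its xid horizon (relfrozenxid).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for TransactionId and MultiXactId. */
typedef uint32_t ToyXid;
typedef uint32_t ToyMultiXid;

/* Toy model of a multixact: an id plus its member xids. */
typedef struct ToyMulti
{
    ToyMultiXid   id;
    const ToyXid *members;
    size_t        nmembers;
} ToyMulti;

/*
 * Simplified wraparound-aware "a is older than b" (modulo 2^32).
 * Unlike the real comparison routines, this ignores special xids.
 */
static bool
toy_precedes(uint32_t a, uint32_t b)
{
    return (int32_t) (a - b) < 0;
}

/*
 * Returns true only if an xmax multixact could be carried over to a new
 * tuple version unchanged: the multi itself must not be older than
 * relminmxid, and no member xid may be older than relfrozenxid.
 */
static bool
toy_multi_safe_to_copy(const ToyMulti *multi,
                       ToyXid relfrozenxid,
                       ToyMultiXid relminmxid)
{
    assert(multi != NULL);

    /* Scenario 1: the multixact id is below the table's multi horizon. */
    if (toy_precedes(multi->id, relminmxid))
        return false;

    /* Scenario 2: some member xid is below the table's xid horizon. */
    for (size_t i = 0; i < multi->nmembers; i++)
    {
        if (toy_precedes(multi->members[i], relfrozenxid))
            return false;
    }

    return true;
}
```

If either test fails, copying the old xmax verbatim onto a page vacuum has already passed would leave a value behind the horizons vacuum is about to advertise, which is the problem described above. What the real code should do instead in that case is a separate question and is not shown here.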

Andres
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.
