Re: Failed recovery with new faster 2PC code

From: Nikhil Sontakke <nikhils(at)2ndquadrant(dot)com>
To: Stas Kelvich <s(dot)kelvich(at)postgrespro(dot)ru>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Jesper Pedersen <jesper(dot)pedersen(at)redhat(dot)com>
Subject: Re: Failed recovery with new faster 2PC code
Date: 2017-04-18 06:39:45
Message-ID: CAMGcDxeka9UssT4GuE-4eHmCoYuJ92OcqebcnYzPzTNOB3YS6g@mail.gmail.com
Lists: pgsql-hackers

On 17 April 2017 at 15:02, Nikhil Sontakke <nikhils(at)2ndquadrant(dot)com> wrote:

>
>
>> >> commit 728bd991c3c4389fb39c45dcb0fe57e4a1dccd71
>> >> Author: Simon Riggs <simon(at)2ndQuadrant(dot)com>
>> >> Date: Tue Apr 4 15:56:56 2017 -0400
>> >>
>> >> Speedup 2PC recovery by skipping two phase state files in normal path
>> >
>> > Thanks Jeff for your tests.
>> >
>> > So that's now two crash bugs in as many days and lack of clarity about
>> > how to fix it.
>> >
>>
>
> The issue seems to be that a prepared transaction is yet to be committed.
> But autovacuum comes in and causes the clog to be truncated beyond this
> prepared transaction ID in one of the runs.

We only add the corresponding pgproc entry for a surviving 2PC transaction
on completion of recovery. So there could be a race condition here. I am
digging in further.
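
To illustrate the suspected ordering problem, here is a toy model in Python.
It is only a sketch under simplifying assumptions, not PostgreSQL code: the
class, its fields, and the function names are all hypothetical, and the
truncation rule is reduced to "drop commit status below the oldest registered
xid". It shows why truncation that runs before the prepared transaction is
registered could lose the status that transaction still needs.

```python
class ToyCluster:
    """Toy model of the suspected race; none of these names are PostgreSQL APIs."""

    def __init__(self, next_xid):
        self.next_xid = next_xid
        self.clog_oldest = 0        # lowest xid whose commit status is still kept
        self.proc_xids = set()      # xids visible to the horizon computation
        self.prepared_xids = set()  # prepared transactions that survived the crash

    def oldest_registered_xid(self):
        # Horizon: the oldest xid that any registered entry still needs.
        return min(self.proc_xids, default=self.next_xid)

    def truncate_clog(self):
        # Stand-in for vacuum-driven truncation: drop status below the horizon.
        self.clog_oldest = max(self.clog_oldest, self.oldest_registered_xid())

    def register_prepared(self):
        # Stand-in for adding entries for the surviving prepared transactions.
        self.proc_xids |= self.prepared_xids

    def status_available(self, xid):
        return xid >= self.clog_oldest


def run(register_before_truncate):
    """Return True if the prepared xid's status survives the truncation."""
    cluster = ToyCluster(next_xid=200)
    cluster.prepared_xids.add(100)
    if register_before_truncate:
        cluster.register_prepared()
        cluster.truncate_clog()
    else:
        # Suspected bad ordering: truncation cannot see the prepared xid yet.
        cluster.truncate_clog()
        cluster.register_prepared()
    return cluster.status_available(100)
```

In this model, run(True) keeps the status for xid 100, while run(False) loses
it, because the horizon is computed before the prepared transaction becomes
visible.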

Regards,
Nikhils
--
Nikhil Sontakke http://www.2ndQuadrant.com/
PostgreSQL/Postgres-XL Development, 24x7 Support, Training & Services
