Re: Bug in two-phase transaction recovery

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Stas Kelvich <s(dot)kelvich(at)postgrespro(dot)ru>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Simon Riggs <simon(at)2ndquadrant(dot)com>
Subject: Re: Bug in two-phase transaction recovery
Date: 2016-09-08 06:43:06
Message-ID: CAB7nPqQDOuQCFEXjHa_GQVQ8tRyjA=c63sVSf7dMR_9NUt0TnA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Sep 7, 2016 at 10:48 PM, Stas Kelvich <s(dot)kelvich(at)postgrespro(dot)ru> wrote:
> Some time ago two-phase state file format was changed to have variable size GID,
> but several places that read that files were not updated to use new offsets. Problem
> exists in master and 9.6 and can be reproduced on prepared transactions with
> savepoints.

Oops and meh. This meritates an open item, and has better be fixed by
9.6.0. I am glad you noticed that honestly. And we had better take
care of this issue as soon as possible.

> For example:
>
> create table t(id int);
> begin;
> insert into t values (42);
> savepoint s1;
> insert into t values (43);
> prepare transaction 'x';
> begin;
> insert into t values (142);
> savepoint s1;
> insert into t values (143);
> prepare transaction 'y’;
>
> and restart the server. Startup process will hang. Fix attached.

In crash recovery this is very likely to fail with an assertion if
those are enabled:
TRAP: FailedAssertion("!(TransactionIdFollows(subxid, xid))", File:
"twophase.c", Line: 1767)
And the culprit is e0694cf9, which makes this open item owned by Simon.

I also had a look at the patch you are proposing, and I think that's a
correct fix. I also looked at the other code paths scanning the 2PC
state file and did not notice extra problems. The good news is that
the 2PC file generation is not impacted, only its scan at recovery is,
so the impact of the bug is limited for existing 9.6 deployments where
both 2PC and subtransactions are involved.

> Also while looking at StandbyRecoverPreparedTransactions() i’ve noticed that buffer
> for 2pc file is allocated in TopMemoryContext but never freed. That probably exists
> for a long time.

@@ -1886,6 +1886,8 @@ StandbyRecoverPreparedTransactions(bool overwriteOK)
Assert(TransactionIdFollows(subxid, xid));
SubTransSetParent(xid, subxid, overwriteOK);
}
+
+ pfree(buf);
}
This one is a good catch. I have checked also the other callers of
ReadTwoPhaseFile but I am not seeing any other leak. That's a leak,
not critical though so applying it only on HEAD would be enough IMO.
--
Michael

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Haribabu Kommi 2016-09-08 06:46:11 Re: Truncating/vacuuming relations on full tablespaces
Previous Message Михаил Бахтерев 2016-09-08 06:39:31 Re: GiST penalty functions [PoC]