Re: StandbyRecoverPreparedTransactions recovers subtrans links incorrectly

From: Nikhil Sontakke <nikhils(at)2ndquadrant(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: StandbyRecoverPreparedTransactions recovers subtrans links incorrectly
Date: 2017-04-26 14:00:50
Message-ID: CAMGcDxfLxpb4iTQVeeJ3s=r4ZEtaHS5L4f_gj8CZtJ_69PcBzw@mail.gmail.com

> I'm suggesting we take the approach that if there is a problem we can
> recreate it as a way of exploring what conditions are required and
> therefore work out the impact. Nikhil Sontakke appears to have
> re-created something, but not quite what I had expected. I think he
> will post here tomorrow with an update for us to discuss.
>
>

So, I reverted commit 0874d4f3e183757ba15a4b3f3bf563e0393dd9c2 to go back
to the earlier code, in which the swapped arguments to SubTransSetParent
produced incorrect parent linkages, and ran the attached TAP test patch
against it.

The test prepares a 2PC transaction with more than 64 subtransactions. It
then stops the master and promotes the standby.

A SELECT query on the newly promoted master against any of the tables
involved in the 2PC hangs. The hang is due to an infinite loop in
SubTransGetTopmostTransaction(): because of the incorrect linkages, we get
a circular reference parentxid <-> subxid, so the walk up to the topmost
parent never terminates.
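
To make the circular reference concrete, here is a toy model of the parent
map, not the actual pg_subtrans code. All names in it (set_parent, topmost,
build_links, the xid values) are hypothetical. It also assumes that the
correct subxid -> parent links already exist before the prepared transaction
is recovered, so that a swapped call adds the reverse link; that ordering is
an assumption of this sketch, not something verified here. The walk is
bounded so that the cycle is reported instead of hanging.

```python
INVALID_XID = 0  # stand-in for "no parent recorded"


def set_parent(parents, xid, parent):
    """Record 'parent' as the parent of 'xid' in the toy parent map."""
    parents[xid] = parent


def topmost(parents, xid, max_steps=1000):
    """Follow parent links upward until an xid with no parent is reached.

    The real loop has no step limit; the bound here only turns a hang
    into an exception so the sketch can be run safely.
    """
    current = xid
    for _ in range(max_steps):
        parent = parents.get(current, INVALID_XID)
        if parent == INVALID_XID:
            return current
        current = parent
    raise RuntimeError("parent links form a cycle")


def build_links(top_xid, subxids, swapped):
    """Build the parent map for one prepared transaction."""
    parents = {}
    # Assumption: correct links were recorded earlier.
    for sub in subxids:
        set_parent(parents, sub, top_xid)
    # Recovery of the prepared transaction applies the links again.
    for sub in subxids:
        if swapped:
            # Swapped arguments: the top-level xid gets a subxid as parent.
            set_parent(parents, top_xid, sub)
        else:
            set_parent(parents, sub, top_xid)
    return parents


if __name__ == "__main__":
    top = 100
    subs = list(range(101, 166))  # 65 subtransactions, i.e. more than 64
    print(topmost(build_links(top, subs, swapped=False), subs[-1]))
    try:
        topmost(build_links(top, subs, swapped=True), subs[-1])
    except RuntimeError as exc:
        print("swapped:", exc)
```

With the correct argument order every subxid resolves to the top-level xid.
With the swapped order the top-level xid points back at a subxid, and the
upward walk bounces between the two without ever finding a topmost entry.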

Any further DML on these objects that needs to check the visibility of
these tuples hangs as well. All unrelated objects and new transactions are
OK AFAICS.

I do not see any data loss, which is good. However, the tables involved in
the 2PC remain inaccessible until after a hard restart.

The attached TAP test patch can be considered for commit, to test handling
of 2PC transactions with a large number of subtransactions on promoted
standby instances.

Regards,
Nikhils
--
Nikhil Sontakke http://www.2ndQuadrant.com/
PostgreSQL/Postgres-XL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
subxid_bug_with_test_case_v1.0.patch application/octet-stream 3.8 KB
