Re: StandbyRecoverPreparedTransactions recovers subtrans links incorrectly

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: StandbyRecoverPreparedTransactions recovers subtrans links incorrectly
Date: 2017-04-23 17:33:21
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> On 23 April 2017 at 00:55, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> It's not clear to me how much potential this has to create user data
>> corruption, but it doesn't look good at first glance. Discuss.

> SubTransSetParent() only matters for the case where we are half-way
> through a commit with a large commit. Since we do atomic updates of
> commit and subcommmit when on same page, this problem would only
> matter when top level xid and other subxids were separated across
> multiple pages and the issue would only affect a read only query
> checking visibility during the commit for that prepared transaction.
> Furthermore, the nature of the corruption is that we set the xid to
> point to the subxid; since we never mark a top-level commit as
> subcommitted, subtrans would never be consulted and so the corruption
> would not lead to any incorrect answer even during the window of
> exposure. So it looks to me like this error shouldn't cause a problem.

Still trying to wrap my head around this argument ... I agree that
incorrectly setting the parent's pg_subtrans entry can't cause a
visible problem, because it will never be consulted. However, failing
to set the child's entry seems like it could cause transient problems.
There would only be a risk during the eventual commit of the prepared
transaction, and only when the pg_xact entries span multiple pages,
but then there would be a window where the child xact is marked
subcommitted but it has a zero entry in pg_subtrans. That would
result in a WARNING from TransactionIdDidCommit, and a false result,
which depending on timing might mean that commit of the overall
transaction appears non-atomic to onlookers. (That is, the parent
xact might already appear committed to them, but the subtransaction
not yet.)

I can't find any indication in the archives that anyone's ever reported
seeing that WARNING, though that might only mean that they'd not noticed
it. But in any case, it seems like the worst consequence is a narrow
window for seeing non-atomic commit of a prepared transaction on a standby
server. That seems pretty unlikely to cause real-world problems.

The other point about overwriteOK not being set when it needs to be
also seems like it could not cause any problems in production builds,
since that flag is only examined by an Assert.

regards, tom lane

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Simon Riggs 2017-04-23 17:41:41 Re: StandbyRecoverPreparedTransactions recovers subtrans links incorrectly
Previous Message Nikolay Shaplov 2017-04-23 17:19:38 Re: pgbench tap tests & minor fixes