Re: StandbyRecoverPreparedTransactions recovers subtrans links incorrectly

From: Andres Freund <andres(at)anarazel(dot)de>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: StandbyRecoverPreparedTransactions recovers subtrans links incorrectly
Date: 2017-04-24 18:59:44
Message-ID: 20170424185944.bjpb3yizeybv75hd@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2017-04-24 13:29:11 +0100, Simon Riggs wrote:
> On 24 April 2017 at 00:25, Andres Freund <andres(at)anarazel(dot)de> wrote:
> > if the subxid->xid mapping doesn't actually exist - as it's the case
> > with this bug afaics - we'll not get the correct toplevel
> > transaction.
>
> The nature of the corruption is that in some cases
> * a subxid will point to nothing (even though in most cases it was
> already set correctly)
> * the parent will point to a subxid

Right. Those cases aren't that different from the point of trying to
find the parent of an subxid.

> > Which'll mean the following block:
> > /*
> > * We now have either a top-level xid higher than xmin or an
> > * indeterminate xid. We don't know whether it's top level or subxact
> > * but it doesn't matter. If it's present, the xid is visible.
> > */
> > for (j = 0; j < snapshot->subxcnt; j++)
> > {
> > if (TransactionIdEquals(xid, snapshot->subxip[j]))
> > return true;
> > }
> > won't work correctly if suboverflowed.
>
> Your example of snapshots taken during recovery is not correct.

Oh?

> Note that SubTransGetTopmostTransaction() returns a valid, running
> xid, even though it is the wrong one.

Sure.

> Snapshots work differently on standbys - we store all known running
> xids, so the test still passes correctly, even when overflowed.

I don't think that's generally true. Isn't that precisely what
ProcArrayStruct->lastOverflowedXid is about? If we have a snapshot
that's suboverflowed due to the lastOverflowedXid cutoff, then we the
subxip array does *not* contain all known running xids anymore, we rely
on pg_subtrans to only guarantee that toplevel xids are stored in the
KnownAssignedXids machinery.

See:
* When we throw away subXIDs from KnownAssignedXids, we need to keep track of
* that, similarly to tracking overflow of a PGPROC's subxids array. We do
* that by remembering the lastOverflowedXID, ie the last thrown-away subXID.
* As long as that is within the range of interesting XIDs, we have to assume
* that subXIDs are missing from snapshots. (Note that subXID overflow occurs
* on primary when 65th subXID arrives, whereas on standby it occurs when 64th
* subXID arrives - that is not an error.)

/*
* Highest subxid that has been removed from KnownAssignedXids array to
* prevent overflow; or InvalidTransactionId if none. We track this for
* similar reasons to tracking overflowing cached subxids in PGXACT
* entries. Must hold exclusive ProcArrayLock to change this, and shared
* lock to read it.
*/
TransactionId lastOverflowedXid;

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2017-04-24 19:13:16 Re: to-do item for explain analyze of hash aggregates?
Previous Message Andres Freund 2017-04-24 18:52:13 Re: to-do item for explain analyze of hash aggregates?