Re: pgsql: snapshot scalability: cache snapshots using a xact completion co

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, pgsql-committers(at)lists(dot)postgresql(dot)org
Subject: Re: pgsql: snapshot scalability: cache snapshots using a xact completion co
Date: 2020-08-18 23:45:32
Message-ID: 20200818234532.uiafo5br5lo6zhya@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-committers pgsql-hackers

On 2020-08-18 13:28:05 -0700, Andres Freund wrote:
> Hi,
>
> On 2020-08-18 01:21:17 -0400, Tom Lane wrote:
> > Andres Freund <andres(at)anarazel(dot)de> writes:
> > > I'd written to Tom that I was planning to revert unless the number of
> > > failures were lower than initially indicated. But that actually seems to
> > > have come to pass (the failures are quicker to report because they don't
> > > run the subsequent tests, of course). I'd like to let the failures
> > > accumulate a bit longer, say until tomorrow Midday if I haven't figured
> > > it out by then. With the hope of finding some detail to help pinpoint
> > > the issue.
> >
> > There's certainly no obvious pattern here, so I agree with waiting for
> > more data.
>
> FWIW, I think I have found the bug, but I'm still working to reproduce
> the issue reliably enough that I can verify that the fix actually works.
>
> The issue is basically that 2PC PREPARE is weird, WRT procarray. The
> last snapshot built with GetSnapshotData() before the PREPARE doesn't
> include its own transaction in ->xip[], as normal. PrepareTransaction()
> removes the "normal" entry with ProcArrayClearTransaction(), which so
> far doesn't increase the xact completion count. Because the xact
> completion count is not increased, snapshots can be reused as long as
> they're taken before the 2PC transaction is finished. That's fine for
> other backends, but for the backend doing the PrepareTransaction() it's
> not, because there ->xip doesn't include the own backend.
>
> It's a bit tricky to reproduce exactly the issue the BF is occasionally
> hitting, because the way ->xmax is computed *limits* the
> damage. Combined with the use of SERIALIZABLE (preventing recomputation
> of the data snapshot) that makes it somewhat hard to hit.

I pushed a fix. After a while I figured out that it's not actually that
hard to test reliably. But it does require multiple sessions
interacting, particularly another session needs to acquire and commit a
transaction id that's later than the prepared transaction's.

I think it's worth adding an isolation test. But it doesn't seem like
extending prepared-transactions.spec makes too much sense, it doesn't
fit in well. It's a lot easier to reproduce the issue without
SERIALIZABLE, for example. Generally the file seems more about
serializable than 2PC...

So unless somebody disagrees I'm gonna add a new
prepared-transactions-2.spec.

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-committers by date

  From Date Subject
Next Message Andres Freund 2020-08-18 23:51:52 Re: pgsql: Fix race condition in snapshot caching when 2PC is used.
Previous Message Tom Lane 2020-08-18 23:43:24 Re: pgsql: Fix race condition in snapshot caching when 2PC is used.

Browse pgsql-hackers by date

  From Date Subject
Next Message k.jamison@fujitsu.com 2020-08-19 00:20:50 RE: please update ps display for recovery checkpoint
Previous Message Justin Pryzby 2020-08-18 22:52:39 please update ps display for recovery checkpoint