Re: [HACKERS] logical decoding of two-phase transactions

From: Ajin Cherian <itsajin(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Peter Smith <smithpb2250(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] logical decoding of two-phase transactions
Date: 2020-11-27 13:05:04
Message-ID: CAFPTHDZKRBQSywbHDPM7FKRsb5NnurZ_xXf1qkVdt4ddx+1F5A@mail.gmail.com
Lists: pgsql-hackers

On Thu, Nov 26, 2020 at 10:43 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:

> I think what you need to do to reproduce this is to follow the
> snapshot machinery in SnapBuildFindSnapshot. Basically, first, start a
> transaction (say transaction-id is 500) and do some operations but
> don't commit. Here, if you create a slot (via subscription or
> otherwise), it will wait for 500 to complete and make the state as
> SNAPBUILD_BUILDING_SNAPSHOT. Here, you can commit 500 and then having
> debugger in that state, start another transaction (say 501), do some
> operations but don't commit. Next time when you reach this function,
> it will change the state to SNAPBUILD_FULL_SNAPSHOT and wait for 501,
> now you can start another transaction (say 502) which you can prepare
> but don't commit. Again start one more transaction 503, do some ops,
> commit both 501 and 503. At this stage somehow we need to ensure that
> an XLOG_RUNNING_XACTS record is written. Then commit prepared 502. Now, I think you
> should notice that the consistent point is reached after 502's prepare
> and before its commit. Now, this is just a theoretical scenario, you
> need something on these lines and probably a way to force
> XLOG_RUNNING_XACTS WAL (probably via debugger or some other way) at
> the right times to reproduce it.
>
> Thanks for trying to build a test case for this, it is really helpful.

I tried the above steps. I was able to get the builder into the
SNAPBUILD_BUILDING_SNAPSHOT state, but not into the
SNAPBUILD_FULL_SNAPSHOT state.
Instead, the code moves straight from SNAPBUILD_BUILDING_SNAPSHOT to
the SNAPBUILD_CONSISTENT state.

In the function SnapBuildFindSnapshot, either the following check fails:

1327: TransactionIdPrecedesOrEquals(SnapBuildNextPhaseAt(builder),
                                    running->oldestRunningXid)

because SnapBuildNextPhaseAt (which is the same as running->nextXid)
is higher than oldestRunningXid, or, when both are the same, it
hits the condition below, which sits higher up in the code:

1247: if (running->oldestRunningXid == running->nextXid)

and then the builder moves straight into the SNAPBUILD_CONSISTENT
state. At no point will nextXid be less than oldestRunningXid. In
my sessions, I commit multiple txns, hoping to bump
up oldestRunningXid, I run checkpoints, and I have made sure that
XLOG_RUNNING_XACTS records are being inserted.
But while iterating through SnapBuildFindSnapshot with each new
XLOG_RUNNING_XACTS record, oldestRunningXid is incremented
one xid at a time, so it eventually catches up with
running->nextXid and the builder reaches the
SNAPBUILD_CONSISTENT state without entering the SNAPBUILD_FULL_SNAPSHOT state.
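
To make the two outcomes easier to compare, here is a small standalone
model of the state transitions discussed above. It is only a sketch based
on my reading of the two checks quoted in this mail, not PostgreSQL code:
all names (ModelBuilder, ModelRunningXacts, model_process_running_xacts)
are hypothetical, xids are compared as plain integers (no wraparound
handling), and the on-disk snapshot restore, the xmin horizon check and
the waiting on running transactions are left out. It also assumes that
the phase boundary is the nextXid remembered from the record that started
the current phase.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the SnapBuildState values named above. */
typedef enum
{
    MODEL_START,
    MODEL_BUILDING_SNAPSHOT,
    MODEL_FULL_SNAPSHOT,
    MODEL_CONSISTENT
} ModelState;

typedef struct
{
    ModelState  state;
    uint32_t    next_phase_at;  /* assumption: models SnapBuildNextPhaseAt() */
} ModelBuilder;

/* The two fields of an XLOG_RUNNING_XACTS record used in this discussion. */
typedef struct
{
    uint32_t    oldest_running_xid;
    uint32_t    next_xid;
} ModelRunningXacts;

void
model_init(ModelBuilder *b)
{
    b->state = MODEL_START;
    b->next_phase_at = 0;
}

/* Feed one running-xacts record to the model and return the new state. */
ModelState
model_process_running_xacts(ModelBuilder *b, const ModelRunningXacts *r)
{
    if (b->state == MODEL_CONSISTENT)
        return b->state;        /* nothing left to do */

    if (r->oldest_running_xid == r->next_xid)
    {
        /* Nothing was running: jump to consistent from any earlier state. */
        b->state = MODEL_CONSISTENT;
        b->next_phase_at = 0;
    }
    else if (b->state == MODEL_START)
    {
        b->state = MODEL_BUILDING_SNAPSHOT;
        b->next_phase_at = r->next_xid;
    }
    else if (b->state == MODEL_BUILDING_SNAPSHOT &&
             b->next_phase_at <= r->oldest_running_xid)
    {
        /* Plain <= stands in for TransactionIdPrecedesOrEquals(). */
        b->state = MODEL_FULL_SNAPSHOT;
        b->next_phase_at = r->next_xid;
    }
    else if (b->state == MODEL_FULL_SNAPSHOT &&
             b->next_phase_at <= r->oldest_running_xid)
    {
        b->state = MODEL_CONSISTENT;
        b->next_phase_at = 0;
    }

    return b->state;
}
```

In this model, the records (oldest=500, next=501), (oldest=501, next=502),
(oldest=502, next=504) walk through BUILDING_SNAPSHOT, FULL_SNAPSHOT and
CONSISTENT, which matches the scenario with 501 still open and 502 prepared.
The records (oldest=500, next=501), (oldest=501, next=501) go from
BUILDING_SNAPSHOT straight to CONSISTENT, which is what I am seeing. So in
the model, FULL_SNAPSHOT is only reachable if some transaction is still
running when the record that ends the BUILDING_SNAPSHOT phase is written.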

regards,
Ajin Cherian
Fujitsu Australia
