Re: repeated decoding of prepared transactions

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Markus Wanner <markus(dot)wanner(at)enterprisedb(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Ajin Cherian <itsajin(at)gmail(dot)com>, Simon Riggs <simon(at)enterprisedb(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Petr Jelinek <petr(dot)jelinek(at)enterprisedb(dot)com>
Subject: Re: repeated decoding of prepared transactions
Date: 2021-02-09 03:02:42
Message-ID: CAA4eK1J1vD65Y9xsue_nsrJvm5erMU2PKFZu81PBeD+Z30Rbbw@mail.gmail.com

On Mon, Feb 8, 2021 at 8:36 PM Markus Wanner
<markus(dot)wanner(at)enterprisedb(dot)com> wrote:
>
> Hello Amit,
>
> thanks for your very quick response.
>
> On 08.02.21 11:13, Amit Kapila wrote:
> > /*
> > * It is possible that this transaction is not decoded at prepare time
> > * either because by that time we didn't have a consistent snapshot or it
> > * was decoded earlier but we have restarted. We can't distinguish between
> > * those two cases so we send the prepare in both the cases and let
> > * downstream decide whether to process or skip it. We don't need to
> > * decode the xact for aborts if it is not done already.
> > */
>
> The way I read the surrounding code, the only case in which a 2PC
> transaction does not get decoded at prepare time is if the transaction
> is empty. Or are you aware of any other situation that might currently
> happen?
>

We also skip decoding at prepare time if we haven't reached a
consistent snapshot by that time. See below code in DecodePrepare().
DecodePrepare()
{
    ..
    /* We can't start streaming unless a consistent state is reached. */
    if (SnapBuildCurrentState(builder) < SNAPBUILD_CONSISTENT)
    {
        ReorderBufferSkipPrepare(ctx->reorder, xid);
        return;
    }
    ..
}

There are other reasons as well, such as the output plugin not wanting
to allow decoding at prepare time, but I don't think those are relevant
to the discussion here.
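
To make the snapshot case concrete, here is a minimal standalone sketch
of the decision, not the actual PostgreSQL code. All Toy* names are
hypothetical and only mirror the idea: a prepare seen before the
snapshot is consistent is skipped, so at commit prepared time the
transaction contents and the prepare still have to be sent.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the snapshot builder states. */
typedef enum ToySnapState
{
    TOY_SNAP_START,
    TOY_SNAP_BUILDING,
    TOY_SNAP_FULL,
    TOY_SNAP_CONSISTENT
} ToySnapState;

/* Hypothetical per-transaction bookkeeping for one decoding session. */
typedef struct ToyPreparedXact
{
    bool prepare_seen;  /* a PREPARE record was read in this session */
    bool prepare_sent;  /* ... and it was passed on to the downstream */
} ToyPreparedXact;

/* PREPARE record: skip it unless the snapshot is already consistent. */
void
toy_on_prepare(ToyPreparedXact *x, ToySnapState snap)
{
    assert(x != NULL);
    x->prepare_seen = true;
    x->prepare_sent = (snap >= TOY_SNAP_CONSISTENT);
}

/*
 * COMMIT PREPARED record: returns true if the transaction contents and
 * the prepare must be sent now because they were not sent earlier.
 */
bool
toy_on_commit_prepared(const ToyPreparedXact *x)
{
    assert(x != NULL);
    return !x->prepare_sent;
}
```

With this model, a prepare handled while the state is TOY_SNAP_BUILDING
leaves prepare_sent false, so toy_on_commit_prepared() returns true. A
freshly restarted session that has no record of having sent the prepare
looks the same, which is why the two cases cannot be told apart.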

> > (unless the server needs to be restarted due to some reason)
>
> Right, the repetition occurs only after a restart of the walsender in
> between a prepare and a commit prepared record.
>
> > That anyway is true without this work as well where restart_lsn can be
> > advanced on commits. We haven't changed anything in that regard.
>
> I did not mean to blame the patch, but merely try to understand some of
> the design decisions behind it.
>
> And as I just learned, even if we managed to avoid the repetition, a
> restarted walsender still needs to see prepared transactions as
> in-progress in its snapshots. So we cannot move forward the restart_lsn
> to after a prepare record (until the final commit or rollback is consumed).
>

Right. Suppose we forgot about prepared transactions and advanced
restart_lsn as soon as we saw the prepare for a transaction. That would
open a window in which we have not actually sent the prepared xact (say,
because the snapshot had not yet reached a consistent state) but have
already moved restart_lsn past it. If the commit for that prepared
transaction then arrives after the snapshot has become consistent, we
would miss sending the transaction contents and the prepare for it. I
think it is for such reasons that we allow restart_lsn to be moved only
once the transaction has finished (committed or rolled back).
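
As a rough illustration of that rule, here is a toy model, not how
PostgreSQL actually computes restart_lsn. ToyLsn, ToyXact and
toy_safe_restart_lsn are hypothetical names. The point is only that a
transaction which is prepared but not yet committed or rolled back
still holds the restart point back.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t ToyLsn;    /* hypothetical stand-in for a WAL position */

typedef struct ToyXact
{
    ToyLsn first_lsn;   /* position of the transaction's first record */
    bool   finished;    /* true only after commit or rollback, not after
                         * a bare PREPARE */
} ToyXact;

/*
 * Returns the furthest position decoding may restart from: the start of
 * the oldest unfinished transaction, or current_lsn if there is none.
 */
ToyLsn
toy_safe_restart_lsn(const ToyXact *xacts, size_t n, ToyLsn current_lsn)
{
    ToyLsn restart = current_lsn;

    for (size_t i = 0; i < n; i++)
    {
        assert(xacts[i].first_lsn <= current_lsn);
        if (!xacts[i].finished && xacts[i].first_lsn < restart)
            restart = xacts[i].first_lsn;
    }
    return restart;
}
```

For example, with one finished transaction starting at 100 and one
prepared-but-unfinished transaction starting at 200, the restart point
stays at 200 however far current_lsn has advanced. It moves forward only
after the second transaction is marked finished.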

--
With Regards,
Amit Kapila.
