RE: logical replication empty transactions

From: "osumi(dot)takamichi(at)fujitsu(dot)com" <osumi(dot)takamichi(at)fujitsu(dot)com>
To: 'Ajin Cherian' <itsajin(at)gmail(dot)com>
Cc: Peter Smith <smithpb2250(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Rahila Syed <rahila(dot)syed(at)2ndquadrant(dot)com>, Euler Taveira <euler(at)timbira(dot)com(dot)br>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Craig Ringer <craig(at)2ndquadrant(dot)com>
Subject: RE: logical replication empty transactions
Date: 2022-02-16 03:15:20
Message-ID: TYCPR01MB837337FF0019ED0AAA7A56D9ED359@TYCPR01MB8373.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi

I'll quote one other remaining discussion of this thread again
to invoke more attentions from the community.
On Friday, August 13, 2021 8:01 PM Ajin Cherian <itsajin(at)gmail(dot)com> wrote:
> On Mon, Aug 2, 2021 at 7:20 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
> wrote:
> > Few other miscellaneous comments:
> > 1.
> > static void
> > pgoutput_commit_prepared_txn(LogicalDecodingContext *ctx,
> > ReorderBufferTXN *txn,
> > - XLogRecPtr commit_lsn)
> > + XLogRecPtr commit_lsn, XLogRecPtr prepare_end_lsn, TimestampTz
> > + prepare_time)
> > {
> > + PGOutputTxnData *txndata = (PGOutputTxnData *)
> txn->output_plugin_private;
> > +
> > OutputPluginUpdateProgress(ctx);
> >
> > + /*
> > + * If the BEGIN PREPARE was not yet sent, then it means there were no
> > + * relevant changes encountered, so we can skip the COMMIT PREPARED
> > + * message too.
> > + */
> > + if (txndata)
> > + {
> > + bool skip = !txndata->sent_begin_txn; pfree(txndata);
> > + txn->output_plugin_private = NULL;
> >
> > How is this supposed to work after the restart when prepared is sent
> > before the restart and we are just sending commit_prepared after
> > restart? Won't this lead to sending commit_prepared even when the
> > corresponding prepare is not sent? Can we think of a better way to
> > deal with this?
> >
>
> I have tried to resolve this by adding logic in worker,c to silently ignore spurious
> commit_prepareds. But this change required checking if the prepare exists on
> the subscriber before attempting the commit_prepared but the current API that
> checks this requires prepare time and transaction end_lsn. But for this I had to
> change the protocol of commit_prepared, and I understand that this would
> break backward compatibility between subscriber and publisher (you have
> raised this issue as well).
> I am not sure how else to handle this, let me know if you have any other ideas.
I feel if we don't want to change the protocol of commit_prepared,
we need to make the publisher solely judge whether the prepare was empty or not,
after the restart.

One idea I thought at the beginning was to utilize and apply
the existing mechanism to spill ReorderBufferSerializeTXN object to local disk,
by postponing the prepare txn object cleanup and when the walsender exits
and commit prepared didn't come, spilling the transaction's data,
then restoring it after the restart in the DecodePrepare.
However, this idea wasn't crash-safe fundamentally. It means,
if the publisher crashes before spilling the empty prepare transaction,
we fail to detect the prepare was empty and come down to send the commit_prepared
in the situation where the subscriber didn't get the prepare data again.
So, I thought to utilize the spill mechanism didn't work for this purpose.

Another idea would be, to create an empty file under the the pg_replslot/slotname
with a prefix different from "xid" in the DecodePrepare before the shutdown
if the prepare was empty, and bypass the cleanup of the serialized txns
and check the existence after the restart. But, this is pretty ad-hoc and I wasn't sure
if to address the corner case of the restart has the strong enough justification
to create this new file format.

Therefore, in my humble opinion, the idea of protocol change slightly wins,
since the impact of the protocol change would not be big. We introduced
the protocol version 3 in the devel version and the number of users should be little.

Best Regards,
Takamichi Osumi

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bharath Rupireddy 2022-02-16 03:24:36 Re: pg_walinspect - a new extension to get raw WAL data and WAL stats
Previous Message John Naylor 2022-02-16 03:14:47 Re: initdb / bootstrap design