Re: logical replication empty transactions

From: Ajin Cherian <itsajin(at)gmail(dot)com>
To: "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>
Cc: Peter Smith <smithpb2250(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, "osumi(dot)takamichi(at)fujitsu(dot)com" <osumi(dot)takamichi(at)fujitsu(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Rahila Syed <rahila(dot)syed(at)2ndquadrant(dot)com>, Euler Taveira <euler(at)timbira(dot)com(dot)br>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Craig Ringer <craig(at)2ndquadrant(dot)com>
Subject: Re: logical replication empty transactions
Date: 2022-03-02 13:40:00
Message-ID: CAFPTHDb5JExY2zgx-_6EfBUoeBs5LOxYAH-TT3AxrH8FiSL9UQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Mar 2, 2022 at 1:01 PM shiy(dot)fnst(at)fujitsu(dot)com
<shiy(dot)fnst(at)fujitsu(dot)com> wrote:
>
> 4.
> @@ -1617,9 +1829,21 @@ pgoutput_stream_prepare_txn(LogicalDecodingContext *ctx,
> ReorderBufferTXN *txn,
> XLogRecPtr prepare_lsn)
> {
> + PGOutputTxnData *txndata = txn->output_plugin_private;
> + bool sent_begin_txn = txndata->sent_begin_txn;
> +
> Assert(rbtxn_is_streamed(txn));
>
> - OutputPluginUpdateProgress(ctx);
> + pfree(txndata);
> + txn->output_plugin_private = NULL;
> +
> + if (!sent_begin_txn)
> + {
> + elog(DEBUG1, "Skipping replication of an empty transaction in stream prepare");
> + return;
> + }
> +
> + OutputPluginUpdateProgress(ctx, false);
> OutputPluginPrepareWrite(ctx, true);
> logicalrep_write_stream_prepare(ctx->out, txn, prepare_lsn);
> OutputPluginWrite(ctx, true);
>
> I notice that the patch skips stream prepared transaction, this would cause an
> error on subscriber side when committing this transaction on publisher side, so
> I think we'd better not do that.
>
> For example:
> (set logical_decoding_work_mem = 64kB, max_prepared_transactions = 10 in
> postgresql.conf)
>
> -- publisher
> create table test (a int, b text, primary key(a));
> create table test2 (a int, b text, primary key(a));
> create publication pub for table test;
>
> -- subscriber
> create table test (a int, b text, primary key(a));
> create table test2 (a int, b text, primary key(a));
> create subscription sub connection 'dbname=postgres port=5432' publication pub with(two_phase=on, streaming=on);
>
> -- publisher
> begin;
> INSERT INTO test2 SELECT i, md5(i::text) FROM generate_series(1, 1000) s(i);
> prepare transaction 't';
> commit prepared 't';
>
> The error message in subscriber log:
> ERROR: prepared transaction with identifier "pg_gid_16391_722" does not exist
>

Thanks for the test. I guess this mixed streaming+two-phase runs into
the same problem that
was there while skipping two-phased transactions. If the eventual
commit prepared comes after a restart,
then there is no way of knowing if the original transaction was
skipped or not and we can't know if the commit prepared
needs to be sent. I tried not skipping the "stream prepare", but that
causes a crash in the apply worker
as it tries to find the non-existent streamed file. We could add logic
to silently ignore a spurious "stream prepare"
but that might not be ideal. Any thoughts on how to address this? Or
else, we will need to avoid skipping streamed
transactions as well.

regards,
Ajin Cherian
Fujitsu Australia

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Aleksander Alekseev 2022-03-02 13:44:19 Re: Changing "Hot Standby" to "hot standby"
Previous Message Aleksander Alekseev 2022-03-02 13:25:33 Re: Add 64-bit XIDs into PostgreSQL 15