Re: logical replication empty transactions

From: Ajin Cherian <itsajin(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Peter Smith <smithpb2250(at)gmail(dot)com>, "osumi(dot)takamichi(at)fujitsu(dot)com" <osumi(dot)takamichi(at)fujitsu(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Rahila Syed <rahila(dot)syed(at)2ndquadrant(dot)com>, Euler Taveira <euler(at)timbira(dot)com(dot)br>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Craig Ringer <craig(at)2ndquadrant(dot)com>
Subject: Re: logical replication empty transactions
Date: 2021-08-13 11:00:58
Message-ID: CAFPTHDa1ydj7ofc3xgvu6gvZWbvS=_GNpFqb20HygBBEJ6Zfxw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Aug 2, 2021 at 7:20 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Fri, Jul 23, 2021 at 3:39 PM Ajin Cherian <itsajin(at)gmail(dot)com> wrote:
> >
>
> Let's first split the patch for prepared and non-prepared cases as
> that will help to focus on each of them separately. BTW, why haven't
> you considered implementing point 1b as explained by Andres in his
> email [1]? I think we can send a keepalive message in case of
> synchronous replication when we skip an empty transaction, otherwise,
> it might delay in responding to transactions synchronous_commit mode.
> I think in the tests done in the thread, it might not have been shown
> because we are already sending keepalives too frequently. But what if
> someone disables wal_sender_timeout or kept it to a very large value?
> See WalSndKeepaliveIfNecessary. The other thing you might want to look
> at is if the reason for frequent keepalives is the same as described
> in the email [2].
>

I have tried to address the comment here by modifying the
ctx->update_progress callback function (WalSndUpdateProgress) provided
for plugins. I have added an option
by which the callback can specify if it wants to send keep_alives. And
when the callback is called with that option set, walsender updates a
flag force_keep_alive_syncrep.
The Walsender in the WalSndWaitForWal for loop, checks this flag and
if synchronous replication is enabled, then sends a keep alive.
Currently this logic
is added as an else to the current logic that is already there in
WalSndWaitForWal, which is probably considered unnecessary and a
source of the keep alive flood
that you talked about. So, I can change that according to how that fix
shapes up there. I have also added an extern function in syncrep.c
that makes it possible
for walsender to query if synchronous replication is turned on.

The reason I had to turn on a flag and rely on the WalSndWaitForWal to
send the keep alive in its next iteration is because I tried doing
this directly when a
commit is skipped but it didn't work. The reason for this is that when
the commit is being decoded the sentptr at the moment is at the commit
LSN and the keep alive
will be sent for the commit LSN but the syncrep wait is waiting for
end_lsn of the transaction which is the next LSN. So, sending a keep
alive at the moment the
commit is decoded doesn't seem to solve the problem of the waiting
synchronous reply.

> Few other miscellaneous comments:
> 1.
> static void
> pgoutput_commit_prepared_txn(LogicalDecodingContext *ctx,
> ReorderBufferTXN *txn,
> - XLogRecPtr commit_lsn)
> + XLogRecPtr commit_lsn, XLogRecPtr prepare_end_lsn,
> + TimestampTz prepare_time)
> {
> + PGOutputTxnData *txndata = (PGOutputTxnData *) txn->output_plugin_private;
> +
> OutputPluginUpdateProgress(ctx);
>
> + /*
> + * If the BEGIN PREPARE was not yet sent, then it means there were no
> + * relevant changes encountered, so we can skip the COMMIT PREPARED
> + * message too.
> + */
> + if (txndata)
> + {
> + bool skip = !txndata->sent_begin_txn;
> + pfree(txndata);
> + txn->output_plugin_private = NULL;
>
> How is this supposed to work after the restart when prepared is sent
> before the restart and we are just sending commit_prepared after
> restart? Won't this lead to sending commit_prepared even when the
> corresponding prepare is not sent? Can we think of a better way to
> deal with this?
>

I have tried to resolve this by adding logic in worker,c to silently
ignore spurious commit_prepareds. But this change required checking if
the prepare exists on the
subscriber before attempting the commit_prepared but the current API
that checks this requires prepare time and transaction end_lsn. But
for this I had to
change the protocol of commit_prepared, and I understand that this
would break backward compatibility between subscriber and publisher
(you have raised this issue as well).
I am not sure how else to handle this, let me know if you have any
other ideas. One option could be to have another API to check if the
prepare exists on the subscriber with
the prepared 'gid' alone, without checking prepare_time or end_lsn.
Let me know if this idea works.

I have left out the patch 0002 for prepared transactions until we
arrive at a decision on how to address the above issue.

Peter,
I have also addressed the comments you've raised on patch 0001, please
have a look and confirm.

Regards,
Ajin Cherian
Fujitsu Australia.

Attachment Content-Type Size
v13-0001-Skip-empty-transactions-for-logical-replication.patch application/octet-stream 15.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ranier Vilela 2021-08-13 11:12:59 Re: Bug in huge simplehash
Previous Message Ram Charan Kallem 2021-08-13 10:20:41 RE: Multiple Postgres process are running in background