Re: Time delayed LR (WAS Re: logical replication restrictions)

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>
Cc: "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, "andres(at)anarazel(dot)de" <andres(at)anarazel(dot)de>, "vignesh21(at)gmail(dot)com" <vignesh21(at)gmail(dot)com>, "shveta(dot)malik(at)gmail(dot)com" <shveta(dot)malik(at)gmail(dot)com>, "Takamichi Osumi (Fujitsu)" <osumi(dot)takamichi(at)fujitsu(dot)com>, "dilipbalaut(at)gmail(dot)com" <dilipbalaut(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, "euler(at)eulerto(dot)com" <euler(at)eulerto(dot)com>, "m(dot)melihmutlu(at)gmail(dot)com" <m(dot)melihmutlu(at)gmail(dot)com>, "marcos(at)f10(dot)com(dot)br" <marcos(at)f10(dot)com(dot)br>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Time delayed LR (WAS Re: logical replication restrictions)
Date: 2023-02-27 05:40:36
Message-ID: CAD21AoCpf184dPD_n7F6FCn99siCNsiK3Viiu2K6gUzig+wu7Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Feb 23, 2023 at 9:10 PM Hayato Kuroda (Fujitsu)
<kuroda(dot)hayato(at)fujitsu(dot)com> wrote:
>
> Dear Shi,
>
> Thank you for reviewing! PSA new version.
>
> > + elog(DEBUG2, "time-delayed replication for txid %u, delay_time
> > = %d ms, remaining wait time: %ld ms",
> > + ctx->write_xid, (int) ctx->min_send_delay,
> > + remaining_wait_time_ms);
> >
> > I tried and saw that the xid here looks wrong, what it got is the xid of the
> > previous transaction. It seems `ctx->write_xid` has not been updated and we
> > can't use it.
> >
>
> Good catch. There are several approaches to fix that, I choose the simplest way.
> TransactionId was added as an argument of functions.
>

Thank you for updating the patch. Here are some comments on v7 patch:

+ *
+ * LOGICALREP_PROTO_MIN_SEND_DELAY_VERSION_NUM is the minimum protocol version
+ * with support for delaying to send transactions. Introduced in PG16.
*/
#define LOGICALREP_PROTO_MIN_VERSION_NUM 1
#define LOGICALREP_PROTO_VERSION_NUM 1
#define LOGICALREP_PROTO_STREAM_VERSION_NUM 2
#define LOGICALREP_PROTO_TWOPHASE_VERSION_NUM 3
#define LOGICALREP_PROTO_STREAM_PARALLEL_VERSION_NUM 4
-#define LOGICALREP_PROTO_MAX_VERSION_NUM
LOGICALREP_PROTO_STREAM_PARALLEL_VERSION_NUM
+#define LOGICALREP_PROTO_MIN_SEND_DELAY_VERSION_NUM 4
+#define LOGICALREP_PROTO_MAX_VERSION_NUM
LOGICALREP_PROTO_MIN_SEND_DELAY_VERSION_NUM

What is the usecase of the old macro,
LOGICALREP_PROTO_STREAM_PARALLEL_VERSION_NUM, after adding
LOGICALREP_PROTO_MIN_SEND_DELAY_VERSION_NUM ? I think if we go this
way, we will end up adding macros every time when adding a new option,
which seems not a good idea. I'm really not sure we need to change the
protocol version or the macro. Commit
366283961ac0ed6d89014444c6090f3fd02fce0a introduced the 'origin'
subscription parameter that is also sent to the publisher, but we
didn't touch the protocol version at all.

---
Why do we not to delay sending COMMIT PREPARED messages?

---
+ /*
+ * If we've requested to shut down, exit the process.
+ *
+ * Note that WalSndDone() cannot be used here because
the delaying
+ * changes will be sent in the function.
+ */
+ if (got_STOPPING)
+ WalSndShutdown();

Since the walsender exits without sending the done message at a server
shutdown, we get the following log message on the subscriber:

ERROR: could not receive data from WAL stream: server closed the
connection unexpectedly

I think that since the walsender is just waiting for sending data, it
can send the done message if the socket is writable.

---
+ delayUntil = TimestampTzPlusMilliseconds(delay_start,
ctx->min_send_delay);
+ remaining_wait_time_ms =
TimestampDifferenceMilliseconds(GetCurrentTimestamp(), delayUntil);
+
(snip)
+
+ /* Sleep until appropriate time. */
+ timeout_sleeptime_ms =
WalSndComputeSleeptime(GetCurrentTimestamp());

I think it's better to call GetCurrentTimestamp() only once.

---
+# This test is successful only if at least the configured delay has elapsed.
+ok( time() - $publisher_insert_time >= $delay,
+ "subscriber applies WAL only after replication delay for
non-streaming transaction"
+);

The subscriber doesn't actually apply WAL records, but logically
replicated changes. How about "subscriber applies changes only
after..."?

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message torikoshia 2023-02-27 05:58:29 Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features)
Previous Message Amit Kapila 2023-02-27 05:27:16 Re: Change xl_hash_vacuum_one_page.ntuples from int to uint16