RE: Time delayed LR (WAS Re: logical replication restrictions)

From: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>
To: 'Masahiko Sawada' <sawada(dot)mshk(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, "andres(at)anarazel(dot)de" <andres(at)anarazel(dot)de>, "vignesh21(at)gmail(dot)com" <vignesh21(at)gmail(dot)com>, "shveta(dot)malik(at)gmail(dot)com" <shveta(dot)malik(at)gmail(dot)com>, "Takamichi Osumi (Fujitsu)" <osumi(dot)takamichi(at)fujitsu(dot)com>, "dilipbalaut(at)gmail(dot)com" <dilipbalaut(at)gmail(dot)com>, "euler(at)eulerto(dot)com" <euler(at)eulerto(dot)com>, "m(dot)melihmutlu(at)gmail(dot)com" <m(dot)melihmutlu(at)gmail(dot)com>, "marcos(at)f10(dot)com(dot)br" <marcos(at)f10(dot)com(dot)br>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: RE: Time delayed LR (WAS Re: logical replication restrictions)
Date: 2023-02-27 08:51:29
Message-ID: TYCPR01MB5870CCDE096B4A41F53087A8F5AF9@TYCPR01MB5870.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Dear Sawada-san, Amit,

Thank you for reviewing!

> + *
> + * LOGICALREP_PROTO_MIN_SEND_DELAY_VERSION_NUM is the minimum
> protocol version
> + * with support for delaying to send transactions. Introduced in PG16.
> */
> #define LOGICALREP_PROTO_MIN_VERSION_NUM 1
> #define LOGICALREP_PROTO_VERSION_NUM 1
> #define LOGICALREP_PROTO_STREAM_VERSION_NUM 2
> #define LOGICALREP_PROTO_TWOPHASE_VERSION_NUM 3
> #define LOGICALREP_PROTO_STREAM_PARALLEL_VERSION_NUM 4
> -#define LOGICALREP_PROTO_MAX_VERSION_NUM
> LOGICALREP_PROTO_STREAM_PARALLEL_VERSION_NUM
> +#define LOGICALREP_PROTO_MIN_SEND_DELAY_VERSION_NUM 4
> +#define LOGICALREP_PROTO_MAX_VERSION_NUM
> LOGICALREP_PROTO_MIN_SEND_DELAY_VERSION_NUM
>
> What is the usecase of the old macro,
> LOGICALREP_PROTO_STREAM_PARALLEL_VERSION_NUM, after adding
> LOGICALREP_PROTO_MIN_SEND_DELAY_VERSION_NUM ? I think if we go this
> way, we will end up adding macros every time when adding a new option,
> which seems not a good idea. I'm really not sure we need to change the
> protocol version or the macro. Commit
> 366283961ac0ed6d89014444c6090f3fd02fce0a introduced the 'origin'
> subscription parameter that is also sent to the publisher, but we
> didn't touch the protocol version at all.

I removed the protocol number.

I checked the previous discussion[1]. According to it, the protocol version must
be modified when new message is added or exiting messages are changed.
This patch intentionally make walsenders delay sending data, and at that time no
extra information is added. Therefore I think it is not needed.

> ---
> Why do we not to delay sending COMMIT PREPARED messages?

This is motivated by the comment[2] but I preferred your opinion[3].
Now COMMIT PREPARED is delayed instead of PREPARE message.

> ---
> + /*
> + * If we've requested to shut down, exit the process.
> + *
> + * Note that WalSndDone() cannot be used here because
> the delaying
> + * changes will be sent in the function.
> + */
> + if (got_STOPPING)
> + WalSndShutdown();
>
> Since the walsender exits without sending the done message at a server
> shutdown, we get the following log message on the subscriber:
>
> ERROR: could not receive data from WAL stream: server closed the
> connection unexpectedly
>
> I think that since the walsender is just waiting for sending data, it
> can send the done message if the socket is writable.

You are right. I was confused with the previous implementation that workers cannot
accept any messages. I make walsenders send the end-command message directly.
Is it what you expeced?

> ---
> + delayUntil = TimestampTzPlusMilliseconds(delay_start,
> ctx->min_send_delay);
> + remaining_wait_time_ms =
> TimestampDifferenceMilliseconds(GetCurrentTimestamp(), delayUntil);
> +
> (snip)
> +
> + /* Sleep until appropriate time. */
> + timeout_sleeptime_ms =
> WalSndComputeSleeptime(GetCurrentTimestamp());
>
> I think it's better to call GetCurrentTimestamp() only once.

Right, fixed.

> ---
> +# This test is successful only if at least the configured delay has elapsed.
> +ok( time() - $publisher_insert_time >= $delay,
> + "subscriber applies WAL only after replication delay for
> non-streaming transaction"
> +);
>
> The subscriber doesn't actually apply WAL records, but logically
> replicated changes. How about "subscriber applies changes only
> after..."?

I grepped other tests, and I could not find the same usage of the word "WAL".
So fixed as you said.

In next version I will use grammar checker like Chat-GPT to modify commit messages...

[1]: https://www.postgresql.org/message-id/CAA4eK1LjOm6-OHggYVH35dQ_v40jOXrJW0GFy3kuwTd2J48%3DUg%40mail.gmail.com
[2]: https://www.postgresql.org/message-id/CAA4eK1K4uPbudrNdH%2B%3D_vN-Hpe9wYh%3D3vBS5Ww9dHn-LOWMV0g%40mail.gmail.com
[3]: https://www.postgresql.org/message-id/CAD21AoA0mPq_m6USfAC8DAkvFfwjqGvGq++Uv=avryYotvq98A@mail.gmail.com

Best Regards,
Hayato Kuroda
FUJITSU LIMITED

Attachment Content-Type Size
v8-0001-Time-delayed-logical-replication-on-publisher-sid.patch application/octet-stream 80.9 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kyotaro Horiguchi 2023-02-27 08:53:08 Re: Provide PID data for "cannot wait on a latch owned by another process" in latch.c
Previous Message Kyotaro Horiguchi 2023-02-27 08:48:10 Re: Provide PID data for "cannot wait on a latch owned by another process" in latch.c