RE: Time delayed LR (WAS Re: logical replication restrictions)

From: "Takamichi Osumi (Fujitsu)" <osumi(dot)takamichi(at)fujitsu(dot)com>
To: 'Kyotaro Horiguchi' <horikyota(dot)ntt(at)gmail(dot)com>
Cc: "amit(dot)kapila16(at)gmail(dot)com" <amit(dot)kapila16(at)gmail(dot)com>, "smithpb2250(at)gmail(dot)com" <smithpb2250(at)gmail(dot)com>, "vignesh21(at)gmail(dot)com" <vignesh21(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, "shveta(dot)malik(at)gmail(dot)com" <shveta(dot)malik(at)gmail(dot)com>, "dilipbalaut(at)gmail(dot)com" <dilipbalaut(at)gmail(dot)com>, "euler(at)eulerto(dot)com" <euler(at)eulerto(dot)com>, "m(dot)melihmutlu(at)gmail(dot)com" <m(dot)melihmutlu(at)gmail(dot)com>, "andres(at)anarazel(dot)de" <andres(at)anarazel(dot)de>, "marcos(at)f10(dot)com(dot)br" <marcos(at)f10(dot)com(dot)br>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: RE: Time delayed LR (WAS Re: logical replication restrictions)
Date: 2023-01-25 14:23:51
Message-ID: TYCPR01MB837305BD31FA317256BC7B1FEDCE9@TYCPR01MB8373.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wednesday, January 25, 2023 3:27 PM Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> wrote:
> At Tue, 24 Jan 2023 12:19:04 +0000, "Takamichi Osumi (Fujitsu)"
> <osumi(dot)takamichi(at)fujitsu(dot)com> wrote in
> > Attached the patch v20 that has incorporated all comments so far.
>
> Thanks! I looked thourgh the documentation part.
Thank you for your review !

> + <entry role="catalog_table_entry"><para role="column_definition">
> + <structfield>subminapplydelay</structfield> <type>int8</type>
> + </para>
> + <para>
> + Total time spent delaying the application of changes, in milliseconds.
> + </para></entry>
>
> I was confused becase it reads as this column shows the summarized actual
> waiting time caused by min_apply_delay. IIUC actually it shows the
> min_apply_delay setting for the subscription. Thus shouldn't it be something
> like this?
>
> "The minimum amount of time to delay applying changes, in milliseconds"
> And it might be better to mention the corresponding subscription paramter.
This description looks much better to me than the past description. Fixed.
OTOH, other parameters don't mention about its subscription parameters.
So, I didn't add the mention.

> + error. If <varname>wal_receiver_status_interval</varname> is set to
> + zero, the apply worker doesn't send any feedback messages during
> the
> + <literal>min_apply_delay</literal> period.
>
> I took a bit longer time to understand what this sentence means. I'd like to
> suggest something like the follwoing.
>
> "Since no status-update messages are sent while delaying, note that
> wal_receiver_status_interval is the only source of keepalive messages during
> that period."
The current patch's description is precise and I prefer that.
I would say "the only source" would be confusing to readers.
However, I slightly adjusted the description a bit. Could you please check ?

> + <para>
> + A logical replication subscription can delay the application of changes by
> + specifying the <literal>min_apply_delay</literal> subscription
> parameter.
> + See <xref linkend="sql-createsubscription"/> for details.
> + </para>
>
> I'm not sure "logical replication subscription" is a common term.
> Doesn't just "subscription" mean the same, especially in that context?
> (Note that 31.2 starts with "A subscription is the downstream..").
I think you are right. Fixed.

> + Any delay occurs only on WAL records for transaction begins after
> all
> + initial table synchronization has finished. The delay is
> + calculated
>
> There is no "transaction begin" WAL records. Maybe it is "logical replication
> transaction begin message". The timestamp is of "commit time". (I took
> "transaction begins" as a noun, but that might be
> wrong..)
Yeah, we can improve here. But, we need to include not only
"commit" but also "prepare" as nuance in this part.

In short, I think we should change here to mention
(1) the delay happens after all initial table synchronization
(2) how delay is applied for non-streaming and streaming transactions in general.

By the way, WAL timestamp is a word used in the recovery_min_apply_delay.
So, I'd like to keep it to make the description more aligned with it,
until there is a better description.

Updated the doc. I adjusted the commit message according to this fix.
>
> + may reduce the actual wait time. It is also possible that the overhead
> + already exceeds the requested <literal>min_apply_delay</literal>
> value,
> + in which case no additional wait is necessary. If the system
> + clocks
>
> I'm not sure it is right to say "necessary" here. IMHO it might be better be "in
> which case no delay is applied".
Agreed. Fixed.

> + in which case no additional wait is necessary. If the system clocks
> + on publisher and subscriber are not synchronized, this may lead to
> + apply changes earlier than expected, but this is not a major issue
> + because this parameter is typically much larger than the time
> + deviations between servers. Note that if this parameter is
> + set to a
>
> This doesn't seem to fit our documentation. It is not our business whether a
> certain amount deviation is critical or not. How about somethig like the
> following?
>
> "Note that the delay is measured between the timestamp assigned by
> publisher and the system clock on subscriber. You need to manage the
> system clocks to be in sync so that the delay works properly."
As discussed, this is aligned with recovery_min_apply_delay. So, I keep it.

> + Delaying the replication can mean there is a much longer time
> + between making a change on the publisher, and that change
> being
> + committed on the subscriber. This can impact the performance
> of
> + synchronous replication. See <xref
> linkend="guc-synchronous-commit"/>
> + parameter.
>
> Do we need the "can" in "Delaying the replication can mean"? If we want to
> say, it might be "Delaying the replication means there can be a much longer..."?
The "can" indicates the possibility as the nuance,
while adopting "means" in this case indicates "time delayed LR causes
the long time wait always".

I'm okay with either expression, but
I think you are right in practice and from
the perspective of the purpose of this feature. So, fixed.
> + <para>
> + Create a subscription to a remote server that replicates tables in
> + the <literal>mypub</literal> publication and starts replicating
> immediately
> + on commit. Pre-existing data is not copied. The application of changes is
> + delayed by 4 hours.
> +<programlisting>
> +CREATE SUBSCRIPTION mysub
> + CONNECTION 'host=192.0.2.4 port=5432 user=foo dbname=foodb'
> + PUBLICATION mypub
> + WITH (copy_data = false, min_apply_delay = '4h');
> +</programlisting></para>
>
> I'm not sure we need this additional example. We already have two exmaples
> one of which differs from the above only by actual values for PUBLICATION and
> WITH clauses.
I thought there was no harm in having this example, but
what you say makes sense. Removed.

Attached the updated v22.

Best Regards,
Takamichi Osumi

Attachment Content-Type Size
v22-0001-Time-delayed-logical-replication-subscriber.patch application/octet-stream 81.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message houzj.fnst@fujitsu.com 2023-01-25 14:24:50 RE: Perform streaming logical transactions by background workers and parallel apply
Previous Message Pavel Stehule 2023-01-25 14:21:26 Re: Re: Support plpgsql multi-range in conditional control