Re: Time delayed LR (WAS Re: logical replication restrictions)

From: "Euler Taveira" <euler(at)eulerto(dot)com>
To: "osumi(dot)takamichi(at)fujitsu(dot)com" <osumi(dot)takamichi(at)fujitsu(dot)com>, "Peter Smith" <smithpb2250(at)gmail(dot)com>
Cc: "vignesh C" <vignesh21(at)gmail(dot)com>, "Kyotaro Horiguchi" <horikyota(dot)ntt(at)gmail(dot)com>, "kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>, "shveta(dot)malik(at)gmail(dot)com" <shveta(dot)malik(at)gmail(dot)com>, "Dilip Kumar" <dilipbalaut(at)gmail(dot)com>, "Amit Kapila" <amit(dot)kapila16(at)gmail(dot)com>, "Melih Mutlu" <m(dot)melihmutlu(at)gmail(dot)com>, "Andres Freund" <andres(at)anarazel(dot)de>, "Marcos Pegoraro" <marcos(at)f10(dot)com(dot)br>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Time delayed LR (WAS Re: logical replication restrictions)
Date: 2023-01-23 23:32:08
Message-ID: ea766b46-c46c-494d-adc7-b6666e9424e9@app.fastmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Jan 22, 2023, at 9:42 AM, Takamichi Osumi (Fujitsu) wrote:
> On Saturday, January 21, 2023 3:36 AM I wrote:
> > Kindly have a look at the patch v18.
> I've conducted some refactoring for v18.
> Now the latest patch should be tidier and
> the comments would be clearer and more aligned as a whole.
>
> Attached the updated patch v19.
[I haven't been following this thread for a long time...]

Good to know that you keep improving this patch. I have a few suggestions that
were easier to provide a patch on top of your latest patch than to provide an
inline suggestions.

There are a few documentation polishing. Let me comment some of them above.

- The length of time (ms) to delay the application of changes.
+ Total time spent delaying the application of changes, in milliseconds

I don't remember if I suggested this description for catalog but IMO the
suggestion reads better for me.

- For time-delayed logical replication (i.e. when the subscription is
- created with parameter min_apply_delay > 0), the apply worker sends a
- Standby Status Update message to the publisher with a period of
- <literal>wal_receiver_status_interval</literal>. Make sure to set
- <literal>wal_receiver_status_interval</literal> less than the
- <literal>wal_sender_timeout</literal> on the publisher, otherwise, the
- walsender will repeatedly terminate due to the timeout errors. If
- <literal>wal_receiver_status_interval</literal> is set to zero, the apply
- worker doesn't send any feedback messages during the subscriber's
- <literal>min_apply_delay</literal> period. See
- <xref linkend="sql-createsubscription"/> for details.
+ For time-delayed logical replication, the apply worker sends a feedback
+ message to the publisher every
+ <varname>wal_receiver_status_interval</varname> milliseconds. Make sure
+ to set <varname>wal_receiver_status_interval</varname> less than the
+ <varname>wal_sender_timeout</varname> on the publisher, otherwise, the
+ <literal>walsender</literal> will repeatedly terminate due to timeout
+ error. If <varname>wal_receiver_status_interval</varname> is set to
+ zero, the apply worker doesn't send any feedback messages during the
+ <literal>min_apply_delay</literal> interval.

I removed the parenthesis explanation about time-delayed logical replication.
If you are reading the documentation and does not know what it means you should
(a) read the logical replication chapter or (b) check the glossary (maybe a new
entry should be added). I also removed the Standby status Update message but it
is a low level detail; let's refer to it as feedback message as the other
sentences do. I changed "literal" to "varname" that's the correct tag for
parameters. I replace "period" with "interval" that was the previous
terminology. IMO we should be uniform, use one or the other.

- The subscriber replication can be instructed to lag behind the publisher
- side changes by specifying the <literal>min_apply_delay</literal>
- subscription parameter. See <xref linkend="sql-createsubscription"/> for
- details.
+ A logical replication subscription can delay the application of changes by
+ specifying the <literal>min_apply_delay</literal> subscription parameter.
+ See <xref linkend="sql-createsubscription"/> for details.

This feature refers to a specific subscription, hence, "logical replication
subscription" instead of "subscriber replication".

+ if (IsSet(opts->specified_opts, SUBOPT_MIN_APPLY_DELAY))
+ errorConflictingDefElem(defel, pstate);
+

Peter S referred to this missing piece of code too.

-int
+static int
defGetMinApplyDelay(DefElem *def)
{

It seems you forgot static keyword.

- elog(DEBUG2, "time-delayed replication for txid %u, min_apply_delay = %lld ms, Remaining wait time: %ld ms",
- xid, (long long) MySubscription->minapplydelay, diffms);
+ elog(DEBUG2, "time-delayed replication for txid %u, min_apply_delay = " INT64_FORMAT " ms, remaining wait time: %ld ms",
+ xid, MySubscription->minapplydelay, diffms);

int64 should use format modifier INT64_FORMAT.

- (long) wal_receiver_status_interval * 1000,
+ wal_receiver_status_interval * 1000L,

Cast is not required. I added a suffix to the constant.

- elog(DEBUG2, "sending feedback (force %d) to recv %X/%X, write %X/%X, flush %X/%X in-delayed: %d",
+ elog(DEBUG2, "sending feedback (force %d) to recv %X/%X, write %X/%X, flush %X/%X, apply delay: %s",
force,
LSN_FORMAT_ARGS(recvpos),
LSN_FORMAT_ARGS(writepos),
LSN_FORMAT_ARGS(flushpos),
- in_delayed_apply);
+ in_delayed_apply? "yes" : "no");

It is better to use a string to represent the yes/no option.

- gettext_noop("Min apply delay (ms)"));
+ gettext_noop("Min apply delay"));

I don't know if it was discussed but we don't add units to headers. When I
think about this parameter representation (internal and external), I decided to
use the previous code because it provides a unit for external representation. I
understand that using the same representation as recovery_min_apply_delay is
good but the current code does not handle the external representation
accordingly. (recovery_min_apply_delay uses the GUC machinery to adds the unit
but for min_apply_delay, it doesn't).

# Setup for streaming case
-$node_publisher->append_conf('postgres.conf',
+$node_publisher->append_conf('postgresql.conf',
'logical_decoding_mode = immediate');
$node_publisher->reload;

Fix configuration file name.

Maybe tests should do a better job. I think check_apply_delay_time is fragile
because it does not guarantee that time is not shifted. Time-delayed
replication is a subscriber feature and to check its correctness it should
check the logs.

# Note that we cannot call check_apply_delay_log() here because there is a
# possibility that the delay is skipped. The event happens when the WAL
# replication between publisher and subscriber is delayed due to a mechanical
# problem. The log output will be checked later - substantial delay-time case.

If you might not use the logs for it, it should adjust the min_apply_delay, no?

It does not exercise the min_apply_delay vs parallel streaming mode.

+ /*
+ * The combination of parallel streaming mode and
+ * min_apply_delay is not allowed.
+ */
+ if (opts.streaming == LOGICALREP_STREAM_PARALLEL)
+ if ((IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) && opts.min_apply_delay > 0) ||
+ (!IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) && sub->minapplydelay > 0))
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot enable %s mode for subscription with %s",
+ "streaming = parallel", "min_apply_delay"));
+

Is this code correct? I also didn't like this message. "cannot enable streaming
= parallel mode for subscription with min_apply_delay" is far from a good error
message. How about refer parallelism to "parallel streaming mode".

--
Euler Taveira
EDB https://www.enterprisedb.com/

Attachment Content-Type Size
review-1.patch text/x-patch 49.9 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message James Coleman 2023-01-23 23:42:45 Re: Fix incorrect comment reference
Previous Message David Rowley 2023-01-23 23:31:59 Re: Improve LATERAL join case in test memoize.sql