Re: Time delayed LR (WAS Re: logical replication restrictions)

From: vignesh C <vignesh21(at)gmail(dot)com>
To: "Takamichi Osumi (Fujitsu)" <osumi(dot)takamichi(at)fujitsu(dot)com>
Cc: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, "shveta(dot)malik(at)gmail(dot)com" <shveta(dot)malik(at)gmail(dot)com>, "dilipbalaut(at)gmail(dot)com" <dilipbalaut(at)gmail(dot)com>, "amit(dot)kapila16(at)gmail(dot)com" <amit(dot)kapila16(at)gmail(dot)com>, "euler(at)eulerto(dot)com" <euler(at)eulerto(dot)com>, "m(dot)melihmutlu(at)gmail(dot)com" <m(dot)melihmutlu(at)gmail(dot)com>, "andres(at)anarazel(dot)de" <andres(at)anarazel(dot)de>, "marcos(at)f10(dot)com(dot)br" <marcos(at)f10(dot)com(dot)br>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, "smithpb2250(at)gmail(dot)com" <smithpb2250(at)gmail(dot)com>
Subject: Re: Time delayed LR (WAS Re: logical replication restrictions)
Date: 2023-01-14 06:27:29
Message-ID: CALDaNm2d=+WBGye015DoACY_PzrRKdmwjeZwDa+Y9LXFqnT0Ow@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, 12 Jan 2023 at 21:09, Takamichi Osumi (Fujitsu)
<osumi(dot)takamichi(at)fujitsu(dot)com> wrote:
>
> On Thursday, January 12, 2023 12:04 PM Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> wrote:
> > At Wed, 11 Jan 2023 12:46:24 +0000, "Hayato Kuroda (Fujitsu)"
> > <kuroda(dot)hayato(at)fujitsu(dot)com> wrote in
> > > them. Which version is better?
> >
> >
> > Some comments by a quick loock, different from the above.
> Horiguchi-san, thanks for your review !
>
>
> > + CONNECTION 'host=192.168.1.50 port=5432 user=foo
> > dbname=foodb'
> >
> > I understand that we (not PG people, but IT people) are supposed to use in
> > documents a certain set of special addresses that is guaranteed not to be
> > routed in the field.
> >
> > > TEST-NET-1 : 192.0.2.0/24
> > > TEST-NET-2 : 198.51.100.0/24
> > > TEST-NET-3 : 203.0.113.0/24
> >
> > (I found 192.83.123.89 in the postgres_fdw doc, but it'd be another issue..)
> Fixed. If necessary we can create another thread for this.
>
> > + if (strspn(tmp, "-0123456789 ") == strlen(tmp))
> >
> > Do we need to bother spending another memory block for apparent non-digits
> > here?
> Yes. The characters are necessary to handle an issue reported in [1].
> The issue happened if the user inputs a negative value,
> then the length comparison became different between strspn and strlen
> and the input value was recognized as seconds, when
> the unit wasn't described. This led to a wrong error message for the user.
>
> Those addition of such characters solve the issue.
>
> > + errmsg(INT64_FORMAT " ms
> > is outside the valid range for parameter
> > +\"%s\"",
> >
> > We don't use INT64_FORMAT in translatable message strings. Cast then
> > use %lld instead.
> Thanks for teaching us. Fixed.
>
> > This message looks unfriendly as it doesn't suggest the valid range, and it
> > shows the input value in a different unit from what was in the input. A I think we
> > can spell it as "\"%s\" is outside the valid range for subsciription parameter
> > \"%s\" (0 .. <INT_MAX> in millisecond)"
> Makes sense. I incorporated the valid range with the aligned format of recovery_min_apply_delay.
> FYI, the physical replication's GUC doesn't write the unites for the range like below.
> I followed and applied this style.
>
> ---
> LOG: -1 ms is outside the valid range for parameter "recovery_min_apply_delay" (0 .. 2147483647)
> FATAL: configuration file "/home/k5user/new/pg/l/make_v15/slave/postgresql.conf" contains errors
> ---
>
> > + int64 min_apply_delay;
> > ..
> > + if (ms < 0 || ms > INT_MAX)
> >
> > Why is the variable wider than required?
> You are right. Fixed.
>
> > + errmsg("%s and %s are mutually
> > exclusive options",
> > + "min_apply_delay > 0",
> > "streaming = parallel"));
> >
> > Mmm. Couldn't we refuse 0 as min_apply_delay?
> Sorry, the previous patch's behavior wasn't consistent with this error message.
>
> In the previous patch, if we conducted alter subscription
> with stream = parallel and min_apply_delay = 0 (from a positive value) at the same time,
> the alter command failed, although this should succeed by this time-delayed feature specification.
> We fixed this part accordingly by some more tests in AlterSubscription().
>
> By the way, we should allow users to change min_apply_dealy to 0
> whenever they want from different value. Then, we didn't restrict
> this kind of operation.
>
> > + sub->minapplydelay > 0)
> > ...
> > + if (opts.min_apply_delay > 0 &&
> >
> > Is there any reason for the differenciation?
> Yes. The former is the object for an existing subscription configuration.
> For example, if we alter subscription with setting streaming = 'parallel'
> for a subscription created with min_apply_delay = '1 day', we
> need to reject the alter command. The latter is new settings.
>
>
> > +
> > errmsg("cannot set %s for subscription with %s",
> > +
> > "streaming = parallel", "min_apply_delay > 0"));
> >
> > I think that this shoud be more like human-speking. Say, "cannot enable
> > min_apply_delay for subscription in parallel streaming mode" or something..
> > The same is applicable to the nearby message.
> Reworded the error messages. Please check.
>
> > +static void maybe_delay_apply(TimestampTz ts);
> >
> > apply_spooled_messages(FileSet *stream_fileset, TransactionId xid,
> > - XLogRecPtr lsn)
> > + XLogRecPtr lsn, TimestampTz ts)
> >
> > "ts" looks too generic. Couldn't it be more specific?
> > We need a explanation for the parameter in the function comment.
> Changed it to finish_ts, since it indicates commit/prepare time.
> This terminology should be aligned with finish lsn.
>
> > + if (!am_parallel_apply_worker())
> > + {
> > + Assert(ts > 0);
> > + maybe_delay_apply(ts);
> >
> > It seems to me better that the if condition and assertion are checked inside
> > maybe_delay_apply().
> Fixed.
>

Thanks for the updated patch, Few comments:
1) Since the min_apply_delay = 3, but you have specified 2s, there
might be a possibility that it can log delay as 1000ms due to
pub/sub/network delay and the test can fail randomly, If we cannot
ensure this log file value, check_apply_delay_time verification alone
should be sufficient.
+is($result, qq(5|1|5), 'check if the new rows were applied to subscriber');
+
+check_apply_delay_log("logical replication apply delay", "2000");

2) I'm not sure if this will add any extra coverage as the altering
value of min_apply_delay is already tested in the regression, if so
this test can be removed:
+# Test ALTER SUBSCRIPTION. Delay 86460 seconds (1 day 1 minute).
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub SET (min_apply_delay = 86460000)"
+);
+
+# New row to trigger apply delay.
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_tab VALUES (0, 'foobar')");
+
+check_apply_delay_log("logical replication apply delay", "80000000");

3) We generally keep the subroutines before the tests, it can be kept
accordingly:
3.a)
+sub check_apply_delay_log
+{
+ my ($message, $expected) = @_;
+ $expected = 0 unless defined $expected;
+
+ my $old_log_location = $log_location;

3.b)
+sub check_apply_delay_time
+{
+ my ($primary_key, $expected_diffs) = @_;
+
+ my $inserted_time_on_pub = $node_publisher->safe_psql('postgres', qq[
+ SELECT extract(epoch from c) FROM test_tab WHERE a =
$primary_key;
+ ]);
+

4) typo "more then once" should be "more than once"
+ regress_testsub | regress_subscription_user | f |
{testpub,testpub1,testpub2} | f | off | d |
f | any | 0 | off
| dbname=regress_doesnotexist | 0/0
(1 row)

-- fail - publication used more then once
@@ -316,10 +316,10 @@ ERROR: publication "testpub3" is not in
subscription "regress_testsub"
-- ok - delete publications
ALTER SUBSCRIPTION regress_testsub DROP PUBLICATION testpub1,
testpub2 WITH (refresh = false);
\dRs+

5) This can be changed to "Is it larger than expected?"
+ # Is it larger than expected ?
+ cmp_ok($logged_delay, '>', $expected,
+ "The wait time of the apply worker is long
enough expectedly"
+ );

6) 2022 should be changed to 2023
+++ b/src/test/subscription/t/032_apply_delay.pl
@@ -0,0 +1,170 @@
+
+# Copyright (c) 2022, PostgreSQL Global Development Group
+
+# Test replication apply delay

7) Termination full stop is not required for single line comments:
7.a)
+use Test::More;
+
+# Create publisher node.
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');

7.b) +
+# Create subscriber node.
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');

7.c) +
+# Create some preexisting content on publisher.
+$node_publisher->safe_psql('postgres',

7.d) similarly in rest of the files

8) Is it possible to add one test for spooling also?

Regards,
Vignesh

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2023-01-14 06:34:33 Re: backup.sgml typo
Previous Message Amit Kapila 2023-01-14 06:01:23 Re: Add BufFileRead variants with short read and EOF detection