RE: Time delayed LR (WAS Re: logical replication restrictions)

From: "Takamichi Osumi (Fujitsu)" <osumi(dot)takamichi(at)fujitsu(dot)com>
To: 'vignesh C' <vignesh21(at)gmail(dot)com>
Cc: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, "shveta(dot)malik(at)gmail(dot)com" <shveta(dot)malik(at)gmail(dot)com>, "dilipbalaut(at)gmail(dot)com" <dilipbalaut(at)gmail(dot)com>, "amit(dot)kapila16(at)gmail(dot)com" <amit(dot)kapila16(at)gmail(dot)com>, "euler(at)eulerto(dot)com" <euler(at)eulerto(dot)com>, "m(dot)melihmutlu(at)gmail(dot)com" <m(dot)melihmutlu(at)gmail(dot)com>, "andres(at)anarazel(dot)de" <andres(at)anarazel(dot)de>, "marcos(at)f10(dot)com(dot)br" <marcos(at)f10(dot)com(dot)br>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, "smithpb2250(at)gmail(dot)com" <smithpb2250(at)gmail(dot)com>
Subject: RE: Time delayed LR (WAS Re: logical replication restrictions)
Date: 2023-01-17 11:00:11
Message-ID: TYCPR01MB8373F3E780D5BB60E441ECF3EDC69@TYCPR01MB8373.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On Saturday, January 14, 2023 3:27 PM vignesh C <vignesh21(at)gmail(dot)com> wrote:
> 1) Since the min_apply_delay = 3, but you have specified 2s, there might be a
> possibility that it can log delay as 1000ms due to pub/sub/network delay and
> the test can fail randomly, If we cannot ensure this log file value,
> check_apply_delay_time verification alone should be sufficient.
> +is($result, qq(5|1|5), 'check if the new rows were applied to
> +subscriber');
> +
> +check_apply_delay_log("logical replication apply delay", "2000");
You are right. Removed the left-time check of the 1st call of check_apply_delay_log().

> 2) I'm not sure if this will add any extra coverage as the altering value of
> min_apply_delay is already tested in the regression, if so this test can be
> removed:
> +# Test ALTER SUBSCRIPTION. Delay 86460 seconds (1 day 1 minute).
> +$node_subscriber->safe_psql('postgres',
> + "ALTER SUBSCRIPTION tap_sub SET (min_apply_delay =
> 86460000)"
> +);
> +
> +# New row to trigger apply delay.
> +$node_publisher->safe_psql('postgres',
> + "INSERT INTO test_tab VALUES (0, 'foobar')");
> +
> +check_apply_delay_log("logical replication apply delay", "80000000");
While addressing this point, I've noticed that there is a
behavior difference between physical replication's recovery_min_apply_delay
and this feature when stopping the replication during delays.

At present, in the latter case,
the apply worker exits without applying the suspended transaction
after ALTER SUBSCRIPTION DISABLE command for the subscription.
Meanwhile, there is no "disabling" command for physical replication,
but I checked the behavior about what happens for promoting a secondary
during the delay of recovery_min_apply_delay for physical replication as one example.
The transaction has become visible even in the promoting in the middle of delay.

I'm not sure if I should make the time-delayed LR aligned with this behavior.
Does someone has an opinion for this ?

By the way, the above test code can be used for the test case
when the apply worker is in a delay but the transaction has been canceled by
ALTER SUBSCRIPTION DISABLE command. So, I didn't remove it at this stage.
> 3) We generally keep the subroutines before the tests, it can be kept
> accordingly:
> 3.a)
> +sub check_apply_delay_log
> +{
> + my ($message, $expected) = @_;
> + $expected = 0 unless defined $expected;
> +
> + my $old_log_location = $log_location;
>
> 3.b)
> +sub check_apply_delay_time
> +{
> + my ($primary_key, $expected_diffs) = @_;
> +
> + my $inserted_time_on_pub = $node_publisher->safe_psql('postgres',
> qq[
> + SELECT extract(epoch from c) FROM test_tab WHERE a =
> $primary_key;
> + ]);
> +
Fixed.


> 4) typo "more then once" should be "more than once"
> + regress_testsub | regress_subscription_user | f |
> {testpub,testpub1,testpub2} | f | off | d |
> f | any | 0 | off
> | dbname=regress_doesnotexist | 0/0
> (1 row)
>
> -- fail - publication used more then once @@ -316,10 +316,10 @@ ERROR:
> publication "testpub3" is not in subscription "regress_testsub"
> -- ok - delete publications
> ALTER SUBSCRIPTION regress_testsub DROP PUBLICATION testpub1,
> testpub2 WITH (refresh = false);
> \dRs+
This was an existing typo on HEAD. Addressed in other thread in [1].


> 5) This can be changed to "Is it larger than expected?"
> + # Is it larger than expected ?
> + cmp_ok($logged_delay, '>', $expected,
> + "The wait time of the apply worker is long
> enough expectedly"
> + );
Fixed.

> 6) 2022 should be changed to 2023
> +++ b/src/test/subscription/t/032_apply_delay.pl
> @@ -0,0 +1,170 @@
> +
> +# Copyright (c) 2022, PostgreSQL Global Development Group
> +
> +# Test replication apply delay
Fixed.

> 7) Termination full stop is not required for single line comments:
> 7.a)
> +use Test::More;
> +
> +# Create publisher node.
> +my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
>
> 7.b) +
> +# Create subscriber node.
> +my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
>
> 7.c) +
> +# Create some preexisting content on publisher.
> +$node_publisher->safe_psql('postgres',
>
> 7.d) similarly in rest of the files
Removed the periods for single line comments.

> 8) Is it possible to add one test for spooling also?
There is a streaming transaction case in the TAP test already.

I conducted some minor comment modifications along with above changes.
Kindly have a look at the v16.

[1] - https://www.postgresql.org/message-id/flat/TYCPR01MB83737EA140C79B7D099F65E8EDC69%40TYCPR01MB8373.jpnprd01.prod.outlook.com

Best Regards,
Takamichi Osumi

Attachment Content-Type Size
v16-0001-Time-delayed-logical-replication-subscriber.patch application/octet-stream 80.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message John Naylor 2023-01-17 11:05:59 Re: [PoC] Improve dead tuple storage for lazy vacuum
Previous Message Lukas Fittl 2023-01-17 10:50:40 Re: Sampling-based timing for EXPLAIN ANALYZE