From: | "Takamichi Osumi (Fujitsu)" <osumi(dot)takamichi(at)fujitsu(dot)com> |
---|---|
To: | 'Amit Kapila' <amit(dot)kapila16(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com> |
Cc: | "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, "vignesh21(at)gmail(dot)com" <vignesh21(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, "shveta(dot)malik(at)gmail(dot)com" <shveta(dot)malik(at)gmail(dot)com>, "dilipbalaut(at)gmail(dot)com" <dilipbalaut(at)gmail(dot)com>, "euler(at)eulerto(dot)com" <euler(at)eulerto(dot)com>, "m(dot)melihmutlu(at)gmail(dot)com" <m(dot)melihmutlu(at)gmail(dot)com>, "andres(at)anarazel(dot)de" <andres(at)anarazel(dot)de>, "marcos(at)f10(dot)com(dot)br" <marcos(at)f10(dot)com(dot)br>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | RE: Time delayed LR (WAS Re: logical replication restrictions) |
Date: | 2023-02-03 06:35:29 |
Message-ID: | TYCPR01MB837337242FF0F5C50B997E76EDD79@TYCPR01MB8373.jpnprd01.prod.outlook.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
On Friday, February 3, 2023 2:21 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> On Fri, Feb 3, 2023 at 6:41 AM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
> > On Thu, Feb 2, 2023 at 7:21 PM Takamichi Osumi (Fujitsu)
> > <osumi(dot)takamichi(at)fujitsu(dot)com> wrote:
> > >
> > ...
> > > > Besides, I am not sure it's a stable test to check the log. Is it
> > > > possible that there's no such log on a slow machine? I modified
> > > > the code to sleep 1s at the beginning of apply_dispatch(), then
> > > > the new added test failed because the server log cannot match.
> > > To get the log by itself is necessary to ensure that the delay is
> > > conducted by the apply worker, because we emit the diffms only if
> > > it's bigger than 0 in maybe_apply_delay(). If we omit the step, we
> > > are not sure the delay is caused by other reasons or the time-delayed
> feature.
> > >
> > > As you mentioned, it's possible that no log is emitted on slow
> > > machine. Then, the idea to make the test safer for such machines should
> be to make the delayed time longer.
> > > But we shortened the delay time to 1 second to mitigate the long test
> execution time of this TAP test.
> > > So, I'm not sure if it's a good idea to make it longer again.
> >
> > I think there are a couple of things that can be done about this problem:
> >
> > 1. If you need the code/test to remain as-is then at least the test
> > message could include some comforting text like "(this can fail on
> > slow machines when the delay time is already exceeded)" so then a test
> > failure will not cause undue alarm.
> >
> > 2. Try moving the DEBUG2 elog (in function maybe_apply_delay) so that
> > it will *always* log the remaining wait time even if that wait time
> > becomes negative. Then I think the test cases can be made
> > deterministic instead of relying on good luck. This seems like the
> > better option.
> >
>
> I don't understand why we have to do any of this instead of using 3s as
> min_apply_delay similar to what we are doing in
> src/test/recovery/t/005_replay_delay. Also, I think we should use exactly the
> same way to verify the test even though we want to keep the log level as
> DEBUG2 to check logs in case of any failures.
OK, will try to make our tests similar to the tests in 005_replay_delay
as much as possible.
> Also, I don't see the need to add more tests like the ones below:
> +# Test whether ALTER SUBSCRIPTION changes the delayed time of the apply
> +worker # (1 day 5 minutes). Note that the extra 5 minute is to account
> +for any # decoding/network overhead.
>
> Let's try to add tests similar to what we have for recovery_min_apply_delay
> unless there is some functionality in this patch that is not there in the
> recovery_min_apply_delay feature.
The above command is a preparation part to check a behavior unique to time-delayed
logical replication, which is to DISABLE a subscription causes the apply worker not to apply
the suspended (delayed) transaction. So, it will be OK to have this test.
Best Regards,
Takamichi Osumi
From | Date | Subject | |
---|---|---|---|
Next Message | Jim Jones | 2023-02-03 07:05:46 | Re: [PATCH] Add pretty-printed XML output option |
Previous Message | shveta malik | 2023-02-03 06:34:58 | Re: [PATCH] Reuse Workers and Replication Slots during Logical Replication |