Re: Time delayed LR (WAS Re: logical replication restrictions)

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>
Cc: vignesh C <vignesh21(at)gmail(dot)com>, Euler Taveira <euler(at)eulerto(dot)com>, "Takamichi Osumi (Fujitsu)" <osumi(dot)takamichi(at)fujitsu(dot)com>, Melih Mutlu <m(dot)melihmutlu(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Marcos Pegoraro <marcos(at)f10(dot)com(dot)br>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Peter Smith <smithpb2250(at)gmail(dot)com>
Subject: Re: Time delayed LR (WAS Re: logical replication restrictions)
Date: 2022-11-14 10:33:55
Message-ID: CAA4eK1KMKt+_Qjw=WYtDXUg-tHr6hSFQZTZFVPrUZ=suK2j5kA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Nov 14, 2022 at 2:28 PM Hayato Kuroda (Fujitsu)
<kuroda(dot)hayato(at)fujitsu(dot)com> wrote:
>
> > I don't understand the reason for the below change in the patch:
> >
> > + /*
> > + * If this subscription has been disabled and it has an apply
> > + * delay set, wake up the logical replication worker to finish
> > + * it as soon as possible.
> > + */
> > + if (!opts.enabled && sub->applydelay > 0)
> > + logicalrep_worker_wakeup(sub->oid, InvalidOid);
> > +
> >
> > It seems to me Kuroda-San has proposed this change [1] to fix the test
> > but it is not clear to me why such a change is required. Why can't
> > CHECK_FOR_INTERRUPTS() after waiting, followed by the existing below
> > code [2] in LogicalRepApplyLoop() sufficient to handle parameter
> > updates?
> >
> > [2]
> > if (!in_remote_transaction && !in_streamed_transaction)
> > {
> > /*
> > * If we didn't get any transactions for a while there might be
> > * unconsumed invalidation messages in the queue, consume them
> > * now.
> > */
> > AcceptInvalidationMessages();
> > maybe_reread_subscription();
> > ...
>
> I mentioned the case with a long min_apply_delay configuration.
>
> The worker will exit normally if apply_delay() has been ended and then it can reach
> LogicalRepApplyLoop(). It works well if the delay is short and workers can wake up
> immediately. But if workers have long min_apply_delay, they cannot go out the
> while-loop, so worker processes remain for a long time. According to test code,
> it is determined that worker should die immediately and we have a
> test-case that we try to kill the worker with min_apply_delay = 1 day.
>

So, why only honor the 'disable' option of the subscription? For
example, one can change 'min_apply_delay' and it seems
recoveryApplyDelay() honors a similar change in the recovery
parameter. Is there a way to set the latch of the worker process, so
that it can recheck if anything is changed?

--
With Regards,
Amit Kapila.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2022-11-14 10:36:27 Re: hash_xlog_split_allocate_page: failed to acquire cleanup lock
Previous Message Amit Kapila 2022-11-14 10:15:23 Re: Time delayed LR (WAS Re: logical replication restrictions)