Re: Time delayed LR (WAS Re: logical replication restrictions)

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: osumi(dot)takamichi(at)fujitsu(dot)com, smithpb2250(at)gmail(dot)com, vignesh21(at)gmail(dot)com, kuroda(dot)hayato(at)fujitsu(dot)com, shveta(dot)malik(at)gmail(dot)com, dilipbalaut(at)gmail(dot)com, euler(at)eulerto(dot)com, m(dot)melihmutlu(at)gmail(dot)com, andres(at)anarazel(dot)de, marcos(at)f10(dot)com(dot)br, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Time delayed LR (WAS Re: logical replication restrictions)
Date: 2023-01-24 06:57:58
Message-ID: CAA4eK1JTTCcMrsMAmz9LSEE6gvGHnTK0u1w0kg9hV6rcFFB5hQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Jan 24, 2023 at 8:15 AM Kyotaro Horiguchi
<horikyota(dot)ntt(at)gmail(dot)com> wrote:
>
> > Attached the updated patch v19.
>
> + maybe_delay_apply(TransactionId xid, TimestampTz finish_ts)
>
> I look this spelling strange. How about maybe_apply_delay()?
>

+1.

>
> send_feedback():
> + * If the subscriber side apply is delayed (because of time-delayed
> + * replication) then do not tell the publisher that the received latest
> + * LSN is already applied and flushed, otherwise, it leads to the
> + * publisher side making a wrong assumption of logical replication
> + * progress. Instead, we just send a feedback message to avoid a publisher
> + * timeout during the delay.
> */
> - if (!have_pending_txes)
> + if (!have_pending_txes && !in_delayed_apply)
> flushpos = writepos = recvpos;
>
> Honestly I don't like this wart. The reason for this is the function
> assumes recvpos = applypos but we actually call it while holding
> unapplied changes, that is, applypos < recvpos.
>
> Couldn't we maintain an additional static variable "last_applied"
> along with last_received?
>

It won't be easy to maintain the meaning of last_applied because there
are cases where we don't apply the change directly. For example, in
case of streaming xacts, we will just keep writing it to the file,
now, say, due to some reason, we have to send the feedback, then it
will not allow you to update the latest write locations. This would
then become different then what we are doing without the patch.
Another point to think about is that we also need to keep the variable
updated for keep-alive ('k') messages even though we don't apply
anything in that case. Still, other cases to consider are where we
have mix of streaming and non-streaming transactions.

> In this case the condition cited above
> would be as follows and in_delayed_apply will become unnecessary.
>
> + if (!have_pending_txes && last_received == last_applied)
>
> The function is a static function and always called with a variable
> last_received that has the same scope with the function, as the first
> parameter. Thus we can remove the first parameter then let the
> function directly look at the both two varaibles instead.
>

I think this is true without this patch, so why that has not been
followed in the first place? One comment, I see in this regard is as
below:

/* It's legal to not pass a recvpos */
if (recvpos < last_recvpos)
recvpos = last_recvpos;

--
With Regards,
Amit Kapila.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Smith 2023-01-24 07:13:50 Re: Time delayed LR (WAS Re: logical replication restrictions)
Previous Message Michael Paquier 2023-01-24 06:57:56 Re: Generating code for query jumbling through gen_node_support.pl