Re: Time delayed LR (WAS Re: logical replication restrictions)

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, "andres(at)anarazel(dot)de" <andres(at)anarazel(dot)de>, "vignesh21(at)gmail(dot)com" <vignesh21(at)gmail(dot)com>, "shveta(dot)malik(at)gmail(dot)com" <shveta(dot)malik(at)gmail(dot)com>, "Takamichi Osumi (Fujitsu)" <osumi(dot)takamichi(at)fujitsu(dot)com>, "dilipbalaut(at)gmail(dot)com" <dilipbalaut(at)gmail(dot)com>, "euler(at)eulerto(dot)com" <euler(at)eulerto(dot)com>, "m(dot)melihmutlu(at)gmail(dot)com" <m(dot)melihmutlu(at)gmail(dot)com>, "marcos(at)f10(dot)com(dot)br" <marcos(at)f10(dot)com(dot)br>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Time delayed LR (WAS Re: logical replication restrictions)
Date: 2023-03-03 15:21:12
Message-ID: CAD21AoDbNPn0v6U4kOYXAgouDWJmunzz8xTohD=k6X5uQeoGmQ@mail.gmail.com
Lists: pgsql-hackers

On Thu, Mar 2, 2023 at 1:07 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Thu, Mar 2, 2023 at 7:38 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >
> > On Wed, Mar 1, 2023 at 6:21 PM Hayato Kuroda (Fujitsu)
> > <kuroda(dot)hayato(at)fujitsu(dot)com> wrote:
> > >
> > > >
> > > > Apart from a bad-use case example I mentioned, in general, piling up
> > > > WAL files due to the replication slot has many bad effects on the
> > > > system. I'm concerned that the side effect of this feature (at least
> > > > of the current design) is too huge compared to the benefit, and afraid
> > > > that users might end up using this feature without understanding the
> > > > side effect well. It might be okay if we thoroughly document it but
> > > > I'm not sure.
> > >
> > > One approach is to forcibly change max_slot_wal_keep_size when min_send_delay
> > > is set. But that may lead to disabling the slot, because WAL needed by the
> > > time-delayed replication may also be removed. We cannot pick just the right value
> > > ourselves because it depends heavily on min_send_delay and the workload.
> > >
> > > How about throwing a WARNING when min_send_delay > 0 but
> > > max_slot_wal_keep_size < 0? Unlike in the previous version, the subscription
> > > parameter min_send_delay will be sent to the publisher. Therefore, we can compare
> > > min_send_delay and max_slot_wal_keep_size when the publisher receives the parameter.
> >
> > Since max_slot_wal_keep_size can be changed by reloading the config
> > file, would each walsender also emit the warning at that time?
> >
>
> I think Kuroda-San wants to emit a WARNING at the time of CREATE
> SUBSCRIPTION. But it won't be possible to emit a WARNING at the time
> of ALTER SUBSCRIPTION. Also, as you say if the user later changes the
> value of max_slot_wal_keep_size, then even if we issue LOG/WARNING in
> walsender, it may go unnoticed. If we really want to give a WARNING for
> this then we can probably give it as soon as the user has set a non-default
> value of min_send_delay to indicate that this can lead to retaining
> WAL on the publisher and they should consider setting
> max_slot_wal_keep_size.
>
> Having said that, I think users can always tune max_slot_wal_keep_size
> and min_send_delay (as none of these requires restart) if they see any
> indication of unexpected WAL size growth. There could be multiple ways
> to check it but I think one can refer to wal_status in
> pg_replication_slots; the 'extended' value can be an indicator of this.
>
> > Not sure it's
> > helpful. I think it's a legitimate use case to set min_send_delay > 0
> > and max_slot_wal_keep_size = -1, and users might not even notice the
> > WARNING message.
> >
>
> I think it would be better to tell about this in the docs along with
> the 'min_send_delay' description. The key point is whether this would
> be an acceptable trade-off for users who want to use this feature. I
> think it can harm only if users use this without understanding the
> corresponding trade-off. As we kept the default at no delay, users who
> enable this are expected to understand the trade-off.

I imagine that a typical use case would be to set min_send_delay to
several hours or even days. I'm concerned that many users would not
find it an acceptable trade-off that the system cannot collect any
garbage for that entire period.
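
To give a rough sense of scale, here is a back-of-the-envelope estimate
of how much WAL the slot would hold back on the publisher. It is only a
sketch: the 5 MiB/s WAL generation rate is an assumed figure for
illustration, not a measurement, and the function name is made up.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3


def retained_wal_bytes(wal_rate_bytes_per_sec, delay_seconds):
    """Lower bound on WAL the slot pins: everything generated during the delay."""
    return wal_rate_bytes_per_sec * delay_seconds


# Assumed workload: 5 MiB of WAL per second (hypothetical figure).
assumed_rate = 5 * MIB

for hours in (1, 12, 24):
    retained = retained_wal_bytes(assumed_rate, hours * 3600)
    print(f"min_send_delay = {hours:2d}h -> at least {retained / GIB:.1f} GiB of WAL retained")
```

The estimate grows linearly with the delay, so a setting of days rather
than hours multiplies the retained WAL accordingly.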

I think we could have the apply process write the decoded changes
somewhere on disk (as non-temporary files) and return the flush LSN,
so that the apply worker can apply them later and the publisher can
advance the slot's LSN. The feature would be more complex, but from
the user's perspective it would be better.
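
To illustrate the idea, here is a minimal sketch in Python (not
PostgreSQL code). The class name, the JSON-lines file format, and the
method names are all hypothetical; it only shows the shape of the
approach: persist each change durably, report its LSN as flushed right
away, and apply it only once the delay has elapsed.

```python
import json
import os
import time


class DelayedChangeSpool:
    """Hypothetical subscriber-side spool of decoded changes."""

    def __init__(self, path, min_delay_seconds):
        self.path = path
        self.min_delay_seconds = min_delay_seconds
        # Position of the first unapplied record. A real worker would
        # have to persist this as well to survive a restart.
        self.read_offset = 0

    def spool(self, lsn, commit_time, change):
        """Durably append one change and return the LSN to report as flushed."""
        record = {"lsn": lsn, "commit_time": commit_time, "change": change}
        with open(self.path, "ab") as f:
            f.write(json.dumps(record).encode("utf-8") + b"\n")
            f.flush()
            os.fsync(f.fileno())  # on disk before we acknowledge
        return lsn

    def apply_due(self, apply_fn, now=None):
        """Apply, in order, every spooled change whose delay has elapsed."""
        now = time.time() if now is None else now
        applied = 0
        if not os.path.exists(self.path):
            return applied
        with open(self.path, "rb") as f:
            f.seek(self.read_offset)
            while True:
                line = f.readline()
                if not line:
                    break
                record = json.loads(line)
                if record["commit_time"] + self.min_delay_seconds > now:
                    break  # later records are newer, so stop here
                apply_fn(record["change"])
                self.read_offset = f.tell()
                applied += 1
        return applied
```

Because spool() returns as soon as the change is on local disk, the
publisher could advance the slot without waiting for min_send_delay,
while apply_due() enforces the delay on the subscriber side.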

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
