Re: Time delayed LR (WAS Re: logical replication restrictions)

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Önder Kalacı <onderkalaci(at)gmail(dot)com>, "Yu Shi (Fujitsu)" <shiy(dot)fnst(at)fujitsu(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, "andres(at)anarazel(dot)de" <andres(at)anarazel(dot)de>, "vignesh21(at)gmail(dot)com" <vignesh21(at)gmail(dot)com>, "shveta(dot)malik(at)gmail(dot)com" <shveta(dot)malik(at)gmail(dot)com>, "Takamichi Osumi (Fujitsu)" <osumi(dot)takamichi(at)fujitsu(dot)com>, "dilipbalaut(at)gmail(dot)com" <dilipbalaut(at)gmail(dot)com>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, "euler(at)eulerto(dot)com" <euler(at)eulerto(dot)com>, "m(dot)melihmutlu(at)gmail(dot)com" <m(dot)melihmutlu(at)gmail(dot)com>, "marcos(at)f10(dot)com(dot)br" <marcos(at)f10(dot)com(dot)br>
Subject: Re: Time delayed LR (WAS Re: logical replication restrictions)
Date: 2023-05-12 02:07:38
Message-ID: CAD21AoDwD0upmpt=qZhQ0K9p8dHmVwunSkKh2dcg4H6o+_Ugzg@mail.gmail.com
Lists: pgsql-hackers

On Thu, May 11, 2023 at 2:04 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Fri, Apr 28, 2023 at 2:35 PM Hayato Kuroda (Fujitsu)
> <kuroda(dot)hayato(at)fujitsu(dot)com> wrote:
> >
> > Dear hackers,
> >
> > I rebased and refined my PoC. Followings are the changes:
> >
>
> 1. Is my understanding correct that this patch creates the delay files
> for each transaction? If so, did you consider other approaches such as
> using one file to avoid creating many files?
> 2. For streaming transactions, first the changes are written in the
> temp file and then moved to the delay file. It seems like there is a
> double work. Is it possible to unify it such that when min_apply_delay
> is specified, we just use the delay file without sacrificing the
> advantages like stream sub-abort can truncate the changes?
> 3. Ideally, there shouldn't be a performance impact of this feature on
> regular transactions because the delay file is created only when
> min_apply_delay is active but better to do some testing of the same.
>

In addition to the points Amit raised, if the 'required_schema' option
is specified in START_REPLICATION, the publisher sends schema
information for every change. I think this leads to significant
overhead. Did you consider alternative approaches, such as sending the
schema information once per transaction, or having the subscriber
request it from the publisher?
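
To illustrate the overhead concern, here is a rough sketch in Python (not
PostgreSQL code). It only counts how many schema messages would be sent
under each policy. The Change type and schema_messages() are hypothetical
names made up for this example, and it assumes "per transaction" means
sending the schema the first time a relation appears in a transaction:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    xid: int        # transaction the change belongs to
    relation: str   # name of the table being changed
    payload: bytes  # encoded row data (unused here; kept for realism)


def schema_messages(changes, per_transaction):
    """Count the schema messages sent for a list of changes.

    per_transaction=False: one schema message accompanies every change.
    per_transaction=True: one schema message is sent the first time a
    relation appears within a given transaction.
    """
    if not per_transaction:
        return len(changes)
    seen = set()
    count = 0
    for change in changes:
        key = (change.xid, change.relation)
        if key not in seen:
            seen.add(key)
            count += 1
    return count
```

With the first policy the count grows with the number of changes; with
the second it grows only with the number of distinct (transaction,
relation) pairs, which is much smaller for a bulk update of one table.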

> Overall, I think such an approach can address comments by Sawada-San
> [1] but not sure if Sawada-San or others have any better ideas to
> achieve this feature. It would be good to see what others think of
> this approach.
>

I agree with this approach.

When it comes to the idea of writing logical changes to permanent
files, I think it would also be a good idea (and perhaps could be a
building block of this feature) to write streamed changes to a
permanent file so that the apply worker can retry applying them
without retrieving the same changes again from the publisher.
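
As a rough sketch of what I mean (Python, not PostgreSQL code; the
length-prefixed file layout and the names spool_changes(),
read_spooled() and apply_with_retry() are assumptions for illustration
only), the worker spools the changes to a durable file once and then
retries from that file:

```python
import os
import struct


def spool_changes(path, changes):
    """Append changes to a file as length-prefixed records and fsync."""
    with open(path, "ab") as f:
        for change in changes:
            f.write(struct.pack("<I", len(change)))
            f.write(change)
        f.flush()
        os.fsync(f.fileno())  # make the data survive a crash


def read_spooled(path):
    """Read back all complete records from the spool file."""
    records = []
    with open(path, "rb") as f:
        while True:
            header = f.read(4)
            if len(header) < 4:
                break  # end of file
            (length,) = struct.unpack("<I", header)
            records.append(f.read(length))
    return records


def apply_with_retry(path, apply_fn, max_attempts=3):
    """Apply spooled changes, re-reading the local file on failure.

    Returns the attempt number that succeeded. This assumes a failed
    attempt is rolled back as a whole, so replaying from the start is
    safe; nothing is fetched from the publisher again.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            for change in read_spooled(path):
                apply_fn(change)
            return attempt
        except RuntimeError:
            continue
    raise RuntimeError("giving up after %d attempts" % max_attempts)
```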

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
