Re: logical decoding and replication of sequences, take 2

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>
Subject: Re: logical decoding and replication of sequences, take 2
Date: 2023-07-26 04:07:39
Message-ID: CAA4eK1KAP0BmfZG4Pcvt6=vcdWT7p5J0-MtJySPhnfj4aDYE-g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Jul 25, 2023 at 5:29 PM Tomas Vondra
<tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
>
> On 7/25/23 08:28, Amit Kapila wrote:
> > On Mon, Jul 24, 2023 at 9:32 PM Tomas Vondra
> > <tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
> >>
> >> On 7/24/23 12:40, Amit Kapila wrote:
> >>> On Wed, Jul 5, 2023 at 8:21 PM Ashutosh Bapat
> >>> <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> wrote:
> >>>
> >>> Even after that, see below the value of the sequence is still not
> >>> caught up. Later, when the apply worker processes all the WAL, the
> >>> sequence state will be caught up.
> >>>
> >>
> >> And how is this different from what tablesync does for tables? For that
> >> 'r' also does not mean it's fully caught up, IIRC. What matters is
> >> whether the sequence since this moment can go back. And I don't think it
> >> can, because that would require replaying changes from before we did
> >> copy_sequence ...
> >>
> >
> > For sequences, it is quite possible that we replay WAL from before the
> > copy_sequence whereas the same is not true for tables (w.r.t
> > copy_table()). This is because for tables we have a kind of interlock
> > w.r.t LSN returned via create_slot (say this value of LSN is LSN1),
> > basically, the walsender corresponding to tablesync worker in
> > publisher won't send any WAL before that LSN whereas the same is not
> > true for sequences. Also, even if apply worker can receive WAL before
> > copy_table, it won't apply that as that would be behind the LSN1 and
> > the same is not true for sequences. So, for tables, we will never go
> > back to a state before the copy_table() but for sequences, we can go
> > back to a state before copy_sequence().
> >
>
> Right. I think the important detail is that during sync we have three
> important LSNs
>
> - LSN1 where the slot is created
> - LSN2 where the copy happens
> - LSN3 where we consider the sync completed
>
> For tables, LSN1 == LSN2, because the data is completed using the
> snapshot from the temporary slot. And (LSN1 <= LSN3).
>
> But for sequences, the copy happens after the slot creation, possibly
> with (LSN1 < LSN2). And because LSN3 comes from the main subscription
> (which may be a bit behind, for whatever reason), it may happen that
>
> (LSN1 < LSN3 < LSN2)
>
> The the sync ends at LSN3, but that means all sequence changes between
> LSN3 and LSN2 will be applied "again" making the sequence go away.
>

Yeah, the problem is something as you explained but an additional
minor point is that for sequences we also do end up applying the WAL
between LSN1 and LSN3 which makes it go backwards. The ideal way is
that sequences on subscribers never go backward in a way that is
visible to users. I will share my thoughts after studying your
proposal in a later email.

--
With Regards,
Amit Kapila.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2023-07-26 04:13:39 Re: Logical walsenders don't process XLOG_CHECKPOINT_SHUTDOWN
Previous Message Thomas Munro 2023-07-26 04:06:31 Re: odd buildfarm failure - "pg_ctl: control file appears to be corrupt"