Re: logical decoding and replication of sequences, take 2

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: John Naylor <john(dot)naylor(at)enterprisedb(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Heikki Linnakangas <heikki(dot)linnakangas(at)iki(dot)fi>
Subject: Re: logical decoding and replication of sequences, take 2
Date: 2023-03-20 12:26:16
Message-ID: CAA4eK1L4T85q0Yx=CDEtXG=Ti+zWHMnq1KgP7YuGTy+FuGRUyA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Mar 20, 2023 at 5:13 PM Tomas Vondra
<tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
>
> On 3/20/23 12:00, Amit Kapila wrote:
> > On Mon, Mar 20, 2023 at 1:49 PM Tomas Vondra
> > <tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
> >>
> >>
> >> I don't understand why we'd need WAL from before the slot is created,
> >> which happens before copy_sequence so the sync will see a more recent
> >> state (reflecting all changes up to the slot LSN).
> >>
> >
> > Imagine the following sequence of events:
> > 1. Operation on a sequence seq-1 which requires WAL. Say, this is done
> > at LSN 1000.
> > 2. Some other random operations on unrelated objects. This would
> > increase LSN to 2000.
> > 3. Create a slot that uses current LSN 2000.
> > 4. Copy sequence seq-1 where you will get the LSN value as 1000. Then
> > you will use LSN 1000 as a starting point to start replication in
> > sequence sync worker.
> >
> > It is quite possible that WAL from LSN 1000 may not be present. Now,
> > it may be possible that we use the slot's LSN in this case but
> > currently, it may not be possible without some changes in the slot
> > machinery. Even, if we somehow solve this, we have the below problem
> > where we can miss some concurrent activity.
> >
>
> I think the question is what would be the WAL-requiring operation at LSN
> 1000. If it's just regular nextval(), then we *will* see it during
> copy_sequence - sequences are not transactional in the MVCC sense.
>
> If it's an ALTER SEQUENCE, I guess it might create a new relfilenode,
> and then we might fail to apply this - that'd be bad.
>
> I wonder if we'd allow actually discarding the WAL while building the
> consistent snapshot, though.
>

No, as soon as we reserve the WAL location, we update the slot's
minLSN (replicationSlotMinLSN) which would prevent the required WAL
from being removed.

> You're however right we can't just decide
> this based on LSN, we'd probably need to compare the relfilenodes too or
> something like that ...
>
> >> I think the only "issue" are the WAL records after the slot LSN, or more
> >> precisely deciding which of the decoded changes to apply.
> >>
> >>
> >>> Now, for the second idea which is to directly use
> >>> pg_current_wal_insert_lsn(), I think we won't be able to ensure that
> >>> the changes covered by in-progress transactions like the one with
> >>> Alter Sequence I have given example would be streamed later after the
> >>> initial copy. Because the LSN returned by pg_current_wal_insert_lsn()
> >>> could be an LSN after the LSN associated with Alter Sequence but
> >>> before the corresponding xact's commit.
> >>
> >> Yeah, I think you're right - the locking itself is not sufficient to
> >> prevent this ordering of operations. copy_sequence would have to lock
> >> the sequence exclusively, which seems bit disruptive.
> >>
> >
> > Right, that doesn't sound like a good idea.
> >
>
> Although, maybe we could use a less strict lock level? I mean, one that
> allows nextval() to continue, but would conflict with ALTER SEQUENCE.
>

I don't know if that is a good idea but are you imagining a special
interface/mechanism just for logical replication because as far as I
can see you have used SELECT to fetch the sequence values?

--
With Regards,
Amit Kapila.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message John Naylor 2023-03-20 12:33:56 Re: [PoC] Improve dead tuple storage for lazy vacuum
Previous Message Hayato Kuroda (Fujitsu) 2023-03-20 12:13:35 Question: Do we have a rule to use "PostgreSQL" and "<productname>PostgreSQL</productname>" separately?