Re: Logical Replication of sequences

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: vignesh C <vignesh21(at)gmail(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Euler Taveira <euler(at)eulerto(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, "Jonathan S(dot) Katz" <jkatz(at)postgresql(dot)org>
Subject: Re: Logical Replication of sequences
Date: 2025-08-22 04:08:25
Message-ID: CAA4eK1K-aWUz5O_sHJACn0un2c71pOJUJYeKGc-2xOAxbSUTaQ@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Aug 21, 2025 at 10:52 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> On Wed, Aug 20, 2025 at 9:04 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > On Wed, Aug 20, 2025 at 11:00 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > >
> > > On Tue, Aug 19, 2025 at 9:14 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > > >
> > > > If so, I don't think we can do much with the design
> > > > choice we made. During DDL replication of sequences, we need to
> > > > consider it as a conflict.
> > > >
> > > > BTW, note that the same situation can happen even when the user
> > > > manually changed the sequence value on the subscriber in some way. So,
> > > > we can't prevent that.
> > >
> > > Yes, I understand that conflicts can occur when users manually modify
> > > sequence values or parameters on the subscriber. However, in Vignesh's
> > > example, users are only executing the REFRESH command, without
> > > performing any ALTER SEQUENCE commands or setval() operations on the
> > > subscriber. In this scenario, I don't see why conflicts would arise
> > > even with DDL replication in place.
> > >
> >
> > This is because DDL can also fail if the existing sequence data does
> > not adhere to the DDL change. This will be true even for tables, but
> > let's focus on the sequence case. See below part of the example:
> >
> > -- Subscriber
> > ALTER SUBSCRIPTION sub1 REFRESH PUBLICATION SEQUENCES;
> > SELECT currval('s1');
> > currval
> > ---------
> > 14
> >
> > -- Now on the publisher:
> > SELECT setval('s1', 11);
> > ALTER SEQUENCE s1 MAXVALUE 12;
> >
> > When applying the DDL change on the subscriber:
> > ERROR: RESTART value (14) cannot be greater than MAXVALUE (12)
> >
> > Here the user has intentionally reduced the existing value of the
> > sequence to (11) on the publisher after the REFRESH command and then
> > performed a DDL that is compatible with the latest RESTART value (11).
> > Now, because we did REFRESH before the user set the value of sequence
> > as 11, the current value on the subscriber will be 14. When we
> > replicate the DDL, it will find the latest RESTART value as (14)
> > greater than DDL's changed MAXVALUE (12), so it will fail, and the
> > subscriber will retry. Users have to manually perform REFRESH once
> > again, or maybe as part of a conflict resolution strategy, we can do
> > this internally. IIUC, we can't avoid this even if we start writing
> > WAL for the REFRESH command on the publisher.
>
> Right. Since DMLs and DDLs for sequences are replicated and applied to
> the subscriber out of order even if we write WAL for the REFRESH
> command.
>
> On the other hand, there is a scenario where we can cover with the
> idea of writing a WAL for the REFRESH command:
>
> -- Publisher
> CREATE s as integer;
> select setval('s', pow(2,31)::int)
>
> -- Subscriber
> CREATE s as integer;
> ALTER SUBSCRIPTION sub1 REFRESH PUBLICATION SEQUENCES;
> -- the last value of 's' is 1073741824
>
> -- Publisher
> alter sequence s as bigint;
> select setval('s', pow(2,50)::bigint);
>
> -- Subscriber
> ALTER SUBSCRIPTION sub1 REFRESH PUBLICATION SEQUENCES;
> -- sequence synchronization keeps failing due to mismatch sequence
> definition until ALTER SEQUENCE DDL is applied to the subscriber.
>
> I'm not suggesting to change the current approach but I'd just like to
> figure out how sequence replication will work with future DDL
> replication if we implement sequence synchronization as a logical
> replication feature.
>

I think we can have a conflict handler for
sequence_definition_mismatch where either it LOGs such that the user
needs to retry the operation after some time, or let it automatically
wait and retry, or a combination of both. As we are already working on
conflict handling (conflict detection, storage, and resolution), we
will at least have a way to store and let users be aware of such a
conflict, but in the best case, we will have conflict resolution as
well by the time replication of DDL sequence will be in a position to
land. Do you have better ideas?

BTW, do you have any suggestions on the first two design points raised
by me in email [1]?

[1] - https://www.postgresql.org/message-id/CAA4eK1%2BoVQW8oP%3DLo1X8qac6dzg-fgGQ6R_F_psfokUEqe%2Ba6w%40mail.gmail.com

--
With Regards,
Amit Kapila.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2025-08-22 05:09:51 Re: Remove traces of long in dynahash.c
Previous Message jian he 2025-08-22 03:36:45 Re: Add SPLIT PARTITION/MERGE PARTITIONS commands