Re: Logical Replication of sequences

From: vignesh C <vignesh21(at)gmail(dot)com>
To: shveta malik <shveta(dot)malik(at)gmail(dot)com>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Euler Taveira <euler(at)eulerto(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, "Jonathan S(dot) Katz" <jkatz(at)postgresql(dot)org>
Subject: Re: Logical Replication of sequences
Date: 2025-08-21 16:38:42
Message-ID: CALDaNm0KvDnXENHy6F7=HXyprSpgTKN3DPdDLZ+fzk+=0yOw-Q@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, 21 Aug 2025 at 11:49, shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
>
> On Wed, Aug 20, 2025 at 2:25 PM vignesh C <vignesh21(at)gmail(dot)com> wrote:
> >
> > On Tue, 19 Aug 2025 at 23:33, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > >
> > > I imagined something like case 2. For logical replication of tables,
> > > if we support DDL replication (i.e., CREATE/ALTER/DROP TABLE), all
> > > changes the apply worker executes are serialized in commit LSN order.
> > > Therefore, users would not have to be concerned about schema changes
> > > that happened to the publisher. On the other hand, for sequence
> > > replication, even if we support DDL replication for sequences (i.e.,
> > > CREATE/ALTER/DROP SEQUENCES), users would have to execute REFRESH
> > > PUBLICATION SEQUENCES command after "ALTER SEQUENCE s1 MAXVALUE 12;"
> > > has been replicated on the subscriber. Otherwise, REFRESH PUBLICATION
> > > SEQUENCE command would fail because the sequence parameters no longer
> > > match.
> >
> > I am summarizing the challenges identified so far (assuming we have
> > DDL replication implemented through WAL support)
> > 1) Lack of sequence-synchronization resulting in DDL replication
> > failure/conflict.
> > On the subscriber, the sequence has advanced to 14:
> > SELECT currval('s1');
> > currval
> > ---------
> > 14
> >
> > On the publisher, the sequence is reset to 11 and MAXVALUE is changed to 12:
> > SELECT setval('s1', 11);
> > ALTER SEQUENCE s1 MAXVALUE 12;
> > If the subscriber did not execute REFRESH PUBLICATION SEQUENCES, DDL
> > replication will fail with error.
> > ERROR: RESTART value (14) cannot be greater than MAXVALUE (12)
> >
> > 2) Manual DDL on subscriber resulting in sequence synchronization failure.
> > On the subscriber, the sequence maxvalue is changed:
> > ALTER SEQUENCE s1 MAXVALUE 12;
> >
> > On the publisher, the sequence has advanced to 14:
> > SELECT currval('s1');
> > currval
> > ---------
> > 14
> >
> > REFRESH PUBLICATION SEQUENCES will fail because setting currvalue to
> > 14 is greater than the changed maxvalue 12 in the subscriber.
> >
> > 3) Out of order DDL and REFRESH resulting in synchronization failure.
> > Initially we have the same sequence on pub and sub. Then lets say pub
> > has done parameter change:
> > ALTER SEQUENCE s1 MAXVALUE 12;
> > Now if this DDL is somehow not replicated on sub, REFRESH PUBLICATION
> > SEQUENCES will fail initially and may work once DDL is replicated.
> > ~~
> > Problems 1 and 2 exist in both designs. While the WAL-based REFRESH
> > may seem slightly better for Problem 3 since REFRESH on the subscriber
> > will execute only after prior DDLs are replicated—even with the
> > sequence-sync worker, this isn't a major issue. If a user triggers
> > REFRESH before the DDL is replicated, the worker will refresh all
> > sequences except the mismatched one, and keep restarting and retrying
> > until the DDL is applied. Once that happens, the sequence sync
> > completes automatically, without the user doing another REFRESH.
> > Furthermore, the likelihood of a user executing REFRESH exactly during
> > the window between the DDL execution on the publisher and its
> > application on the subscriber seems relatively low.
> >
> > WAL-based approach OTOH introduces several additional challenges that
> > may outweigh its potential benefits:
> > 1) Increases load on WAL sender to collect sequence values. We are
> > talking about all the sequences here which could be huge in number.
> > 2) Table replication may stall until sequence conflicts are resolved.
> > The chances of hitting any conflict/error could be more here as
> > compared to tables specially when sequence synchronization is not
> > incremental and the number of sequences are huge. The continuous and
> > more frequent errors if not handled by users may even end up
> > invalidating the slot on primary.
> >
> > The worker approach neither blocks the apply worker in case of errors
> > nor adds extra load on the WAL sender. On its own, Case 3 doesn’t seem
> > significant enough to justify switching to a WAL-based design.
> > Overall, the worker-based approach appears to be less complex and a
> > better option.
> >
>
> Agree on this. Please find a few comments on the previous patch:
>
>
> 1)
> + Returns information about the sequence. <literal>last_value</literal>
> + last sequence value set in sequence by nextval or setval,
>
> <literal>last_value</literal> indicates ....

Modified

> 2)
> + * If 'resync_all_sequences' is true:
> + * Perform the above operation only for sequences.
>
> Shall we update:
> Perform the above operation only for sequences and resync all the
> sequences including existing ones.
>
> <old comment, I think somehow missed to be addressed.>

This code is removed now

> 3)
> + check_and_launch_sync_worker(nsyncworkers, InvalidOid,
> + &MyLogicalRepWorker->last_seqsync_start_time);
>
> Shall we simply name it as 'launch_sync_worker'.
> 'check' looks a little odd. All such functions (ex:
> logicalrep_worker_launch) has internal checks but the name need not to
> have 'check' keyword

Modified

> 4)
> + * Attempt to launch a sync worker (sequence or table) if there is a worker
> + * available and the retry interval has elapsed.
>
> shall we say:
> 'if there is a sync worker slot available' instead of 'if there is a
> worker available'

Modified

> 5)
> copy_sequences:
> + if (!sequence_rel || !HeapTupleIsValid(tup))
> + {
> + elog(LOG, "skip synchronization of sequence \"%s.%s\" because it has
> been dropped concurrently",
> + seqinfo->nspname, seqinfo->seqname);
> +
> + batch_skipped_count++;
> + continue;
> + }
>
> Is it possible that sequence_rel is valid while tuple is not? If
> possible, then do we need table_close before continuing?

I felt it is not possible as the lock on sequence has been acquired
successfully.

> 6)
> In copy_sequences() wherever we are using seqinfo->nspname,
> seqinfo->seqname; shall we directly use local vars nspname, seqname.

Modified

> 7)
> LogicalRepSyncSequences:
> + /* Skip if sequence was dropped concurrently */
> + sequence_rel = try_table_open(subrel->srrelid, RowExclusiveLock);
> + if (!sequence_rel)
> + continue;
>
> Here we are not checking tuple-validity like we did in copy_sequences
> (comment 5 above). I think this alone should suffice even in
> copy_sequences(). What do you think?

In this case we just need the sequence and schema name which is
available in sequence_rel whereas in case of copy_sequences we need
other sequence parameters like min, max, cycle etc which requires the
tuple. I felt the existing code is ok.

I have also addressed all the comments from [1] in the attached
v20250823 version patch.
[1] - https://www.postgresql.org/message-id/CAA4eK1%2BoVQW8oP%3DLo1X8qac6dzg-fgGQ6R_F_psfokUEqe%2Ba6w%40mail.gmail.com

Regards,
Vignesh

Attachment Content-Type Size
v20250823-0004-Update-ALTER-SUBSCRIPTION-REFRESH-to-ALTER.patch text/x-patch 8.9 KB
v20250823-0001-Enhance-pg_get_sequence_data-function.patch text/x-patch 7.4 KB
v20250823-0002-Introduce-ALL-SEQUENCES-support-for-Postgr.patch text/x-patch 110.7 KB
v20250823-0003-Reorganize-tablesync-Code-and-Introduce-sy.patch text/x-patch 24.6 KB
v20250823-0005-Introduce-REFRESH-PUBLICATION-SEQUENCES-fo.patch text/x-patch 34.7 KB
v20250823-0007-Documentation-for-sequence-synchronization.patch text/x-patch 33.9 KB
v20250823-0006-New-worker-for-sequence-synchronization-du.patch text/x-patch 84.7 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Corey Huinker 2025-08-21 16:52:17 Re: vacuumdb --missing-stats-only and permission issue
Previous Message Ranier Vilela 2025-08-21 16:17:56 Re: Weird error message from Postgres 18