Re: Logical Replication of sequences

From: Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com>
To: vignesh C <vignesh21(at)gmail(dot)com>
Cc: Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Euler Taveira <euler(at)eulerto(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, "Jonathan S(dot) Katz" <jkatz(at)postgresql(dot)org>
Subject: Re: Logical Replication of sequences
Date: 2025-06-25 09:40:04
Message-ID: CANhcyEWKhHWFzpdAF6czbwq76NRDNCecDqQNtN6Bomn26mqHFw@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, 22 Jun 2025 at 08:05, vignesh C <vignesh21(at)gmail(dot)com> wrote:
>
> On Thu, 19 Jun 2025 at 11:26, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com> wrote:
> >
> > Hi,
> >
> > Here are my review comments for v20250610 patches:
> >
> > Patch-0005:sequencesync.c
> >
> > 1) report_error_sequences()
> >
> > In case there are both missing and mismatched sequences, the ERROR
> > message logged is -
> >
> > ```
> > 2025-05-28 14:22:19.898 IST [392259] ERROR: logical replication
> > sequence synchronization failed for subscription "subs": sequences
> > ("public"."n84") are missing on the publisher. Additionally,
> > parameters differ for the remote and local sequences ("public.n1")
> > ```
> >
> > I feel this error message is quite long. Would it be possible to split
> > it into ERROR and DETAIL? Also, if feasible, we could consider
> > including a HINT, as was done in previous versions.
> >
> > I explored a few possible ways to log this error with a hint. Attached
> > top-up patch has the suggestion implemented. Please see if it seems
> > okay to consider.
>
> This looks good, merged it.
> > ~~~
> >
> > 2) copy_sequences():
> > + /* Retrieve the sequence object fetched from the publisher */
> > + for (int i = 0; i < batch_size; i++)
> > + {
> > + LogicalRepSequenceInfo *sequence_info =
> > lfirst(list_nth_cell(remotesequences, current_index + i));
> > +
> > + if (!strcmp(sequence_info->nspname, nspname) &&
> > + !strcmp(sequence_info->seqname, seqname))
> > + seqinfo = sequence_info;
> > + }
> >
> > The current logic performs a search through the local sequence list
> > for each sequence fetched from the publisher, repeating the traverse
> > of 100(batch size) length of the list per sequence, which may impact
> > performance.
> >
> > To improve efficiency, we can optimize it by sorting the local list
> > and traverses it only once for matching. Kindly review the
> > implementation in the attached top-up patch and consider merging it if
> > it looks good to you.
>
> Looks good, merged it.
>
> > ~~~
> >
> > 3) copy_sequences():
> > + if (message_level_is_interesting(DEBUG1))
> > + ereport(DEBUG1,
> > + errmsg_internal("logical replication synchronization for
> > subscription \"%s\", sequence \"%s\" has finished",
> > + MySubscription->name,
> > + seqinfo->seqname));
> > +
> > + batch_succeeded_count++;
> > + }
> >
> > The current debug log might be a bit confusing when sequences with the
> > same name exist in different schemas. To improve clarity, we could
> > include the schema name in the message, e.g.,
> > " ... sequence "schema"."sequence" has finished".
>
> Modified
>
> > ~~~~
> >
> > Few minor comments in doc - Patch-0006 : logical-replication.sgml
> >
> > 4)
> > + <para>
> > + To replicate sequences from a publisher to a subscriber, first publish them
> > + using <link linkend="sql-createpublication-params-for-all-sequences">
> > + <command>CREATE PUBLICATION ... FOR ALL SEQUENCES</command></link>.
> > + </para>
> >
> > I think it would be better to use "To synchronize" instead of "To
> > replicate" here, to maintain consistency and avoid confusion between
> > replication and synchronization.
>
> Modified
>
> > 5)
> > + <para>
> > + Update the sequences at the publisher side few times.
> > +<programlisting>
> >
> > /side few /side a few /
> >
> > 6) Can avoid using multiple "or" in the sentences below:
> >
> > 6a)
> > - a change set or replication set. Each publication exists in only
> > one database.
> > + generated from a table or a group of tables or the current state of all
> > + sequences, and might also be described as a change set or replication set
> >
> > / table or a group of tables/ table, a group of tables/
>
> Modified
>
> > 6b)
> > + Publications may currently only contain tables or sequences. Objects must be
> > + added explicitly, except when a publication is created using
> > + <literal>FOR TABLES IN SCHEMA</literal>, or <literal>FOR ALL
> > TABLES</literal>,
> > + or <literal>FOR ALL SEQUENCES</literal>. Unlike tables, the current state of
> >
> > / IN SCHEMA</literal>, or <literal>FOR ALL TABLES/ IN
> > SCHEMA</literal>, <literal>FOR ALL TABLES
>
> Modified
>
> Thanks for the comment, the attached v20250622 version patch has the
> changes for the same.
>
Hi Vignesh,

I have reviewed the 0004 patch. Here are my comments:

1. I think we need to update the below comment for function
AlterSubscription_refresh. I feel replacing 'tables' with 'relations'
would be sufficient.
/*
* Build qsorted array of local table oids for faster lookup. This can
* potentially contain all tables in the database so speed of lookup
* is important.
*/

2. Similarly as above comment.
/*
* Walk over the remote tables and try to match them to locally known
* tables. If the table is not known locally create a new state for
* it.
*
* Also builds array of local oids of remote tables for the next step.
*/

3. Similarly as above comment.
/*
* Next remove state for tables we should not care about anymore using
* the data we collected above
*/
Similarly for above comment.

4. Since we are not adding sequences in the list 'sub_remove_rels',
should we only palloc for (the count of no. of tables)? Is it worth
the effort?
/*
* Rels that we want to remove from subscription and drop any slots
* and origins corresponding to them.
*/
sub_remove_rels = palloc(subrel_count * sizeof(SubRemoveRels));

Thanks and Regards,
Shlok Kyal

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Fujii Masao 2025-06-25 09:57:03 Re: pg_dump misses comments on NOT NULL constraints
Previous Message Alexander Pyhalov 2025-06-25 09:37:51 Re: SCRAM pass-through authentication for postgres_fdw