From: | shveta malik <shveta(dot)malik(at)gmail(dot)com> |
---|---|
To: | Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com> |
Cc: | vignesh C <vignesh21(at)gmail(dot)com>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Euler Taveira <euler(at)eulerto(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, "Jonathan S(dot) Katz" <jkatz(at)postgresql(dot)org>, shveta malik <shveta(dot)malik(at)gmail(dot)com> |
Subject: | Re: Logical Replication of sequences |
Date: | 2025-06-27 03:20:03 |
Message-ID: | CAJpy0uCYDdyNhx6owi5GDUQP9k0bWt9BmVtJRbojVOpGYHT8rA@mail.gmail.com |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Wed, Jun 25, 2025 at 7:42 PM Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com> wrote:
>
> On Sun, 22 Jun 2025 at 08:05, vignesh C <vignesh21(at)gmail(dot)com> wrote:
> >
> > On Thu, 19 Jun 2025 at 11:26, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com> wrote:
> > >
> > > Hi,
> > >
> > > Here are my review comments for v20250610 patches:
> > >
> > > Patch-0005:sequencesync.c
> > >
> > > 1) report_error_sequences()
> > >
> > > In case there are both missing and mismatched sequences, the ERROR
> > > message logged is -
> > >
> > > ```
> > > 2025-05-28 14:22:19.898 IST [392259] ERROR: logical replication
> > > sequence synchronization failed for subscription "subs": sequences
> > > ("public"."n84") are missing on the publisher. Additionally,
> > > parameters differ for the remote and local sequences ("public.n1")
> > > ```
> > >
> > > I feel this error message is quite long. Would it be possible to split
> > > it into ERROR and DETAIL? Also, if feasible, we could consider
> > > including a HINT, as was done in previous versions.
> > >
> > > I explored a few possible ways to log this error with a hint. Attached
> > > top-up patch has the suggestion implemented. Please see if it seems
> > > okay to consider.
> >
> > This looks good, merged it.
> > > ~~~
> > >
> > > 2) copy_sequences():
> > > + /* Retrieve the sequence object fetched from the publisher */
> > > + for (int i = 0; i < batch_size; i++)
> > > + {
> > > + LogicalRepSequenceInfo *sequence_info =
> > > lfirst(list_nth_cell(remotesequences, current_index + i));
> > > +
> > > + if (!strcmp(sequence_info->nspname, nspname) &&
> > > + !strcmp(sequence_info->seqname, seqname))
> > > + seqinfo = sequence_info;
> > > + }
> > >
> > > The current logic performs a search through the local sequence list
> > > for each sequence fetched from the publisher, repeating the traverse
> > > of 100(batch size) length of the list per sequence, which may impact
> > > performance.
> > >
> > > To improve efficiency, we can optimize it by sorting the local list
> > > and traverses it only once for matching. Kindly review the
> > > implementation in the attached top-up patch and consider merging it if
> > > it looks good to you.
> >
> > Looks good, merged it.
> >
> > > ~~~
> > >
> > > 3) copy_sequences():
> > > + if (message_level_is_interesting(DEBUG1))
> > > + ereport(DEBUG1,
> > > + errmsg_internal("logical replication synchronization for
> > > subscription \"%s\", sequence \"%s\" has finished",
> > > + MySubscription->name,
> > > + seqinfo->seqname));
> > > +
> > > + batch_succeeded_count++;
> > > + }
> > >
> > > The current debug log might be a bit confusing when sequences with the
> > > same name exist in different schemas. To improve clarity, we could
> > > include the schema name in the message, e.g.,
> > > " ... sequence "schema"."sequence" has finished".
> >
> > Modified
> >
> > > ~~~~
> > >
> > > Few minor comments in doc - Patch-0006 : logical-replication.sgml
> > >
> > > 4)
> > > + <para>
> > > + To replicate sequences from a publisher to a subscriber, first publish them
> > > + using <link linkend="sql-createpublication-params-for-all-sequences">
> > > + <command>CREATE PUBLICATION ... FOR ALL SEQUENCES</command></link>.
> > > + </para>
> > >
> > > I think it would be better to use "To synchronize" instead of "To
> > > replicate" here, to maintain consistency and avoid confusion between
> > > replication and synchronization.
> >
> > Modified
> >
> > > 5)
> > > + <para>
> > > + Update the sequences at the publisher side few times.
> > > +<programlisting>
> > >
> > > /side few /side a few /
> > >
> > > 6) Can avoid using multiple "or" in the sentences below:
> > >
> > > 6a)
> > > - a change set or replication set. Each publication exists in only
> > > one database.
> > > + generated from a table or a group of tables or the current state of all
> > > + sequences, and might also be described as a change set or replication set
> > >
> > > / table or a group of tables/ table, a group of tables/
> >
> > Modified
> >
> > > 6b)
> > > + Publications may currently only contain tables or sequences. Objects must be
> > > + added explicitly, except when a publication is created using
> > > + <literal>FOR TABLES IN SCHEMA</literal>, or <literal>FOR ALL
> > > TABLES</literal>,
> > > + or <literal>FOR ALL SEQUENCES</literal>. Unlike tables, the current state of
> > >
> > > / IN SCHEMA</literal>, or <literal>FOR ALL TABLES/ IN
> > > SCHEMA</literal>, <literal>FOR ALL TABLES
> >
> > Modified
> >
> > Thanks for the comment, the attached v20250622 version patch has the
> > changes for the same.
> >
> Hi Vignesh,
>
> I tested with all patches applied. I have a comment:
>
> Let consider following case:
> On publisher create a publication pub1 on all sequence. publication
> has sequence s1. The curr value of s1 is 2
> and On subscriber we have subscription on pub1 and sequence s1 has
> value 5. Now we run:
> "ALTER SUBSCRIPTION sub1 REFRESH PUBLICATION SEQUENCES"
>
> Now on subscriber currval still show '5':
> postgres=# select currval('s1');
> currval
> ---------
> 5
> (1 row)
>
> But when we do nextval on s1 on subscriber we get '3'. Which is
> correct assuming sequence is synced:
> postgres=# select nextval('s1');
> nextval
> ---------
> 3
> (1 row)
> postgres=# select currval('s1');
> currval
> ---------
> 3
> (1 row)
>
> Is this behaviour expected? I feel the initial " select
> currval('s1');" should have displayed '2'. Thoughts?
As per docs at [1], currval returns the value most recently obtained
by nextval for this sequence in the current session. So behaviour is
in alignment with docs. I tested this across multiple sessions,
regardless of whether sequence synchronization occurs or not. The
behavior remains consistent across sessions. As an example, if in
session2 the sequence has advanced to a value of 10 by calling
nextval, the currval in session1 still shows the previous value of 3,
which was obtained by nextval in session1. So, it seems expected to
me. But let's wait for Vignesh's comments on this.
[1]: https://www.postgresql.org/docs/current/functions-sequence.html
thanks
Shveta
From | Date | Subject | |
---|---|---|---|
Next Message | Fujii Masao | 2025-06-27 03:22:17 | Re: Extend COPY FROM with HEADER <integer> to skip multiple lines |
Previous Message | Shinya Kato | 2025-06-27 02:55:18 | Re: Extend COPY FROM with HEADER <integer> to skip multiple lines |