| From: | vignesh C <vignesh21(at)gmail(dot)com> |
|---|---|
| To: | shveta malik <shveta(dot)malik(at)gmail(dot)com> |
| Cc: | Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Euler Taveira <euler(at)eulerto(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, "Jonathan S(dot) Katz" <jkatz(at)postgresql(dot)org> |
| Subject: | Re: Logical Replication of sequences |
| Date: | 2025-05-21 17:36:54 |
| Message-ID: | CALDaNm0ZjHXE0G_5B9Eo2Dhwxwx=enTgFy=GkFS80MqKxbLs2w@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On Tue, 20 May 2025 at 09:54, shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
>
> On Tue, May 20, 2025 at 8:35 AM Nisha Moond <nisha(dot)moond412(at)gmail(dot)com> wrote:
> >
> > >
> > > Thanks for the comments, these are handled in the attached v20250516
> > > version patch.
> > >
> >
> > Thanks for the patches. Here are my review comments -
> >
> > Patch-0004: src/backend/replication/logical/sequencesync.c
> >
> > The sequence count logic using curr_seq in copy_sequences() seems buggy.
> > Currently, curr_seq is incremented based on the number of tuples
> > received from the publisher inside the inner while loop.
> > This means it's counting the number of sequences returned by the
> > publisher, not the number of sequences processed locally. This can
> > lead to two issues:
> >
> > 1) Repeated syncing of sequences:
> > If some sequences are missing on the publisher, curr_seq will reflect
> > fewer items than expected, and subsequent batches may reprocess
> > already-synced sequences. Because next batch will use curr_seq to get
> > values from the list as -
> >
> > seqinfo = (LogicalRepSequenceInfo *)
> > lfirst(list_nth_cell(remotesequences, curr_seq + i));
> >
> > Example:
> > For 110 sequences(s1 to s110), if 5 (s1 to s5) are missing on the
> > publisher in the first batch, curr_seq = 95. In the next cycle, we
> > resync s95 to s99.
> > ~~~~
> >
> > 2) Risk of sequencesync worker getting stuck in infinite loop
> >
> > Consider a case where remotesequences has 10 sequences (s1–s10) need
> > syncing, and concurrently s9, s10 are deleted on the publisher.
> >
> > Cycle 1:
> > Publisher returns s1–s8. So curr_seq = 8.
> >
> > Cycle 2:
> > Publisher query returns zero rows (as s9, s10 no longer exist).
> > curr_seq stays at 8 and never advances.
> >
> > This causes the while (curr_seq < total_seq) loop to run forever.
> > ~~~~
> >
> > I think curr_seq should be incremented by batch_seq_count just
> > outside the inner while loop.
> >
>
> I faced the similar issue while testing. I think it is due to the
> code-logic issue pointed out by Nisha above.
Yes, it is the same issue, this has been fixed in the v20250521
version posted at [1].
[1] - https://www.postgresql.org/message-id/CALDaNm2ZgyYbowqZJfpkpRV_tev5o-rqpkLDkp496ku15Tdsqw%40mail.gmail.com
Regards,
Vignesh
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Robert Haas | 2025-05-21 17:47:17 | Re: Avoid orphaned objects dependencies, take 3 |
| Previous Message | vignesh C | 2025-05-21 17:34:57 | Re: Logical Replication of sequences |