Re: Logical Replication of sequences

From: vignesh C <vignesh21(at)gmail(dot)com>
To: Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>
Cc: Peter Smith <smithpb2250(at)gmail(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Euler Taveira <euler(at)eulerto(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, "Jonathan S(dot) Katz" <jkatz(at)postgresql(dot)org>
Subject: Re: Logical Replication of sequences
Date: 2025-05-21 17:34:57
Message-ID: CALDaNm2ZgyYbowqZJfpkpRV_tev5o-rqpkLDkp496ku15Tdsqw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, 20 May 2025 at 08:35, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com> wrote:
>
> >
> > Thanks for the comments, these are handled in the attached v20250516
> > version patch.
> >
>
> Thanks for the patches. Here are my review comments -
>
> Patch-0004: src/backend/replication/logical/sequencesync.c
>
> The sequence count logic using curr_seq in copy_sequences() seems buggy.
> Currently, curr_seq is incremented based on the number of tuples
> received from the publisher inside the inner while loop.
> This means it's counting the number of sequences returned by the
> publisher, not the number of sequences processed locally. This can
> lead to two issues:
>
> 1) Repeated syncing of sequences:
> If some sequences are missing on the publisher, curr_seq will reflect
> fewer items than expected, and subsequent batches may reprocess
> already-synced sequences. Because next batch will use curr_seq to get
> values from the list as -
>
> seqinfo = (LogicalRepSequenceInfo *)
> lfirst(list_nth_cell(remotesequences, curr_seq + i));
>
> Example:
> For 110 sequences(s1 to s110), if 5 (s1 to s5) are missing on the
> publisher in the first batch, curr_seq = 95. In the next cycle, we
> resync s95 to s99.
> ~~~~
>
> 2) Risk of sequencesync worker getting stuck in infinite loop
>
> Consider a case where remotesequences has 10 sequences (s1–s10) need
> syncing, and concurrently s9, s10 are deleted on the publisher.
>
> Cycle 1:
> Publisher returns s1–s8. So curr_seq = 8.
>
> Cycle 2:
> Publisher query returns zero rows (as s9, s10 no longer exist).
> curr_seq stays at 8 and never advances.
>
> This causes the while (curr_seq < total_seq) loop to run forever.

These are handled in the attached v20250521 version patch.
Also the issue reported at [1] is handled in the attached patch.

[1] - https://www.postgresql.org/message-id/CAHut%2BPstucunJLQn8C%3DbewmYdoSQStBcEJgG2bkZJUZnTowhFQ%40mail.gmail.com

Regards,
Vignesh

Attachment Content-Type Size
v20250521-0001-Introduce-pg_sequence_state-function-for-e.patch text/x-patch 7.2 KB
v20250521-0005-Documentation-for-sequence-synchronization.patch text/x-patch 27.0 KB
v20250521-0003-Reorganize-tablesync-Code-and-Introduce-sy.patch text/x-patch 23.0 KB
v20250521-0002-Introduce-ALL-SEQUENCES-support-for-Postgr.patch text/x-patch 106.5 KB
v20250521-0004-Enhance-sequence-synchronization-during-su.patch text/x-patch 108.7 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message vignesh C 2025-05-21 17:36:54 Re: Logical Replication of sequences
Previous Message Rustam ALLAKOV 2025-05-21 17:31:38 Re: Allow CI to only run the compiler warnings task