| From: | vignesh C <vignesh21(at)gmail(dot)com> |
|---|---|
| To: | Ayush Tiwari <ayushtiwari(dot)slg01(at)gmail(dot)com> |
| Cc: | "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: [PATCH] Fix stale relation close in sequence synchronization |
| Date: | 2026-04-28 13:18:22 |
| Message-ID: | CALDaNm2VSvHVJYTji65qGZCsTHKcDZKNaHcNhaBZywC4yAA=3g@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On Tue, 28 Apr 2026 at 18:04, Ayush Tiwari <ayushtiwari(dot)slg01(at)gmail(dot)com> wrote:
>
> Hi,
>
>
> On Tue, 28 Apr 2026 at 17:44, Hayato Kuroda (Fujitsu) <kuroda(dot)hayato(at)fujitsu(dot)com> wrote:
>>
>> Dear Ayush,
>>
>> > I found a crash in the logical replication sequence synchronization worker
>> > when the publisher returns NULL sequence data for a sequence after at least
>> > one sequence in the same sync batch has already been processed.
>>
>> Good catch. I confirmed the HEAD can crash with your added test.
>
>
> Thanks for checking and confirming the crash.
>
>>
>>
>> > The attached patch clears the output Relation pointer at the start of
>> > get_and_validate_seq_info() and clears the local pointer in copy_sequences()
>> > after closing it. That keeps early returns from reusing a relation from a
>> > previous row.
>>
>> To confirm; can't we declare the sequence_rel in the inner-loop? My first
>> impression was the bug caused by the wrong lifetime. Are there any other
>> thoughts around here?
>
>
> I agree that declaring sequence_rel in the tuple-processing loop is
> cleaner. The relation belongs only to the current publisher result row,
> so limiting the variable's lifetime makes the intended ownership clearer
> and prevents any value from carrying over to the next row.
>
> I have kept the initialization of the output argument in
> get_and_validate_seq_info(), so every return path leaves it in a defined
> state. In v2, the caller-side pointer is declared inside the inner loop,
> and the explicit reset after table_close() is no longer needed.
>
> Attached is v2 with that change.
Thanks for finding and reporting the issue. I was able to reproduce
the issue with the steps you provided.
Few comments:
1) Here "SELECT nextval('regress_no_select');" and "REVOKE ALL ON
SEQUENCE regress_no_select FROM PUBLIC;" is not required for this test
case:
+$node_publisher->safe_psql(
+ 'postgres', qq(
+ CREATE ROLE regress_seq_repl LOGIN REPLICATION;
+ CREATE SEQUENCE regress_no_select;
+ SELECT nextval('regress_no_select');
+ GRANT USAGE ON SCHEMA public TO regress_seq_repl;
+ GRANT SELECT ON ALL SEQUENCES IN SCHEMA public TO regress_seq_repl;
+ REVOKE ALL ON SEQUENCE regress_no_select FROM PUBLIC;
+ REVOKE ALL ON SEQUENCE regress_no_select FROM regress_seq_repl;
+));
2) Since the comment about “dropped concurrently” has been removed,
could you merge that context into the new wording:
/*
- * last_value can be NULL if the sequence was dropped concurrently (see
- * pg_get_sequence_data()).
+ * The sequence data can be NULL if it is not accessible on the publisher
+ * (see pg_get_sequence_data()).
*/
Something like:
/*
* The sequence data can be NULL due to insufficient privileges or if
* the sequence was dropped concurrently (see pg_get_sequence_data()).
*/
3) Can we change this:
##########
# A NULL sequence data row from the publisher must not make the subscriber
# close the previously synchronized sequence relation again.
##########
To something like:
##########
# Ensure that insufficient privileges on the publisher for a sequence
# do not disrupt the subscriber. The subscriber should log a warning
# and continue retrying.
##########
Regards,
Vignesh
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Tom Lane | 2026-04-28 13:29:41 | Re: [BUG?] macOS (Intel) build warnings: "ranlib: file … has no symbols" for aarch64 objects |
| Previous Message | Ayush Tiwari | 2026-04-28 12:34:40 | Re: [PATCH] Fix stale relation close in sequence synchronization |