Re: [bug fix] prepared transaction might be lost when max_prepared_transactions is zero on the subscriber

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>
Cc: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, shveta malik <shveta(dot)malik(at)gmail(dot)com>
Subject: Re: [bug fix] prepared transaction might be lost when max_prepared_transactions is zero on the subscriber
Date: 2025-12-23 04:21:15
Message-ID: CAA4eK1+7QbQz1ZPQiZ4oPNZvTeqpNefK25k7vcOeVBLUXN1dNA@mail.gmail.com
Lists: pgsql-hackers

On Mon, Dec 22, 2025 at 2:31 PM Zhijie Hou (Fujitsu)
<houzj(dot)fnst(at)fujitsu(dot)com> wrote:
>
> When reviewing some parallel apply related codes, I noticed a bug in the
> parallel apply worker, similar to the issue discussed in this thread.
>
> The issue is that the logical replication parallel apply worker may erroneously
> advance the origin progress during an error or unsuccessful apply. This can lead
> to transaction loss, as these transactions will not be resent by the server.
> Commit 3f28b2fc addressed a similar issue in both the apply worker and table
> sync worker.
>
> The original fix involved registering a before_shmem_exit callback to reset the
> origin information, preventing the worker from advancing it during transaction
> abortion on shutdown. The attached patch registers the same callback for the
> parallel apply worker, ensuring consistent behavior across all workers.
>

Thanks for reporting the issue and providing the patch.

+# Test the ability to re-apply a transaction when a parallel apply worker fails
+# to prepare the transaction due to insufficient max_prepared_transactions
+# setting.
+$node_subscriber->append_conf('postgresql.conf',

How does the test ensure that the error is raised by the parallel apply
worker? I see that in the previous test we set
'debug_logical_replication_streaming = immediate', which should help
invoke the parallel apply worker. But is there a more direct way to
confirm this? Could the test check the log for a message like: "ERROR:
logical replication parallel apply worker exited due to error"?

--
With Regards,
Amit Kapila.
