RE: [bug fix] prepared transaction might be lost when max_prepared_transactions is zero on the subscriber

From:	"Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>
To:	Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc:	"Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, shveta malik <shveta(dot)malik(at)gmail(dot)com>
Subject:	RE: [bug fix] prepared transaction might be lost when max_prepared_transactions is zero on the subscriber
Date:	2025-12-23 06:42:11
Message-ID:	TY4PR01MB1690740E94C11F6EA9072EDDD94B5A@TY4PR01MB16907.jpnprd01.prod.outlook.com
Lists:	pgsql-hackers

On Tuesday, December 23, 2025 12:21 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Mon, Dec 22, 2025 at 2:31 PM Zhijie Hou (Fujitsu)
> <houzj(dot)fnst(at)fujitsu(dot)com> wrote:
> >
> > When reviewing some parallel apply related codes, I noticed a bug in
> > the parallel apply worker, similar to the issue discussed in this thread.
> >
> > The issue is that the logical replication parallel apply worker may
> > erroneously advance the origin progress during an error or
> > unsuccessful apply. This can lead to transaction loss, as these
> > transactions will not be resent by the server.
> > Commit 3f28b2fc addressed a similar issue in both the apply worker and
> > table sync worker.
> >
> > The original fix involved registering a before_shmem_exit callback to
> > reset the origin information, preventing the worker from advancing it
> > during transaction abortion on shutdown. The attached patch registers
> > the same callback for the parallel apply worker, ensuring consistent
> > behavior across all workers.
> >
>
> Thanks for reporting the issue and patch.
>
> +# Test the ability to re-apply a transaction when a parallel apply worker
> +# fails to prepare the transaction due to insufficient
> +# max_prepared_transactions setting.
> +$node_subscriber->append_conf('postgresql.conf',
>
> How does the test ensure that error is raised by parallel apply worker? I see
> that in the previous test, we set 'debug_logical_replication_streaming =
> immediate', so that should help to invoke parallel apply worker. But is there a
> more direct way to ensure the same? Can we test for LOG like: "ERROR:
> logical replication parallel apply worker exited due to error"?

OK, I have added a general log test for "ERROR .. logical replication parallel
apply worker ..." to ensure that it's the parallel apply worker that failed to
apply the transaction.
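
As a rough illustration of the kind of log check described above (this is not
the code in the attached patches, which are TAP tests written in Perl), the
idea can be sketched in Python. The function name, the exact regex, and the
sample log lines below are all assumptions made for the example; only the
"ERROR .. logical replication parallel apply worker" shape comes from this
message.

```python
import re

# Assumed pattern: an ERROR line that mentions the parallel apply worker.
# The exact regex used in the patch is not shown in this message.
PA_WORKER_ERROR = re.compile(
    r"ERROR:.*logical replication parallel apply worker"
)


def parallel_apply_worker_failed(log_text: str, offset: int = 0) -> bool:
    """Return True if a parallel apply worker ERROR appears at or after offset.

    `offset` lets the caller ignore log output written before the test step
    started, so that an error from an earlier test does not count.
    """
    return PA_WORKER_ERROR.search(log_text, offset) is not None


# Made-up subscriber log content, for illustration only.
sample_log = (
    "LOG:  logical replication apply worker for subscription started\n"
    "ERROR:  logical replication parallel apply worker exited due to error\n"
)
```

Calling parallel_apply_worker_failed(sample_log) on the made-up log returns
True, while passing an offset at the end of the log returns False, which is
the behaviour wanted when only errors raised after a given point should count.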

I also addressed the comments from Li Chao.

Here are the updated patches for all branches.

Best Regards,
Hou zj

Attachment	Content-Type	Size
v2_PG16-0001-Fix-unexpected-origin-advancement-during-par.patch	application/octet-stream	5.0 KB
v2_PG17_PG18-0001-Fix-unexpected-origin-advancement-durin.patch	application/octet-stream	5.0 KB
v2_HEAD-0001-Fix-unexpected-origin-advancement-during-par.patch	application/octet-stream	5.0 KB

In response to

Re: [bug fix] prepared transaction might be lost when max_prepared_transactions is zero on the subscriber at 2025-12-23 04:21:15 from Amit Kapila
