From: | Nisha Moond <nisha(dot)moond412(at)gmail(dot)com> |
---|---|
To: | "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com> |
Cc: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Fix slot synchronization with two_phase decoding enabled |
Date: | 2025-05-21 04:48:03 |
Message-ID: | CABdArM6XdTMjPXq0d6GWNHz9KHTB+RaVx=aJU-9_TaqVTND4Pg@mail.gmail.com |
Lists: | pgsql-hackers |
On Tue, May 6, 2025 at 4:52 PM Zhijie Hou (Fujitsu)
<houzj(dot)fnst(at)fujitsu(dot)com> wrote:
>
> On Mon, May 5, 2025 at 6:59 PM Amit Kapila wrote:
> >
> > On Sun, May 4, 2025 at 2:33 PM Masahiko Sawada
> > <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > >
> > > While I cannot be entirely certain of my analysis, I believe the root
> > > cause might be related to the backward movement of the confirmed_flush
> > > LSN. The following scenario seems possible:
> > >
> > > 1. The walsender enables the two_phase and sets two_phase_at (which
> > > should be the same as confirmed_flush).
> > > 2. The slot's confirmed_flush regresses for some reason.
> > > 3. The slotsync worker retrieves the remote slot information and
> > > enables two_phase for the local slot.
> > >
> >
> > Yes, this is possible. Here is my theory as to how it can happen in the current
> > case. In the failed test, after the primary has prepared a transaction, the
> > transaction won't be replicated to the subscriber as two_phase was not
> > enabled for the slot. However, subsequent keepalive messages can send the
> > latest WAL location to the subscriber and get the confirmation of the same from
> > the subscriber without its origin being moved. Now, after we restart the apply
> > worker (due to disable/enable for a subscription), it will use the previous
> > origin_lsn to temporarily move back the confirmed flush LSN as explained in
> > one of the previous emails in another thread [1]. During this temporary
> > movement of confirm flush LSN, the slotsync worker fetches the two_phase_at
> > and confirm_flush_lsn values, leading to the assertion failure. We see this
> > issue intermittently because it depends on the timing of slotsync worker's
> > request to fetch the slot's value.
>
> Based on this theory, I can reproduce the BF failure in the 040 tap-test on
> HEAD after applying the 0001 patch. This is achieved by using the injection
> point to stop the walsender from sending a keepalive before receiving the old
> origin position from the apply worker, ensuring the confirmed_flush
> consistently moves backward before slotsync.
>
> Additionally, I've reproduced the duplicate data issue on HEAD without slotsync
> using the attached script (after applying the injection point patch). This
> issue arises if we immediately disable the subscription after the
> confirm_flush_lsn moves backward, preventing the walsender from advancing the
> confirm_flush_lsn.
>
> In this case, if a prepared transaction exists before two_phase_at, then after
> re-enabling the subscription, it will replicate that prepared transaction when
> decoding the PREPARE record and replicate that again when decoding the COMMIT
> PREPARED record. In such cases, the apply worker keeps reporting the error:
>
> ERROR: transaction identifier "pg_gid_16387_755" is already in use.
>
> Apart from above, we're investigating whether the same issue can occur in
> back-branches and will share the results once ready.
>
The issue was confirmed to occur on back branches as well, due to
confirmed_flush_lsn moving backward. It has now been fixed on HEAD and
all supported back-branches down to PG13.
For details, refer to the separate thread [1]; the fix was committed
(commit: ad5eaf3)[2].
The BF failure has not recurred since the fix, but we’ll continue to
keep an eye on it.
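
For readers following the thread, here is a rough toy model of the
sequence described above. It is only a sketch and is not PostgreSQL code:
the class ToySlot, the two confirm_* methods, and the LSN values 100 and
200 are all made up for illustration. It assumes that two_phase_at is set
to the current confirmed_flush when two_phase is enabled, and that the
apply worker reports its old origin position after a restart. The guarded
variant simply ignores backward reports, which is one plausible way to
keep confirmed_flush from retreating; see [1] and [2] for the actual change.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToySlot:
    """Toy stand-in for a logical slot; field names mirror the thread."""
    confirmed_flush: int = 0
    two_phase_at: Optional[int] = None

    def confirm_unguarded(self, lsn: int) -> None:
        # Accept whatever position the subscriber reports, even an older one.
        self.confirmed_flush = lsn

    def confirm_guarded(self, lsn: int) -> None:
        # Ignore any report that would move confirmed_flush backward.
        if lsn > self.confirmed_flush:
            self.confirmed_flush = lsn


def run(confirm_name: str) -> ToySlot:
    """Replay the scenario using the named confirm method."""
    slot = ToySlot()
    confirm = getattr(slot, confirm_name)
    confirm(100)  # origin position after normal apply
    confirm(200)  # keepalive confirmed; subscriber origin stays at 100
    slot.two_phase_at = slot.confirmed_flush  # two_phase enabled at 200
    confirm(100)  # apply worker restarts and reports the old origin_lsn
    return slot


unguarded = run("confirm_unguarded")
guarded = run("confirm_guarded")
print(unguarded.confirmed_flush, unguarded.two_phase_at)
print(guarded.confirmed_flush, guarded.two_phase_at)
```

In the unguarded run, confirmed_flush ends up behind two_phase_at, which is
the window in which a slotsync fetch would see inconsistent values. In the
guarded run the two stay equal.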
[1] https://www.postgresql.org/message-id/CAJpy0uDZ29P=BYB1JDWMCh-6wXaNqMwG1u1mB4=10Ly0x7HhwQ@mail.gmail.com
[2] https://github.com/postgres/postgres/commit/ad5eaf390c58294e2e4c1509aa87bf13261a5d15
--
Thanks,
Nisha