From: | "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com> |
---|---|
To: | "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
Cc: | shveta malik <shveta(dot)malik(at)gmail(dot)com>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, vignesh C <vignesh21(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com> |
Subject: | RE: Conflict detection for update_deleted in logical replication |
Date: | 2025-09-11 08:59:05 |
Message-ID: | TY4PR01MB1690751D1CA8C128B0770EC6F9409A@TY4PR01MB16907.jpnprd01.prod.outlook.com |
Lists: | pgsql-hackers |
On Monday, September 8, 2025 7:21 PM Zhijie Hou (Fujitsu) <houzj(dot)fnst(at)fujitsu(dot)com> wrote:
>
> On Monday, September 8, 2025 3:13 PM Amit Kapila
> <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > On Fri, Sep 5, 2025 at 5:03 PM Zhijie Hou (Fujitsu)
> > <houzj(dot)fnst(at)fujitsu(dot)com>
> > wrote:
> > >
> > > Here are v2 patches which addressed above comments.
> > >
> >
> > I have pushed the first patch. I find that the test can't reliably fail without a fix.
> > Can you please investigate it?
>
> Thank you for catching this issue. I confirmed that the test may have run
> VACUUM before slot.xmin was advanced. Therefore, to improve the test, I
> modified it to wait for the publisher status request message to appear twice,
> since, after the fix, the apply worker should keep waiting for the publisher
> status until the prepared txn is committed.
>
> Also, to reduce test time, I moved the test into the existing 035 test.
>
> Here is the updated test.
I noticed a BF failure[1] on this test. The log shows that the apply worker
advances the non-removable xid to the latest state before waiting for the
prepared transaction to commit. Reviewing the log, I didn't find any sign of a
bug in the code. One potential explanation is that the prepared transaction
hadn't reached the injection point before the apply worker requested the
publisher status.
The log does not record when the injection point is triggered; it only
includes:
pub: 2025-09-11 03:40:05.667 CEST [396867][client backend][8/3:0] LOG: statement: COMMIT PREPARED 'txn_with_later_commit_ts';
..
sub: 2025-09-11 03:40:05.684 CEST [396798][logical replication apply worker][16/0:0] DEBUG: sending publisher status request message
Although the statement on the publisher appears before the publisher status
request, the statement log is generated prior to command execution. Thus, it's
possible the injection point was triggered only after the publisher had
responded to the status request.
After checking some other TAP tests that use injection points, most of them
ensure the injection point is triggered before proceeding with the test (by
waiting for the injection point's wait event). We could add the same to this test:
$node_B->wait_for_event('client backend', 'commit-after-delay-checkpoint');
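To illustrate the suspected race, here is a minimal toy model in Python. It is not PostgreSQL code: the class `PublisherBackend`, the function `run_test`, and the `startup_delay` parameter are all hypothetical names invented for this sketch. It assumes, as a simplification, that the publisher only reports the transaction as "in commit" once the backend has actually reached the injection point, and that the apply worker's status request simply samples that state.

```python
import threading
import time


class PublisherBackend:
    """Toy stand-in for the backend that runs COMMIT PREPARED (hypothetical)."""

    def __init__(self):
        self.at_injection_point = threading.Event()
        self.wakeup = threading.Event()
        self.committed = False

    def commit_prepared(self, startup_delay):
        # The statement is logged first; execution only starts afterwards.
        time.sleep(startup_delay)
        # From here on the transaction is visibly "in commit".
        self.at_injection_point.set()
        # Block at the injection point until the test wakes us up.
        self.wakeup.wait()
        self.committed = True


def run_test(wait_for_event, startup_delay=0.05):
    """Return True if the status request observed the in-progress commit."""
    backend = PublisherBackend()
    worker = threading.Thread(
        target=backend.commit_prepared, args=(startup_delay,)
    )
    worker.start()

    if wait_for_event:
        # Analogue of waiting for the injection point's wait event.
        backend.at_injection_point.wait()

    # Analogue of the apply worker's publisher status request.
    saw_commit_in_progress = backend.at_injection_point.is_set()

    # Let the commit finish and clean up.
    backend.wakeup.set()
    worker.join()
    return saw_commit_in_progress
```

With `wait_for_event=True` the status request always observes the in-progress commit. With `wait_for_event=False` the result depends on scheduling, which matches the kind of intermittent failure seen on the buildfarm.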
Here is a small patch.
Best Regards,
Hou zj
Attachment | Content-Type | Size |
---|---|---|
v1-0001-Fix-unstable-test-in-6456c6e.patch | application/octet-stream | 1.4 KB |