From: | "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com> |
---|---|
To: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
Cc: | shveta malik <shveta(dot)malik(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, vignesh C <vignesh21(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com> |
Subject: | RE: Conflict detection for update_deleted in logical replication |
Date: | 2025-09-16 03:53:38 |
Message-ID: | TY4PR01MB16907A3A6CD76B087C17B82949414A@TY4PR01MB16907.jpnprd01.prod.outlook.com |
Lists: | pgsql-hackers |
On Monday, September 15, 2025 8:11 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Mon, Sep 15, 2025 at 1:07 PM Zhijie Hou (Fujitsu)
> <houzj(dot)fnst(at)fujitsu(dot)com> wrote:
> >
> > Thanks, the changes look good to me. I have merged them in V75 patch.
> >
>
> Pushed. I see a BF which is not related with this commit but a previous commit
> for the work in this thread.
>
> See LOGs [1]:
> regress_log_035_conflicts
> -----------------------------------
> [11:16:47.604](0.015s) not ok 24 - the deleted column is removed
> [11:16:47.605](0.002s) # Failed test 'the deleted column is removed'
> # at /home/bf/bf-build/kestrel/HEAD/pgsql/src/test/subscription/t/035_conflicts.pl line 562.
>
> Then the corresponding subscriber LOG:
>
> 2025-09-15 11:16:47.600 CEST [1888170][client backend][1/13:0] INFO: vacuuming "postgres.public.tab"
> 2025-09-15 11:16:47.600 CEST [1888170][client backend][1/13:0] INFO: finished vacuuming "postgres.public.tab": index scans: 0
> pages: 0 removed, 1 remain, 1 scanned (100.00% of total), 0 eagerly scanned
> tuples: 0 removed, 1 remain, 0 are dead but not yet removable
> tuples missed: 1 dead from 1 pages not removed due to cleanup lock contention
> removable cutoff: 787, which was 0 XIDs old when operation ended
> ...
>
> This indicates that the Vacuum is not able to remove the row even after the slot
> is advanced because some other background process holds a lock/pin on the
> page. I think that is possible because the page was dirty due to apply of update
> operation and bgwriter/checkpointer could try to write such a page.
>
> I'll analyze more tomorrow and share if I have any new findings.
I agree with the analysis. To reproduce it, I deleted a tuple from a table and,
while running VACUUM (VERBOSE) on that table, ran a checkpoint concurrently.
Using a debugger, I stopped in SyncOneBuffer() after the page lock was acquired.
This reproduced the same log output, in which the deleted row could not be
removed:
--
tuples: 0 removed, 1 remain, 0 are dead but not yet removable
tuples missed: 1 dead from 1 pages not removed due to cleanup lock contention
--
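For anyone who wants to detect this condition mechanically while trying the
reproduction, here is a minimal Python sketch that scans VACUUM (VERBOSE)
output for the "tuples missed" line shown above. The function name
missed_dead_tuples and the sample string are made up for illustration; this is
not part of the test suite or the attached patch.

```python
import re

# Matches the line VACUUM (VERBOSE) printed in the log excerpt above when
# dead tuples were skipped because a cleanup lock could not be obtained.
MISSED_RE = re.compile(
    r"tuples missed: (\d+) dead from (\d+) pages not removed "
    r"due to cleanup lock contention"
)


def missed_dead_tuples(vacuum_output: str):
    """Return (dead_tuples, pages) if missed tuples are reported, else None.

    Hypothetical helper, for illustration only.
    """
    # Collapse all whitespace so the match survives re-wrapped log lines.
    flat = " ".join(vacuum_output.split())
    match = MISSED_RE.search(flat)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


# Sample text copied from the log excerpt above.
sample = """\
tuples: 0 removed, 1 remain, 0 are dead but not yet removable
tuples missed: 1 dead from 1 pages not removed due to cleanup lock contention
"""

print(missed_dead_tuples(sample))  # prints (1, 1)
```

A result other than None means VACUUM left dead tuples behind because of
cleanup lock contention, not because the slot's xmin was holding them back.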
I think we can remove the VACUUM-based check that the deleted row is removed
(the 'the deleted column is removed' test). We have already confirmed that the
replication slot's xmin has advanced, which should be sufficient to prove that
the feature works correctly.
Best Regards,
Hou zj
Attachment | Content-Type | Size |
---|---|---|
v1-0001-Stablize-the-tests-in-035_conflicts.patch | application/octet-stream | 2.1 KB |