From: | "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com> |
---|---|
To: | Michael Paquier <michael(at)paquier(dot)xyz>, Amit Kapila <akapila(at)postgresql(dot)org> |
Cc: | "pgsql-committers(at)lists(dot)postgresql(dot)org" <pgsql-committers(at)lists(dot)postgresql(dot)org> |
Subject: | RE: pgsql: Preserve conflict-relevant data during logical replication. |
Date: | 2025-07-24 02:57:10 |
Message-ID: | OS0PR01MB571653B3069C781746EFE7FE945EA@OS0PR01MB5716.jpnprd01.prod.outlook.com |
Lists: | pgsql-committers |
On Thursday, July 24, 2025 9:25 AM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>
> On Wed, Jul 23, 2025 at 03:35:06AM +0000, Amit Kapila wrote:
> > Preserve conflict-relevant data during logical replication.
> >
> > Logical replication requires reliable conflict detection to maintain
> > data consistency across nodes. To achieve this, we must prevent
> > premature removal of tuples deleted by other origins and their
> > associated commit_ts data by VACUUM, which could otherwise lead to
> > incorrect conflict reporting and resolution.
>
> Some of the tests added by this commit are causing blips in the CI:
> https://cfbot.cputube.org/highlights/all.html
>
> Example of one job (triggered it once myself with a separate patch):
> https://cirrus-ci.com/task/5397273342902272
> [21:15:03.131](0.028s) not ok 7 - altering retain_dead_tuples is allowed for disabled subscription
> [21:18:48.295](225.164s) # poll_query_until timed out executing this query:
> [21:18:48.296](0.000s) not ok 8 - the xmin value of slot 'pg_conflict_detection' is valid on Node A
> [21:18:48.306](0.010s) not ok 9 - warn of the possibility of receiving changes from origins other than the publisher
> [21:18:48.412](0.037s) not ok 11 - the deleted column is non-removable
> [21:22:29.286](220.874s) # poll_query_until timed out executing this query:
> [21:22:29.287](0.000s) not ok 12 - the xmin value of slot 'pg_conflict_detection' is updated on Node A
> [21:22:29.297](0.010s) not ok 13 - the deleted column is removed
>
> The failures happen on FreeBSD as far as I know, that enforces some rules not
> used elsewhere:
> -c debug_copy_parse_plan_trees=on
> -c debug_write_read_parse_plan_trees=on
> -c debug_raw_expression_coverage_test=on
> -c debug_parallel_query=regress
> [...]
> CPPFLAGS: -DRELCACHE_FORCE_RELEASE
> -DENFORCE_REGRESSION_TEST_NAME_RESTRICTIONS
>
> I suspect that reusing these options would help in reproducing the problem.
> These are not commonly used in buildfarm animals, reducing the friction to
> make the instability show up.
Thanks for reporting the issue!
I confirmed that the test that enables the retain_dead_tuples option for a
disabled subscription failed because the apply worker for that subscription was
still running, which in turn caused all subsequent tests to fail. To resolve
this issue, the test needs to ensure that the apply worker has stopped after
disabling the subscription and before altering the option.
> 2025-07-23 21:15:03.128 UTC client backend[39133] 035_conflicts.pl LOG: statement: ALTER SUBSCRIPTION tap_sub_a_b SET (retain_dead_tuples = true);
> 2025-07-23 21:15:03.128 UTC client backend[39133] 035_conflicts.pl ERROR: cannot alter retain_dead_tuples when logical replication worker is still running
Attached is a patch to address this problem. Apart from the reported failure,
there's another place where we did not wait for the worker to stop after
disabling the subscription. Although this hasn't resulted in a test failure so
far, I added wait logic for it in the patch as well for safety.
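
The wait logic described above can be sketched roughly as follows. This is not
the attached patch (the real test is a Perl TAP script). It is a minimal Python
illustration of the "disable, poll until the worker is gone, then alter"
ordering. The helper names (`poll_until_true`, `alter_after_worker_stops`), the
`run_query` callable, the timeout values, and the choice of polling
`pg_stat_subscription` for a non-NULL `pid` are all assumptions made for this
example.

```python
import time
from typing import Callable


def worker_stopped_sql(subname: str) -> str:
    # Assumed check: no row in pg_stat_subscription for this subscription
    # still has a worker pid attached.
    return (
        "SELECT count(*) = 0 FROM pg_stat_subscription "
        f"WHERE subname = '{subname}' AND pid IS NOT NULL"
    )


def poll_until_true(
    run_query: Callable[[str], str],
    sql: str,
    timeout_s: float = 180.0,
    interval_s: float = 0.1,
) -> bool:
    """Re-run sql until it returns 't'; return False if the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        if run_query(sql) == "t":
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


def alter_after_worker_stops(
    run_query: Callable[[str], str],
    subname: str = "tap_sub_a_b",
    timeout_s: float = 180.0,
    interval_s: float = 0.1,
) -> None:
    """Disable the subscription, wait for its worker to exit, then alter it."""
    run_query(f"ALTER SUBSCRIPTION {subname} DISABLE")
    # Without this wait, the ALTER below can race with a worker that has not
    # exited yet and fail with "logical replication worker is still running".
    if not poll_until_true(
        run_query, worker_stopped_sql(subname), timeout_s, interval_s
    ):
        raise TimeoutError(f"apply worker for {subname} did not stop")
    run_query(f"ALTER SUBSCRIPTION {subname} SET (retain_dead_tuples = true)")
```

Here `run_query` stands in for whatever executes SQL on the subscriber and
returns the result as text, such as a thin wrapper around psql.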
Best Regards,
Hou zj
Attachment | Content-Type | Size |
---|---|---|
0001-Fix-Cfbot-failure.patch | application/octet-stream | 1.7 KB |