024_add_drop_pub.pl might fail due to deadlock

From: Alexander Lakhin <exclusion(at)gmail(dot)com>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: 024_add_drop_pub.pl might fail due to deadlock
Date: 2025-07-05 16:00:00
Message-ID: bab95e12-6cc5-4ebb-80a8-3e41956aa297@gmail.com
Lists: pgsql-hackers

Hello hackers,

The recent buildfarm failure [1] on REL_15_STABLE with the following
diagnostics:
# Looks like your test exited with 29 just after 1.
t/024_add_drop_pub.pl ..............
Dubious, test returned 29 (wstat 7424, 0x1d00)

pgsql.build/src/test/subscription/tmp_check/log/regress_log_024_add_drop_pub
[21:01:34.406](16.501s) ok 1 - check initial data is copied to subscriber
error running SQL: 'psql:<stdin>:1: ERROR:  deadlock detected
DETAIL:  Process 219632 waits for ExclusiveLock on relation 6000 of database 0; blocked by process 218369.
Process 218369 waits for AccessShareLock on object 16387 of class 6100 of database 0; blocked by process 219632.
HINT:  See server log for query details.'
while running 'psql -XAtq -d port=14957 host=/home/bf/bf-build/petalura/tmp/bGI6HuRtfa dbname='postgres' -f - -v
ON_ERROR_STOP=1' with sql 'ALTER SUBSCRIPTION tap_sub DROP PUBLICATION tap_pub_1' at
/home/bf/bf-build/petalura/REL_15_STABLE/pgsql.build/../pgsql/src/test/perl/PostgreSQL/Test/Cluster.pm line 1951.

pgsql.build/src/test/subscription/tmp_check/log/024_add_drop_pub_subscriber.log
2025-07-01 21:01:32.682 CEST [218369][logical replication worker][3/6:0] LOG:  logical replication apply worker for
subscription "tap_sub" has started
...
2025-07-01 21:01:34.771 CEST [219632][client backend][4/14:0] LOG: statement: ALTER SUBSCRIPTION tap_sub DROP
PUBLICATION tap_pub_1
2025-07-01 21:01:37.355 CEST [219632][client backend][4/14:731] ERROR:  deadlock detected
2025-07-01 21:01:37.355 CEST [219632][client backend][4/14:731] DETAIL:  Process 219632 waits for ExclusiveLock on
relation 6000 of database 0; blocked by process 218369.
    Process 218369 waits for AccessShareLock on object 16387 of class 6100 of database 0; blocked by process 219632.
    Process 219632: ALTER SUBSCRIPTION tap_sub DROP PUBLICATION tap_pub_1
    Process 218369: <command string not enabled>
2025-07-01 21:01:37.355 CEST [219632][client backend][4/14:731] HINT:  See server log for query details.
2025-07-01 21:01:37.355 CEST [219632][client backend][4/14:731] STATEMENT:  ALTER SUBSCRIPTION tap_sub DROP PUBLICATION
tap_pub_1

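shows that the test can fail due to a deadlock involving
pg_replication_origin (relation 6000): the backend running ALTER
SUBSCRIPTION waits for ExclusiveLock on that relation, while the apply
worker that blocks it waits for a lock the backend already holds.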
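
To make the cycle explicit, here is a minimal sketch that rebuilds the wait-for graph from the two DETAIL lines quoted above and follows the "blocked by" links until a PID repeats. It is only an illustration of how to read the report, not how the server's deadlock detector works; the helper names (parse_wait_edges, find_cycle) are made up for this example.

```python
import re

# The two DETAIL lines from the deadlock report, copied from the log above.
DETAIL = """\
Process 219632 waits for ExclusiveLock on relation 6000 of database 0; blocked by process 218369.
Process 218369 waits for AccessShareLock on object 16387 of class 6100 of database 0; blocked by process 219632.
"""

WAIT_RE = re.compile(
    r"Process (\d+) waits for (\w+) on (.+?); blocked by process (\d+)\."
)


def parse_wait_edges(text):
    """Return {waiter_pid: (blocker_pid, lock_mode, target)} per DETAIL line."""
    edges = {}
    for m in WAIT_RE.finditer(text):
        waiter, mode, target, blocker = m.groups()
        edges[int(waiter)] = (int(blocker), mode, target)
    return edges


def find_cycle(edges):
    """Follow waiter -> blocker links; return the PIDs forming a cycle, or None."""
    for start in edges:
        path = [start]
        pid = start
        while pid in edges:
            pid = edges[pid][0]
            if pid in path:
                return path[path.index(pid):]
            path.append(pid)
    return None


edges = parse_wait_edges(DETAIL)
cycle = find_cycle(edges)

if __name__ == "__main__":
    for waiter, (blocker, mode, target) in edges.items():
        print(f"{waiter} waits for {mode} on {target}, blocked by {blocker}")
    print("cycle:", cycle)
```

Each process appears once as a waiter and once as a blocker, so neither can proceed until the deadlock detector cancels one of them (here the ALTER SUBSCRIPTION backend, 219632).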

This failure can be easily reproduced with:
--- a/src/backend/replication/logical/origin.c
+++ b/src/backend/replication/logical/origin.c
@@ -428,6 +428,7 @@ replorigin_drop_by_name(const char *name, bool missing_ok, bool nowait)
         * the specific origin and then re-check if the origin still exists.
         */
        rel = table_open(ReplicationOriginRelationId, ExclusiveLock);
+pg_usleep(300000);

The failure is not reproduced on REL_16_STABLE (since f6c5edb8a), nor on
v14 and earlier (because 024_add_drop_pub.pl was added in v15).

[1] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=petalura&dt=2025-07-01%2018%3A00%3A58

Best regards,
Alexander
