From: | "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com> |
---|---|
To: | 'Ajin Cherian' <itsajin(at)gmail(dot)com> |
Cc: | vignesh C <vignesh21(at)gmail(dot)com>, Alexander Lakhin <exclusion(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
Subject: | RE: 024_add_drop_pub.pl might fail due to deadlock |
Date: | 2025-07-23 09:12:08 |
Message-ID: | OSCPR01MB14966A3AA2A7823E8E52E0D54F55FA@OSCPR01MB14966.jpnprd01.prod.outlook.com |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Dear Ajin,
Thanks for the patch. Firstly let me confirm my understanding. While altering the
subscription, locks are acquired with below ordering:
Order target level
1 pg_subscription row exclusive
2 pg_subscription, my tuple access exclusive
3 pg_subscription_rel access exclusive
4 pg_replication_origin excluive
In contrast, apply worker works like:
Order target level
1 pg_replication_origin excluive
2 pg_subscription, my tuple access share
3 pg_subscrition_rel row exclusive
Thus a backend may wait at step 4, and apply worker can stuck at step 2 or 3.
Below are my comments:
```
@@ -340,7 +341,6 @@ process_syncing_tables_for_sync(XLogRecPtr current_lsn)
* is dropped. So passing missing_ok = false.
*/
ReplicationSlotDropAtPubNode(LogRepWorkerWalRcvConn, syncslotname, false);
-
```
This change is not needed.
```
+ if (!rel)
+ rel = table_open(SubscriptionRelRelationId, AccessExclusiveLock);
+
```
I feel it is too strong, isn't it enough to use row exclusive as initially used?
```
UpdateSubscriptionRelState(Oid subid, Oid relid, char state,
- XLogRecPtr sublsn)
+ XLogRecPtr sublsn, bool lock_needed)
```
I feel the name `lock_needed` is bit misleading, because the function still opens
the pg_subscription_rel catalog with row exclusive. How about "lock_shared_object"?
```
@@ -1460,7 +1460,13 @@ DropSubscription(DropSubscriptionStmt *stmt, bool isTopLevel)
* Note that the state can't change because we have already stopped both
* the apply and tablesync workers and they can't restart because of
* exclusive lock on the subscription.
+ *
+ * Lock pg_subscription_rel with AccessExclusiveLock to prevent any
+ * deadlock with apply workers of other subscriptions trying
+ * to drop tracking origin.
*/
+ sub_rel = table_open(SubscriptionRelRelationId, AccessExclusiveLock);
```
Hmm. Per my understanding, DropSubscription acquires below locks till it reaches
replorigin_drop_by_name().
Order target level
1 pg_subscription access exclusive
2 pg_subscription, my tuple access exclusive
3 pg_replication_origin excluive
IIUC we must preserve the ordering, not the target of locks.
Best regards,
Hayato Kuroda
FUJITSU LIMITED
From | Date | Subject | |
---|---|---|---|
Next Message | Amit Kapila | 2025-07-23 09:26:57 | Re: 024_add_drop_pub.pl might fail due to deadlock |
Previous Message | Álvaro Herrera | 2025-07-23 09:08:23 | Re: RecoveryTargetAction is left out in xlog_interna.h |