Quick Links

Re: Concurrent deadlock scenario with DROP INDEX on partitioned index

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Jimmy Yih <jyih(at)vmware(dot)com>
Cc:	"pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Gaurab Dey <gaurabd(at)vmware(dot)com>
Subject:	Re: Concurrent deadlock scenario with DROP INDEX on partitioned index
Date:	2022-03-10 23:54:23
Message-ID:	719207.1646956463@sss.pgh.pa.us
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

Jimmy Yih <jyih(at)vmware(dot)com> writes:
> When dropping a partitioned index, the locking order starts with the
> partitioned table, then its partitioned index, then the partition
> indexes dependent on that partitioned index, and finally the dependent
> partition indexes' parent tables. This order allows a deadlock
> scenario to happen if for example an ANALYZE happens on one of the
> partition tables which locks the partition table and then blocks when
> it attempts to lock its index (the DROP INDEX has the index lock but
> cannot get the partition table lock).

I agree this is a bug, and I think that you may have the right
general idea about how to fix it: acquire the necessary locks in
RangeVarCallbackForDropRelation. However, this patch still needs
a fair amount of work.

1. RangeVarCallbackForDropRelation can get called more than once
during a lookup (in case of concurrent rename and suchlike).
If so it needs to be prepared to drop the lock(s) it got last time.
You have not implemented any such logic. This doesn't seem hard
to fix, just store the OID list into struct DropRelationCallbackState.

2. I'm not sure you have fully thought through the interaction
with the subsequent "is_partition" stanza. If the target is
an intermediate partitioned index, don't we need to acquire lock
on its parent index before starting to lock children? (It may
be that that stanza is already in the wrong place relative to
the table-locking stanza it currently follows, but not sure.)

3. Calling find_all_inheritors with lockmode NoLock, and then
locking those relation OIDs later, is both unsafe and code-wasteful.
Just tell find_all_inheritors to get the locks you want.

4. This code will invoke find_all_inheritors on InvalidOid if
IndexGetRelation fails. It needs to be within the if-test
just above.

5. Reading classform again at this point in the function is
not merely inefficient, but outright wrong, because we
already released the syscache entry. Use the local variable.

6. You've not paid enough attention to updating existing comments,
particularly the header comment for RangeVarCallbackForDropRelation.

Actually though, maybe you *don't* want to do this in
RangeVarCallbackForDropRelation. Because of point 2, it might be
better to run find_all_inheritors after we've successfully
identified and locked the direct target relation, ie do it back
in RemoveRelations. I've not thought hard about that, but it's
attractive if only because it'd mean you don't have to fix point 1.

regards, tom lane

In response to

Concurrent deadlock scenario with DROP INDEX on partitioned index at 2022-01-25 22:45:42 from Jimmy Yih

Responses

Re: Concurrent deadlock scenario with DROP INDEX on partitioned index at 2022-03-16 18:20:10 from Jimmy Yih

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Chapman Flack	2022-03-11 00:13:14	Re: Postgres restart in the middle of exclusive backup and the presence of backup_label file
Previous Message	Chapman Flack	2022-03-10 23:36:44	Re: trigger example for plsample