Re: Concurrent deadlock scenario with DROP INDEX on partitioned index

From: Jimmy Yih <jyih(at)vmware(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Gaurab Dey <gaurabd(at)vmware(dot)com>
Subject: Re: Concurrent deadlock scenario with DROP INDEX on partitioned index
Date: 2022-03-16 18:20:10
Message-ID: BYAPR05MB6454BA1DC1F25BD4996F4D67BD119@BYAPR05MB6454.namprd05.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> 1. RangeVarCallbackForDropRelation can get called more than once
> during a lookup (in case of concurrent rename and suchlike).
> If so it needs to be prepared to drop the lock(s) it got last time.
> You have not implemented any such logic. This doesn't seem hard
> to fix, just store the OID list into struct DropRelationCallbackState.

Fixed in attached patch. We added the OID list to the
DropRelationCallbackState as you suggested.

> 2. I'm not sure you have fully thought through the interaction
> with the subsequent "is_partition" stanza. If the target is
> an intermediate partitioned index, don't we need to acquire lock
> on its parent index before starting to lock children? (It may
> be that that stanza is already in the wrong place relative to
> the table-locking stanza it currently follows, but not sure.)

It's not required to acquire lock on a possible parent index because
of the restriction where we can only run DROP INDEX on the top-most
partitioned index.

> 3. Calling find_all_inheritors with lockmode NoLock, and then
> locking those relation OIDs later, is both unsafe and code-wasteful.
> Just tell find_all_inheritors to get the locks you want.

Fixed in attached patch.

> 4. This code will invoke find_all_inheritors on InvalidOid if
> IndexGetRelation fails. It needs to be within the if-test
> just above.

Fixed in attached patch.

> 5. Reading classform again at this point in the function is
> not merely inefficient, but outright wrong, because we
> already released the syscache entry. Use the local variable.

Fixed in attached patch. Added another local variable
is_partitioned_index to store the classform value. The main reason we
need the classform is because the existing relkind and
expected_relkind local variables would only show RELKIND_INDEX whereas
we needed exactly RELKIND_PARTITIONED_INDEX.

> 6. You've not paid enough attention to updating existing comments,
> particularly the header comment for RangeVarCallbackForDropRelation.

Fixed in attached patch. Updated the header comment and aggregated our
introduced comment to another relative comment slightly above the
proposed locking section.

> Actually though, maybe you *don't* want to do this in
> RangeVarCallbackForDropRelation. Because of point 2, it might be
> better to run find_all_inheritors after we've successfully
> identified and locked the direct target relation, ie do it back
> in RemoveRelations. I've not thought hard about that, but it's
> attractive if only because it'd mean you don't have to fix point 1.

We think that RangeVarCallbackForDropRelation is probably the most
correct function to place the fix. It would look a bit out-of-place
being in RemoveRelations seeing how there's already relative DROP
INDEX code in RangeVarCallbackForDropRelation. With point 2 explained
and point 1 addressed, the fix seems to look okay in
RangeVarCallbackForDropRelation.

Thanks for the feedback. Attached new patch with feedback addressed.

--
Jimmy Yih (VMware)
Gaurab Dey (VMware)

Attachment Content-Type Size
v2-0001-Fix-concurrent-deadlock-scenario-with-DROP-INDEX-.patch application/octet-stream 11.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Nathan Bossart 2022-03-16 18:36:56 Re: Optimize external TOAST storage
Previous Message Nathan Bossart 2022-03-16 17:42:28 Re: USE_BARRIER_SMGRRELEASE on Linux?