Re: Concurrent deadlock scenario with DROP INDEX on partitioned index

From: Zhihong Yu <zyu(at)yugabyte(dot)com>
To: Jimmy Yih <jyih(at)vmware(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Gaurab Dey <gaurabd(at)vmware(dot)com>
Subject: Re: Concurrent deadlock scenario with DROP INDEX on partitioned index
Date: 2022-03-16 19:48:09
Message-ID: CALNJ-vT_eRWsejyAMfD6h5LPKQTHpPzJLfNxtpczePookpFK2w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Mar 16, 2022 at 11:20 AM Jimmy Yih <jyih(at)vmware(dot)com> wrote:

> Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > 1. RangeVarCallbackForDropRelation can get called more than once
> > during a lookup (in case of concurrent rename and suchlike).
> > If so it needs to be prepared to drop the lock(s) it got last time.
> > You have not implemented any such logic. This doesn't seem hard
> > to fix, just store the OID list into struct DropRelationCallbackState.
>
> Fixed in attached patch. We added the OID list to the
> DropRelationCallbackState as you suggested.
>
> > 2. I'm not sure you have fully thought through the interaction
> > with the subsequent "is_partition" stanza. If the target is
> > an intermediate partitioned index, don't we need to acquire lock
> > on its parent index before starting to lock children? (It may
> > be that that stanza is already in the wrong place relative to
> > the table-locking stanza it currently follows, but not sure.)
>
> It's not required to acquire lock on a possible parent index because
> of the restriction where we can only run DROP INDEX on the top-most
> partitioned index.
>
> > 3. Calling find_all_inheritors with lockmode NoLock, and then
> > locking those relation OIDs later, is both unsafe and code-wasteful.
> > Just tell find_all_inheritors to get the locks you want.
>
> Fixed in attached patch.
>
> > 4. This code will invoke find_all_inheritors on InvalidOid if
> > IndexGetRelation fails. It needs to be within the if-test
> > just above.
>
> Fixed in attached patch.
>
> > 5. Reading classform again at this point in the function is
> > not merely inefficient, but outright wrong, because we
> > already released the syscache entry. Use the local variable.
>
> Fixed in attached patch. Added another local variable
> is_partitioned_index to store the classform value. The main reason we
> need the classform is because the existing relkind and
> expected_relkind local variables would only show RELKIND_INDEX whereas
> we needed exactly RELKIND_PARTITIONED_INDEX.
>
> > 6. You've not paid enough attention to updating existing comments,
> > particularly the header comment for RangeVarCallbackForDropRelation.
>
> Fixed in attached patch. Updated the header comment and aggregated our
> introduced comment to another relative comment slightly above the
> proposed locking section.
>
> > Actually though, maybe you *don't* want to do this in
> > RangeVarCallbackForDropRelation. Because of point 2, it might be
> > better to run find_all_inheritors after we've successfully
> > identified and locked the direct target relation, ie do it back
> > in RemoveRelations. I've not thought hard about that, but it's
> > attractive if only because it'd mean you don't have to fix point 1.
>
> We think that RangeVarCallbackForDropRelation is probably the most
> correct function to place the fix. It would look a bit out-of-place
> being in RemoveRelations seeing how there's already relative DROP
> INDEX code in RangeVarCallbackForDropRelation. With point 2 explained
> and point 1 addressed, the fix seems to look okay in
> RangeVarCallbackForDropRelation.
>
> Thanks for the feedback. Attached new patch with feedback addressed.
>
> --
> Jimmy Yih (VMware)
> Gaurab Dey (VMware)

Hi,
For RangeVarCallbackForDropRelation():

+ LockRelationOid(indexRelationOid, heap_lockmode);

Since the above is called for both if and else blocks, it can be lifted
outside.

Cheers

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2022-03-16 19:58:57 Re: Granting SET and ALTER SYSTE privileges for GUCs
Previous Message Jeff Davis 2022-03-16 19:42:53 Re: Proposal: Support custom authentication methods using hooks