Re: Concurrent deadlock scenario with DROP INDEX on partitioned index

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Jimmy Yih <jyih(at)vmware(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Gaurab Dey <gaurabd(at)vmware(dot)com>
Subject: Re: Concurrent deadlock scenario with DROP INDEX on partitioned index
Date: 2022-03-20 18:39:39
Message-ID: 1534667.1647801579@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Jimmy Yih <jyih(at)vmware(dot)com> writes:
> Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Actually though, maybe you *don't* want to do this in
>> RangeVarCallbackForDropRelation. Because of point 2, it might be
>> better to run find_all_inheritors after we've successfully
>> identified and locked the direct target relation, ie do it back
>> in RemoveRelations. I've not thought hard about that, but it's
>> attractive if only because it'd mean you don't have to fix point 1.

> We think that RangeVarCallbackForDropRelation is probably the most
> correct function to place the fix. It would look a bit out-of-place
> being in RemoveRelations seeing how there's already relative DROP
> INDEX code in RangeVarCallbackForDropRelation.

I really think you made the wrong choice here. Doing the locking in
RemoveRelations leads to an extremely simple patch, as I demonstrate
below. Moreover, since RemoveRelations also has special-case code
for partitioned indexes, it's hard to argue that it mustn't cover
this case too.

Also, I think the proposed test case isn't very good, because when
I run it without applying the code patch, it fails to demonstrate
any deadlock. The query output is different, but not obviously
wrong.

> Fixed in attached patch. Added another local variable
> is_partitioned_index to store the classform value. The main reason we
> need the classform is because the existing relkind and
> expected_relkind local variables would only show RELKIND_INDEX whereas
> we needed exactly RELKIND_PARTITIONED_INDEX.

Yeah. As I looked at that I realized that it was totally confusing:
at least one previous author thought that "relkind" stored the rel's
actual relkind, which it doesn't as the code stands. In particular,
in this bit:

if ((relkind == RELKIND_INDEX || relkind == RELKIND_PARTITIONED_INDEX) &&
relOid != oldRelOid)
{
state->heapOid = IndexGetRelation(relOid, true);

the test for relkind == RELKIND_PARTITIONED_INDEX is dead code
because relkind will never be that. It accidentally works anyway
because the other half of the || does the right thing, but somebody
was confused, and so will readers be.

Hence, I propose the attached. 0001 is pure refactoring: it hopefully
clears up the confusion about which "relkind" is which, and it also
saves a couple of redundant syscache fetches in RemoveRelations.
Then 0002 adds the actual bug fix as well as a test case that does
deadlock with unpatched code.

regards, tom lane

Attachment Content-Type Size
v4-0001-preliminary-refactoring.patch text/x-diff 5.2 KB
v4-0002-fix-concurrent-deadlock.patch text/x-diff 9.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2022-03-20 18:40:06 Re: support for MERGE
Previous Message Alvaro Herrera 2022-03-20 17:58:07 Re: a misbehavior of partition row movement (?)