| From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> | 
|---|---|
| To: | Jimmy Yih <jyih(at)vmware(dot)com> | 
| Cc: | "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Gaurab Dey <gaurabd(at)vmware(dot)com> | 
| Subject: | Re: Concurrent deadlock scenario with DROP INDEX on partitioned index | 
| Date: | 2022-03-20 18:39:39 | 
| Message-ID: | 1534667.1647801579@sss.pgh.pa.us | 
| Views: | Whole Thread | Raw Message | Download mbox | Resend email | 
| Thread: | |
| Lists: | pgsql-hackers | 
Jimmy Yih <jyih(at)vmware(dot)com> writes:
> Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Actually though, maybe you *don't* want to do this in
>> RangeVarCallbackForDropRelation.  Because of point 2, it might be
>> better to run find_all_inheritors after we've successfully
>> identified and locked the direct target relation, ie do it back
>> in RemoveRelations.  I've not thought hard about that, but it's
>> attractive if only because it'd mean you don't have to fix point 1.
> We think that RangeVarCallbackForDropRelation is probably the most
> correct function to place the fix. It would look a bit out-of-place
> being in RemoveRelations seeing how there's already relative DROP
> INDEX code in RangeVarCallbackForDropRelation.
I really think you made the wrong choice here.  Doing the locking in
RemoveRelations leads to an extremely simple patch, as I demonstrate
below.  Moreover, since RemoveRelations also has special-case code
for partitioned indexes, it's hard to argue that it mustn't cover
this case too.
Also, I think the proposed test case isn't very good, because when
I run it without applying the code patch, it fails to demonstrate
any deadlock.  The query output is different, but not obviously
wrong.
> Fixed in attached patch. Added another local variable
> is_partitioned_index to store the classform value. The main reason we
> need the classform is because the existing relkind and
> expected_relkind local variables would only show RELKIND_INDEX whereas
> we needed exactly RELKIND_PARTITIONED_INDEX.
Yeah.  As I looked at that I realized that it was totally confusing:
at least one previous author thought that "relkind" stored the rel's
actual relkind, which it doesn't as the code stands.  In particular,
in this bit:
    if ((relkind == RELKIND_INDEX || relkind == RELKIND_PARTITIONED_INDEX) &&
        relOid != oldRelOid)
    {
        state->heapOid = IndexGetRelation(relOid, true);
the test for relkind == RELKIND_PARTITIONED_INDEX is dead code
because relkind will never be that.  It accidentally works anyway
because the other half of the || does the right thing, but somebody
was confused, and so will readers be.
Hence, I propose the attached.  0001 is pure refactoring: it hopefully
clears up the confusion about which "relkind" is which, and it also
saves a couple of redundant syscache fetches in RemoveRelations.
Then 0002 adds the actual bug fix as well as a test case that does
deadlock with unpatched code.
regards, tom lane
| Attachment | Content-Type | Size | 
|---|---|---|
| v4-0001-preliminary-refactoring.patch | text/x-diff | 5.2 KB | 
| v4-0002-fix-concurrent-deadlock.patch | text/x-diff | 9.7 KB | 
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Alvaro Herrera | 2022-03-20 18:40:06 | Re: support for MERGE | 
| Previous Message | Alvaro Herrera | 2022-03-20 17:58:07 | Re: a misbehavior of partition row movement (?) |