Re: [BUG?] strange behavior in ALTER TABLE ... RENAME TO on inherited columns

From: KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: KaiGai Kohei <kaigai(at)kaigai(dot)gr(dot)jp>, Bernd Helmle <mailings(at)oopsware(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Thom Brown <thombrown(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Subject: Re: [BUG?] strange behavior in ALTER TABLE ... RENAME TO on inherited columns
Date: 2010-02-01 03:04:53
Message-ID: 4B6644D5.9080009@ak.jp.nec.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

(2010/02/01 8:41), KaiGai Kohei wrote:
> (2010/01/30 3:36), Robert Haas wrote:
>> 2010/1/28 KaiGai Kohei<kaigai(at)ak(dot)jp(dot)nec(dot)com>:
>>> (2010/01/29 9:58), KaiGai Kohei wrote:
>>>> (2010/01/29 9:29), Robert Haas wrote:
>>>>> 2010/1/28 KaiGai Kohei<kaigai(at)ak(dot)jp(dot)nec(dot)com>:
>>>>>> (2010/01/29 0:46), Robert Haas wrote:
>>>>>>> 2010/1/27 KaiGai Kohei<kaigai(at)ak(dot)jp(dot)nec(dot)com>:
>>>>>>>> Hmm, indeed, this logic (V3/V5) is busted.
>>>>>>>>
>>>>>>>> The idea of V4 patch can also handle this case correctly, although it
>>>>>>>> is lesser in performance.
>>>>>>>> I wonder whether it is really unacceptable cost in performance, or not.
>>>>>>>> Basically, I assume ALTER TABLE RENAME/TYPE is not frequent operations,
>>>>>>>> and I don't think this bugfix will damage to the reputation of PostgreSQL.
>>>>>>>>
>>>>>>>> Where should we go on the next?
>>>>>>>
>>>>>>> Isn't the problem here just that the following comment is 100% wrong?
>>>>>>>
>>>>>>> /*
>>>>>>> * Unlike find_all_inheritors(), we need to walk on
>>>>>>> child relations
>>>>>>> * that have diamond inheritance tree, because this
>>>>>>> function has to
>>>>>>> * return correct expected inhecount to the caller.
>>>>>>> */
>>>>>>>
>>>>>>> It seems to me that the right solution here is to just add one more
>>>>>>> argument to find_all_inheritors(), something like List
>>>>>>> **expected_inh_count.
>>>>>>>
>>>>>>> Am I missing something?
>>>>>>
>>>>>> The find_all_inheritors() does not walk on child relations more than
>>>>>> two times, even if a child has multiple parents inherited from common
>>>>>> origin, because list_concat_unique_oid() ignores the given OID if it
>>>>>> is already on the list. It means all the child relations under the
>>>>>> relation already walked on does not checked anywhere. (Of course,
>>>>>> this assumption is correct for the purpose of find_all_inheritors()
>>>>>> with minimum cost.)
>>>>>>
>>>>>> What we want to do here is to compute the number of times a certain
>>>>>> child relation is inherited from a common origin; it shall be the
>>>>>> expected-inhcount. So, we need an arrangement to the logic.
>>>>>>
>>>>>> For example, see the following diagram.
>>>>>>
>>>>>> T2
>>>>>> / \
>>>>>> T1 T4---T5
>>>>>> \ /
>>>>>> T3
>>>>>>
>>>>>> If we call find_all_inheritors() with T1. The find_inheritance_children()
>>>>>> returns T2 and T3 for T1.
>>>>>> Then, it calls find_inheritance_children() for T2, and it returns T4.
>>>>>> Then, it calls find_inheritance_children() for T3, and it returns T4, but
>>>>>> it is already in the "rels_list", so list_concat_unique_oid() ignores it.
>>>>>> Then, it calls find_inheritance_children() for T4, and it returns T5.
>>>>>>
>>>>>> In this example, we want the expected inhcount for T2 and T3 should be 1,
>>>>>> for T4 and T5 should be 2. However, it walks on T4 and T5 only once, so
>>>>>> they will have 1 incorrectly.
>>>>>> Even if we count up the ignored OID (T4), find_all_inheritors() does not
>>>>>> walk on T5, because it is already walked on obviously when T4 is ignored.
>>>>>
>>>>> I think the count for T5 should be 1, and I think that the count for
>>>>> T4 can easily be made to be 2 by coding the algorithm correctly.
>>>>
>>>> Ahh, it is right. I was confused.
>>>>
>>>> Is it possible to introduce the logic mathematical-strictly?
>>>> Now I'm considering whether the find_all_inheritors() logic is suitable
>>>> for any cases, or not.
>>>
>>> I modified the logic to compute an expected inhcount of the child relations.
>>>
>>> What we want to compute here is to compute the number of times a certain
>>> relation being inherited directly from any relations delivered from a unique
>>> origin. If the column to be renamed is eventually inherited from a common
>>> origin, its attinhcount is not larger than the expected inhcount.
>>>
>>>> WITH RECURSIVE r AS (
>>>> SELECT 't1'::regclass AS inhrelid
>>>> UNION ALL
>>>> SELECT c.inhrelid FROM pg_inherits c, r WHERE r.inhrelid = c.inhparent
>>>> ) -- r is all the child relations inherited from 't1'
>>>> SELECT inhrelid::regclass, count(*) FROM pg_inherits
>>>> WHERE inhparent IN (SELECT inhrelid FROM r) GROUP BY inhrelid;
>>>
>>> The modified logic increments the expected inhcount of the relation already
>>> walked on. If we found a child relation twice or more, it means the child
>>> relation is at the junction of the inheritance tree.
>>> In this case, we don't walk on the child relations any more, but it is not
>>> necessary, because the first path already checked it.
>>>
>>> The find_all_inheritors() is called from several points more than renameatt(),
>>> so I added find_all_inheritors_with_inhcount() which takes argument of the
>>> expected inhcount list. And, find_all_inheritors() was also modified to call
>>> find_all_inheritors_with_inhcount() with NULL argument to avoid code duplication.
>>>
>>> It is the result of Berrnd's test case.
>>>
>>> - CVS HEAD
>>> 0.895s
>>> 0.903s
>>> 0.884s
>>> 0.896s
>>> 0.892s
>>>
>>> - with V6 patch
>>> 0.972s
>>> 0.941s
>>> 0.961s
>>> 0.949s
>>> 0.946s
>>
>> Well, that seems a lot better. Unfortunately, I can't commit this
>> code: it's mind-bogglingly ugly. I don't believe that duplicating
>> find_all_inheritors is the right solution (a point I've mentioned
>> previously), and it's certainly not right to use typeName->location as
>> a place to store an unrelated attribute inheritance count. There is
>> also at least one superfluous variable renaming; and the recursion
>> handling looks pretty grotty to me, too.
>>
>> I wonder if we should just leave this alone for 9.0 and revisit the
>> issue after doing some of the previously-proposed refactoring for 9.1.
>> The amount of time we're spending trying to fix this is somewhat out
>> of proportion to the importance of the problem.
>
> At least, I think we can fix the bug on renameatt() case in clean way,
> although we need an additional recursion handling in ATPrepAlterColumnType()
> to fix ALTER COLUMN TYPE cases.
>
> Should it focus on only the original renameatt() matter in this stage?

The attached patch modified find_all_inheritors() to return the list of
expected inhcount, if List * pointer is given. And, it focuses on only
the bugs in renameatt() case.

Also, the ALTER COLUMN TYPE case should be also fixed in the 9.1 release
(or 9.0.1?).

Thanks,
--
OSS Platform Development Division, NEC
KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>

Attachment Content-Type Size
pgsql-fix-inherit-rename.7.patch application/octect-stream 18.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2010-02-01 03:14:54 Re: Pathological regexp match
Previous Message Fujii Masao 2010-02-01 02:49:04 Re: HS/SR and smart shutdown