Re: [BUG?] strange behavior in ALTER TABLE ... RENAME TO on inherited columns

From: KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: KaiGai Kohei <kaigai(at)kaigai(dot)gr(dot)jp>, Bernd Helmle <mailings(at)oopsware(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Thom Brown <thombrown(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Subject: Re: [BUG?] strange behavior in ALTER TABLE ... RENAME TO on inherited columns
Date: 2010-01-29 03:02:22
Message-ID: 4B624FBE.1060702@ak.jp.nec.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

(2010/01/29 9:58), KaiGai Kohei wrote:
> (2010/01/29 9:29), Robert Haas wrote:
>> 2010/1/28 KaiGai Kohei<kaigai(at)ak(dot)jp(dot)nec(dot)com>:
>>> (2010/01/29 0:46), Robert Haas wrote:
>>>> 2010/1/27 KaiGai Kohei<kaigai(at)ak(dot)jp(dot)nec(dot)com>:
>>>>> Hmm, indeed, this logic (V3/V5) is busted.
>>>>>
>>>>> The idea of V4 patch can also handle this case correctly, although it
>>>>> is lesser in performance.
>>>>> I wonder whether it is really unacceptable cost in performance, or not.
>>>>> Basically, I assume ALTER TABLE RENAME/TYPE is not frequent operations,
>>>>> and I don't think this bugfix will damage to the reputation of PostgreSQL.
>>>>>
>>>>> Where should we go on the next?
>>>>
>>>> Isn't the problem here just that the following comment is 100% wrong?
>>>>
>>>> /*
>>>> * Unlike find_all_inheritors(), we need to walk on
>>>> child relations
>>>> * that have diamond inheritance tree, because this
>>>> function has to
>>>> * return correct expected inhecount to the caller.
>>>> */
>>>>
>>>> It seems to me that the right solution here is to just add one more
>>>> argument to find_all_inheritors(), something like List
>>>> **expected_inh_count.
>>>>
>>>> Am I missing something?
>>>
>>> The find_all_inheritors() does not walk on child relations more than
>>> two times, even if a child has multiple parents inherited from common
>>> origin, because list_concat_unique_oid() ignores the given OID if it
>>> is already on the list. It means all the child relations under the
>>> relation already walked on does not checked anywhere. (Of course,
>>> this assumption is correct for the purpose of find_all_inheritors()
>>> with minimum cost.)
>>>
>>> What we want to do here is to compute the number of times a certain
>>> child relation is inherited from a common origin; it shall be the
>>> expected-inhcount. So, we need an arrangement to the logic.
>>>
>>> For example, see the following diagram.
>>>
>>> T2
>>> / \
>>> T1 T4---T5
>>> \ /
>>> T3
>>>
>>> If we call find_all_inheritors() with T1. The find_inheritance_children()
>>> returns T2 and T3 for T1.
>>> Then, it calls find_inheritance_children() for T2, and it returns T4.
>>> Then, it calls find_inheritance_children() for T3, and it returns T4, but
>>> it is already in the "rels_list", so list_concat_unique_oid() ignores it.
>>> Then, it calls find_inheritance_children() for T4, and it returns T5.
>>>
>>> In this example, we want the expected inhcount for T2 and T3 should be 1,
>>> for T4 and T5 should be 2. However, it walks on T4 and T5 only once, so
>>> they will have 1 incorrectly.
>>> Even if we count up the ignored OID (T4), find_all_inheritors() does not
>>> walk on T5, because it is already walked on obviously when T4 is ignored.
>>
>> I think the count for T5 should be 1, and I think that the count for
>> T4 can easily be made to be 2 by coding the algorithm correctly.
>
> Ahh, it is right. I was confused.
>
> Is it possible to introduce the logic mathematical-strictly?
> Now I'm considering whether the find_all_inheritors() logic is suitable
> for any cases, or not.

I modified the logic to compute an expected inhcount of the child relations.

What we want to compute here is to compute the number of times a certain
relation being inherited directly from any relations delivered from a unique
origin. If the column to be renamed is eventually inherited from a common
origin, its attinhcount is not larger than the expected inhcount.

> WITH RECURSIVE r AS (
> SELECT 't1'::regclass AS inhrelid
> UNION ALL
> SELECT c.inhrelid FROM pg_inherits c, r WHERE r.inhrelid = c.inhparent
> ) -- r is all the child relations inherited from 't1'
> SELECT inhrelid::regclass, count(*) FROM pg_inherits
> WHERE inhparent IN (SELECT inhrelid FROM r) GROUP BY inhrelid;

The modified logic increments the expected inhcount of the relation already
walked on. If we found a child relation twice or more, it means the child
relation is at the junction of the inheritance tree.
In this case, we don't walk on the child relations any more, but it is not
necessary, because the first path already checked it.

The find_all_inheritors() is called from several points more than renameatt(),
so I added find_all_inheritors_with_inhcount() which takes argument of the
expected inhcount list. And, find_all_inheritors() was also modified to call
find_all_inheritors_with_inhcount() with NULL argument to avoid code duplication.

It is the result of Berrnd's test case.

- CVS HEAD
0.895s
0.903s
0.884s
0.896s
0.892s

- with V6 patch
0.972s
0.941s
0.961s
0.949s
0.946s

Thanks,
--
OSS Platform Development Division, NEC
KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>

Attachment Content-Type Size
pgsql-fix-inherit-rename.6.patch application/octect-stream 19.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Glaesemann 2010-01-29 03:37:39 Re: Pathological regexp match
Previous Message Alvaro Herrera 2010-01-29 03:00:50 Re: remove contrib/xml2