Re: [BUGS] BUG #4070: Join more then ~15 tables let postgreSQL produces wrong data

From: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
To: pgsql-patches <pgsql-patches(at)postgresql(dot)org>
Cc: "Ceschia, Marcello" <Marcello(dot)Ceschia(at)medizin(dot)uni-leipzig(dot)de>, PostgreSQL Bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: [BUGS] BUG #4070: Join more then ~15 tables let postgreSQL produces wrong data
Date: 2008-04-03 13:22:34
Message-ID: 47F4DA1A.3000508@enterprisedb.com
Lists: pgsql-bugs pgsql-patches

Heikki Linnakangas wrote:
> Ceschia, Marcello wrote:
>> In query "query_not_working" all values from column "136_119" has the
>> value of the first column.
>>
>> Using the splitted query ("working_version") it works.
>>
>> I hope this data will help to find the bug.
>
> Thanks.
>
> Oh, the query actually gives an assertion failure on an
> assertion-enabled build, so this is clearly a bug:
>
> TRAP: FailedAssertion("!(attnum > 0 && attnum <=
> list_length(rte->joinaliasvars))", File: "parse_relation.c", Line: 1697)
>
> gdb tells that attnum is -31393 at that point. That's because
> get_rte_attribute_type() takes an AttrNumber, which is int16, and
> make_var() is trying to pass 34143, so it overflows.
>
> It seems we should extend AttrNumber to int32; we don't use AttrNumber
> in any of the on-disk structs. Though you still couldn't have more than
> MaxHeapAttributeNumber (1600) attributes in a table or
> MaxTupleAttributeNumber (1664) in a result set or intermediate tuples,
> like the output of a sort node, at least you could join ridiculously
> wide tables like that as long as you project out enough columns.
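
For illustration, the wraparound described above can be reproduced in
isolation. This is only a sketch: the typedef assumes AttrNumber is a
16-bit signed integer, as stated above, and as_seen_through_attrnumber()
is a hypothetical helper, not a function from the PostgreSQL tree.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: AttrNumber is int16, as described in the mail above. */
typedef int16_t AttrNumber;

/*
 * Hypothetical helper: returns the value a callee would see if an int
 * were passed through an AttrNumber parameter.  Converting an
 * out-of-range int to int16_t is implementation-defined in C; on the
 * usual two's-complement platforms the value wraps modulo 65536.
 */
static int
as_seen_through_attrnumber(int attnum)
{
    AttrNumber narrowed;

    assert(sizeof(AttrNumber) == 2);
    narrowed = (AttrNumber) attnum;
    return (int) narrowed;
}
```

On such a platform, as_seen_through_attrnumber(34143) yields -31393
(34143 - 65536), which matches the value gdb reported, while anything
up to 32767 passes through unchanged.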

Attached is a self-contained test script to reproduce this. It triggers
an assertion failure on every release from 8.1 through CVS HEAD. On 8.0,
it runs for ~5 minutes and finally fails with an
"ERROR: invalid varattno -32768" elog. On 7.4, it runs even longer, but
returns the correct result in the end. Looking at the code, I believe
the same bug is present in 8.0 and 7.4 as well, but is masked by
something else in those releases.

On second thought, expanding AttrNumber to int32 wholesale might not be
a good idea, because AttrNumber appears in the function signature of
TupleDescInitEntry and some other functions that might be called from
C-language user-defined functions. Any such extension code would need
to be recompiled. Is this something to worry about?

Another approach is to track down all uses of AttrNumber where it's used
to refer to an entry in a target list (varattno), and change those to
plain ints. Attached is a patch to do that. This seems like a safer
approach, but I'm slightly worried that I might've missed some variables
that need to be changed.
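
To make the idea concrete, here is a rough sketch of why widening only
the target-list positions is enough for a range check like the one in
the failed assertion. Both functions and the list length of 40000 are
made up for illustration; they are not taken from the attached patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: AttrNumber stays int16, as it is today. */
typedef int16_t AttrNumber;

/* Hypothetical range check with the position passed as AttrNumber. */
static bool
position_ok_narrow(AttrNumber attnum, int list_len)
{
    assert(list_len >= 0);
    return attnum > 0 && attnum <= list_len;
}

/* The same check with the position passed as a plain int. */
static bool
position_ok_wide(int attnum, int list_len)
{
    assert(list_len >= 0);
    return attnum > 0 && attnum <= list_len;
}
```

With a position of 34143 and a list of 40000 entries, the int version
accepts the value, while the AttrNumber version sees a wrapped,
negative number on typical platforms and rejects it. Table attribute
numbers, which stay within MaxHeapAttributeNumber, fit in either type.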

Thoughts?

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

Attachment            Content-Type  Size
varattno-crash.sql    text/x-sql    23.9 KB
varattno-int-1.patch  text/x-diff   12.5 KB
