Re: BUG #5025: Aggregate function with subquery in 8.3 and 8.4.

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, "Sheng Y(dot) Cheng" <scheng(at)adconion(dot)com>, pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #5025: Aggregate function with subquery in 8.3 and 8.4.
Date: 2009-09-02 00:34:45
Message-ID: 13466.1251851685@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

I blithely opined:
> I believe BTW that there are related issues in other places where we
> expand composites into RowExprs. But the other places have been doing
> that for awhile. I think that for 8.4 our goals should be limited to
> not changing the behavior compared to prior releases.

So while I was testing my fix for this, I found out that that's more
complicated than I thought. Consider these examples in the regression
database:

select t1.q2, count(t2.*)
from int8_tbl t1 left join int8_tbl t2 on (t1.q2 = t2.q1)
group by t1.q2 order by 1;

select t1.q2, count(t2.*)
from int8_tbl t1 left join (select * from int8_tbl) t2 on (t1.q2 = t2.q1)
group by t1.q2 order by 1;

select t1.q2, count(t2.*)
from int8_tbl t1 left join (select * from int8_tbl offset 0) t2 on (t1.q2 = t2.q1)
group by t1.q2 order by 1;

select t1.q2, count(t2.*)
from int8_tbl t1 left join
(select q1, case when q2=1 then 1 else q2 end as q2 from int8_tbl) t2
on (t1.q2 = t2.q1)
group by t1.q2 order by 1;

If you believe that "t2.*" should go to NULL in a join-extended row,
then the correct answer for all four of these is

q2 | count
-------------------+-------
-4567890123456789 | 0
123 | 2
456 | 0
4567890123456789 | 6
(4 rows)

However, the actual behavior of every release since 8.0 has been that
the second case gives

q2 | count
-------------------+-------
-4567890123456789 | 1
123 | 2
456 | 1
4567890123456789 | 6
(4 rows)

ie, t2.* fails to go to NULL because it's expanded as ROW(t2.q1,t2.q2).
The OFFSET 0 in the third case restores expected behavior by preventing
flattening of the subquery, and up till 8.4 the CASE expression in the
fourth case did too.

With the fix I was just about to apply, all four cases give the first
set of results. This clearly satisfies the principle of least
astonishment, at least more nearly than what we have; but it equally
clearly is *not* going to restore 8.4 to work just like 8.3.

I'm inclined to apply the patch to 8.4 anyway, because it seems like a
bug fix. I would consider patching further back except there's no
chance of making it work in older branches, at least not without
destabilizing them quite a bit (the PlaceHolderVar mechanism would have
to be back-ported).

It might be possible to fix the older branches by not flattening
subqueries that have whole-row references; but even that would take
nontrivial work, and it'd be sacrificing performance to fix a corner
case no one has previously complained about. So I'm leaning to patching
8.4 and leaving the older branches alone.

Thoughts?

regards, tom lane

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Tom Lane 2009-09-02 02:01:31 Re: pg_ctl infinite loop and memory leak
Previous Message Jeff Davis 2009-09-01 23:48:06 Re: BUG #5028: CASE returns ELSE value always when type is "char"