Re: Windowing Function Patch Review -> Standard Conformance

From: "Hitoshi Harada" <umi(dot)tanuki(at)gmail(dot)com>
To: "David Rowley" <dgrowley(at)gmail(dot)com>
Cc: heikki(dot)linnakangas(at)enterprisedb(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Windowing Function Patch Review -> Standard Conformance
Date: 2008-11-26 03:42:28
Message-ID: e08cc0400811251942y7ed803cav47470c61e9736280@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

2008/11/26 David Rowley <dgrowley(at)gmail(dot)com>:
> Hitoshi Harada wrote:
>> 2008/11/20 David Rowley <dgrowley(at)gmail(dot)com>:
>> > -- The following query gives incorrect results on the
>> > -- maxhighbid column
>> >
>> > SELECT auctionid,
>> > category,
>> > description,
>> > highestbid,
>> > reserve,
>> > MAX(highestbid) OVER (ORDER BY auctionid) AS maxhighbid,
>> > MAX(reserve) OVER() AS highest_reserve
>> > FROM auction_items;
>> >
>> > If you remove the highest_reserve column you get the correct results.
>> >
>> > The bug also affects MIN, AVG, COUNT, STDDEV but not SUM.
>> Good report! I fixed this bug, which was by a trival missuse of
>> list_concat() in the planner. This was occurred when the first
>> aggregate trans func is strict and the second aggregate argument may
>> be null. Yep, the argument of the second was implicitly passed to the
>> first wrongly. That's why it didn't occur if you delete the second
>> MAX().
>>
>> I added a test case with the existing data emulating yours (named
>> "strict aggs") but if it is wrong, let me know.
>
>
> It's not quite right yet. I'm also getting regression tests failing on
> window. Let me know if you want the diffs.
>
> I've created a query that uses the table in your regression test.
> max_salary1 gives incorrect results. If you remove the max_salary2 column it
> gives the correct results.
>
> Please excuse the lack of sanity with the query. I had to do it this way to
> get 2 columns with NULLs.
>
>
> SELECT depname,
> empno,
> salary,
> salary1,
> salary2,
> MAX(salary1) OVER (ORDER BY empno) AS max_salary1,
> MAX(salary2) OVER() AS max_salary2
> FROM (SELECT depname,
> empno,
> salary,
> (CASE WHEN salary < 5000 THEN NULL ELSE salary END) AS salary1,
> (CASE WHEN salary >= 5000 THEN NULL ELSE salary END) AS salary2
> FROM empsalary
> ) empsalary;
>
> Actual results:
>
> depname | empno | salary | salary1 | salary2 | max_salary1 | max_salary2
> -----------+-------+--------+---------+---------+-------------+-------------
> sales | 1 | 5000 | 5000 | | | 4800
> personnel | 2 | 3900 | | 3900 | | 4800
> sales | 3 | 4800 | | 4800 | | 4800
> sales | 4 | 4800 | | 4800 | | 4800
> personnel | 5 | 3500 | | 3500 | | 4800
> develop | 7 | 4200 | | 4200 | | 4800
> develop | 8 | 6000 | 6000 | | | 4800
> develop | 9 | 4500 | | 4500 | | 4800
> develop | 10 | 5200 | 5200 | | | 4800
> develop | 11 | 5200 | 5200 | | | 4800
>
>
> Correct results:
>
> depname | empno | salary | salary1 | salary2 | max_salary1 | max_salary2
> -----------+-------+--------+---------+---------+-------------+-------------
> sales | 1 | 5000 | 5000 | | 5000 | 4800
> personnel | 2 | 3900 | | 3900 | 5000 | 4800
> sales | 3 | 4800 | | 4800 | 5000 | 4800
> sales | 4 | 4800 | | 4800 | 5000 | 4800
> personnel | 5 | 3500 | | 3500 | 5000 | 4800
> develop | 7 | 4200 | | 4200 | 5000 | 4800
> develop | 8 | 6000 | 6000 | | 6000 | 4800
> develop | 9 | 4500 | | 4500 | 6000 | 4800
> develop | 10 | 5200 | 5200 | | 6000 | 4800
> develop | 11 | 5200 | 5200 | | 6000 | 4800
>
>
> This might be a good regression test once it's fixed.
>

Hmm, did you apply the latest patch correctly? My build can produce
right results, so I don't see why it isn't fixed. Make sure the lines
around 2420-2430 in planner.c like:
/*
* must copyObject() to avoid args concatenating with each other.
*/
pulled_exprs = list_concat(pulled_exprs, copyObject(wfunc->args));

where copyObject() is added.

I'm not sure if this is related, another bug is found:

*** a/src/backend/nodes/equalfuncs.c
--- b/src/backend/nodes/equalfuncs.c
***************
*** 2246,2251 **** equal(void *a, void *b)
--- 2246,2252 ----
break;
case T_Aggref:
retval = _equalAggref(a, b);
+ break;
case T_WFunc:
retval = _equalWFunc(a, b);
break;

don't laugh at... could you try it and test again?

If whole the new patch is needed, tell me then will send it.

> I'm at a bit of a loss to what to do now. Should I wait for your work
> Heikki? Or continue validating this patch?
>
> The best thing I can think to do right now is continue and any problems I
> find you can add regression tests for, then if we keep your regression tests
> for Heikki's changes then we can validate those changes more quickly.
>
> Any thoughts? Better ideas?

Thanks to your great tests, we now know much more about specification
and where to fail easily, so continuing makes sense but it may be good
time to take a rest and wait for Heikki's patch completing.

Regards,

--
Hitoshi Harada

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2008-11-26 04:40:29 Re: [WIP] In-place upgrade
Previous Message Robert Haas 2008-11-26 03:41:11 Re: Column reordering in pg_dump