Re: review: More frame options in window functions

From: Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>
To: Erik Rijkers <er(at)xs4all(dot)nl>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: review: More frame options in window functions
Date: 2010-01-16 21:26:57
Message-ID: e08cc0401001161326x14bdf398l6fa16186fb298e2a@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

2010/1/17 Erik Rijkers <er(at)xs4all(dot)nl>:
> On Sat, January 16, 2010 09:29, Hitoshi Harada wrote:
>> 2010/1/16 Erik Rijkers <er(at)xs4all(dot)nl>:
>>>
>>>> Thanks for the review. I've found another crash today and attached is
>>>> fixed version. The case is:
>>>>
>>>> SELECT four, sum(ten) over (PARTITION BY four ORDER BY four RANGE 1
>>>> PRECEDING) FROM tenk1 WHERE unique1 < 10;
>>>>
>>>
>>> Hi,
>>>
>>> The patch (more_frame_options.20100115.patch.gz) applies cleanly, but the regression test gives:
>>
>> It doesn't happen here. Could you send more information like configure
>> options and EXPLAIN output of that query?
>>
>
> Sorry, I should have done that the first time.  Here is more info:
>
> Linux 2.6.18-164.el5 x86_64 / CentOS release 5.4 (Final)

Thanks, I confirmed it on my 64bit environment. My approach to solve
this was completely wrong.

The problem here is that when PARTITION BY clause equals to ORDER BY
clause window_pathkeys is canonicalized in make_pathkeys_for_window()
and so get_column_info_for_window() recognizes number of ORDER BY
columns as 0, in other word, window->ordNumCols = 0 &&
window->ordColIdx[0] is invalid. This behavior is ok in 8.4 because in
that case all rows are peer to each other and the partition = the
frame in RANGE mode. This assumption is now broken since
FRAMEOPTION_START_OFFSET and FRAMEOPTION_END_OFFSET are introduced,
which means that the current row can be out of the frame. So with
these options we must always evaluate if the current row is inside the
frame or not by *sort key column*. However, we don't have it in the
executor as the planner has removed it during canonicalization of
pathkeys.

So I previously added such code:
*** 2819,2825 **** get_column_info_for_window(PlannerInfo *root,
WindowClause *wc, List *tlist,
int numPart = list_length(wc->partitionClause);
int numOrder = list_length(wc->orderClause);

! if (numSortCols == numPart + numOrder)
{
/* easy case */
*partNumCols = numPart;
--- 2836,2847 ----
int numPart = list_length(wc->partitionClause);
int numOrder = list_length(wc->orderClause);

! /*
! * in RANGE offset mode, numOrder should be represented as-is.
! */
! if (numSortCols == numPart + numOrder ||
! (wc->frameOptions & FRAMEOPTION_RANGE &&
! wc->frameOptions & (FRAMEOPTION_START_VALUE | FRAMEOPTION_END_VALUE)))
{
/* easy case */
*partNumCols = numPart;

but it failed now, since the "sortColIdx" passed in has been
canonicalized already. And I tried to change not to canonicalize the
pathkeys in make_pathkeys_window() in such cases and succeeded then
passed all regression tests.

diff --git a/src/backend/optimizer/plan/planner.c
b/src/backend/optimizer/plan/planner.c
index 6ba051d..fee1302 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -1417,11 +1417,17 @@ grouping_planner(PlannerInfo *root, double
tuple_fraction)
int ordNumCols;
AttrNumber *ordColIdx;
Oid *ordOperators;
+ bool canonicalize;
+
+ canonicalize = !(wc->frameOptions & FRAMEOPTION_RANGE &&
+ wc->frameOptions &
+ (FRAMEOPTION_START_VALUE |
+ FRAMEOPTION_END_VALUE));

window_pathkeys = make_pathkeys_for_window(root,
wc,
tlist,
- true);
+ canonicalize);

/*
* This is a bit tricky: we build a sort node even if we don't

I am not very sure if this is the correct answer. Thoughts?

Regards,

--
Hitoshi Harada

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2010-01-16 22:49:29 Re: review: More frame options in window functions
Previous Message Kevin Grittner 2010-01-16 20:44:26 Re: Review: Patch: Allow substring/replace() to get/set bit values