Re: wip: functions median and percentile

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>, David Fetter <david(at)fetter(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: wip: functions median and percentile
Date: 2010-10-01 13:19:57
Message-ID: AANLkTikEKnE9Q9ZOUNNZ9G2jsfWHrhBkTPRcU40WM7Qu@mail.gmail.com
Lists: pgsql-hackers pgsql-rrreviewers

2010/10/1 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
> Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com> writes:
>> 2010/9/26 Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>:
>>> This patch needs a little work - it can share compare functionality with
>>> tuplesort.c, but I would like to verify the concept now.
>
>> Sorry for delay. I read the patch and it seems the result is sane. For
>> window function calls, I agree that the current tuplesort is not
>> enough to implement median functions and the patch introduces its own
>> memsort mechanism, although memsort has too much copied from
>> tuplesort. It looks to me not so difficult to modify the existing
>> tuplesort to guarantee staying in memory always if an option to do so
>> is specified from caller. I think that option can be used by other
>> cases in the core code.
>
> If this patch tries to force the entire sort to happen in memory,
> it is not committable.  What will happen when you get a lot of data?
> You need to be working on a variant that will work anyway, not working
> on an unacceptable lobotomization of the main sort code.

The median function checks its calling context: as a window aggregate
it uses a memory-only sort, and as a standard aggregate it uses a full
tuplesort. This follows from a requirement of window aggregates: the
final function has to be called more than once, and that isn't possible
with tuplesort. So as a window aggregate it uses only an in-memory sort,
limited by work_mem. Other usage is unlimited. The second option was to
block the median function under window aggregates.

Is this design possible? We cannot use any complex resource under window
aggregates, because there isn't any way to release it.
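
To make the context-dependent behaviour concrete, here is a minimal
standalone sketch in plain C. It is not the patch code and does not use
PostgreSQL headers: CallContext, MedianState and the median_* functions
are hypothetical names, and mem_limit only plays the role of work_mem.
The aggregate branch simply lifts the limit; it does not model
tuplesort spilling to disk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the calling context the median function checks. */
typedef enum { CALL_AS_AGGREGATE, CALL_AS_WINDOW } CallContext;

typedef struct {
    CallContext ctx;
    double *vals;       /* buffered input values, kept unsorted */
    size_t  n;          /* number of buffered values */
    size_t  cap;        /* allocated slots */
    size_t  mem_limit;  /* bytes; assumed analogue of work_mem, window context only */
} MedianState;

static int cmp_double(const void *a, const void *b)
{
    double x = *(const double *) a;
    double y = *(const double *) b;
    return (x > y) - (x < y);
}

void median_init(MedianState *st, CallContext ctx, size_t mem_limit)
{
    assert(st != NULL);
    st->ctx = ctx;
    st->vals = NULL;
    st->n = 0;
    st->cap = 0;
    st->mem_limit = mem_limit;
}

/* Transition step: buffer one value. Returns false if the window-context
 * budget (counted over buffered values) would be exceeded, or if
 * allocation fails. */
bool median_accum(MedianState *st, double v)
{
    if (st->ctx == CALL_AS_WINDOW &&
        (st->n + 1) * sizeof(double) > st->mem_limit)
        return false;

    if (st->n == st->cap) {
        size_t  newcap = st->cap ? st->cap * 2 : 8;
        double *p = realloc(st->vals, newcap * sizeof(double));
        if (p == NULL)
            return false;
        st->vals = p;
        st->cap = newcap;
    }
    st->vals[st->n++] = v;
    return true;
}

/* Final step: sorts a copy, so the state is untouched and the function
 * can be called again after more values have been accumulated. */
bool median_final(const MedianState *st, double *out)
{
    if (st->n == 0)
        return false;

    double *tmp = malloc(st->n * sizeof(double));
    if (tmp == NULL)
        return false;
    memcpy(tmp, st->vals, st->n * sizeof(double));
    qsort(tmp, st->n, sizeof(double), cmp_double);

    *out = (st->n % 2 == 1)
        ? tmp[st->n / 2]
        : (tmp[st->n / 2 - 1] + tmp[st->n / 2]) / 2.0;
    free(tmp);
    return true;
}

void median_free(MedianState *st)
{
    free(st->vals);
    st->vals = NULL;
    st->n = 0;
    st->cap = 0;
}
```

The point of the sketch is that median_final leaves the state intact,
which is what repeated calls under a window aggregate need, and that
only the window path is capped.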

Regards

Pavel
>
>                        regards, tom lane
>
