Skip site navigation (1) Skip section navigation (2)

Re: wip: functions median and percentile

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, David Fetter <david(at)fetter(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: wip: functions median and percentile
Date: 2010-10-01 19:17:08
Message-ID: AANLkTinZwWHnssyER3Hi+L=V42O9sF43r_Duw7UL9H=6@mail.gmail.com (view raw or flat)
Thread:
Lists: pgsql-hackerspgsql-rrreviewers
Hello

updated version
  * memsort removed
  * window aggregate support blocked

Regards

Pavel

2010/10/1 Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>:
> 2010/10/1 Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>:
>> 2010/10/2 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
>>> Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com> writes:
>>>> 2010/10/2 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
>>>>> The implementation I would've expected to see is to do the sort and then
>>>>> have two code paths for retrieving the median, depending on whether the
>>>>> sort result is all in memory or not.
>>>
>>>> Hm? The problem we encountered in the middle of the patch is there is
>>>> no chance to call tuplesort_end if median is called in moving frame
>>>> window aggregate because final function is called multiple times
>>>> during moving.
>>>
>>> Well, if you haven't got a solution for that, then this patch isn't
>>> ready for prime time.
>>>
>>> It's entirely possible that median as a window function is intractable.
>>> I'd rather have it throwing error than offer an implementation that will
>>> fall over as soon as the window gets large.
>>
>> Well, that sounds like the conclusion. It is a shame, but we have to
>> throw an error from median() in the window aggregate, if Pavel does
>> not have any better solution. And as an aggregate function only, the
>> patch is ready if the window-related parts are removed.
>>
>
> I am sorry - I don't have a better solution. Classic algorithm isn't
> well for window aggregate - it needs a sort after any append a new
> item. Maybe we can use a separate functionality based on estimated
> values for a windows. I read some articles about it. But this is work
> on longer time - all articles about this topic are experimental. More
> I am not mathematician - so I am not able to review these methods.
> Today or tomorrow I'll send a updated patch without support a window
> aggregates.
>
> Regards
>
> Pavel Stehule
>
>> Regards,
>>
>>
>> --
>> Hitoshi Harada
>>
>

Attachment: median.diff
Description: application/octet-stream (16.2 KB)

In response to

Responses

pgsql-hackers by date

Next:From: Tom LaneDate: 2010-10-01 19:25:08
Subject: Re: timestamp_in DirectFunctionCall
Previous:From: Tom LaneDate: 2010-10-01 19:14:14
Subject: Re: So git pull is shorthand for what exactly?

pgsql-rrreviewers by date

Next:From: Jaime CasanovaDate: 2010-10-01 20:42:54
Subject: Re: Patch reviewers
Previous:From: ๏̯͡๏ Guido BarosioDate: 2010-10-01 17:48:11
Subject: Patch reviewers

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group