Skip site navigation (1) Skip section navigation (2)

Re: wip: functions median and percentile

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>
Cc: Greg Stark <gsstark(at)mit(dot)edu>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, David Fetter <david(at)fetter(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: wip: functions median and percentile
Date: 2010-10-13 10:36:59
Message-ID: AANLkTi=UH-KAkEqMPCe4oRnuptEYyp7uVSnjzw+nFUNA@mail.gmail.com (view raw or flat)
Thread:
Lists: pgsql-hackerspgsql-rrreviewers
Hello

I am looking on SQL standard for some info about "within group"
clause. This clause is necessary for functions:

rank, dense_rank, cume_dist, percent_rank and percentile_disc and
persentile_cont. These functions needs a clause "WITHIN GROUP".

If I understand, then these functions are not simple aggregates - its
some between window functions and aggregates.

Questions:

* is clause "WITHIN GROUP" just syntactic sugar for our aggregate with
ORDER BY? I am thinking so not. There are predefined set of functions
that can be used with this clause.

* what is correct implementation of these functions?  When I am
looking on parameters, these functions are very similar to window
functions. So there are two two ways for implementation. Implement it
as special case of window functions or implement it as special case of
aggregates.

Regards

Pavel Stehule

2010/10/12 Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>:
> 2010/10/12 Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>:
>> Hello
>>
>> 2010/10/11 Greg Stark <gsstark(at)mit(dot)edu>:
>>> On Sun, Oct 10, 2010 at 2:16 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>>> It was pointed out upthread that while median isn't presently
>>>> in the standard, Oracle defines it in terms of percentile_cont(0.5)
>>>> which *is* in the standard.
>>>
>>> Uhmm, then why don't we implement that? We could provide median() as a
>>> short-cut but percentile_cont() doesn't sound much harder to implement
>>> than median() and more general.
>>
>> The problem is in interface. The original patch did it, but I removed
>> it. We cannot to unsure immutability of some parameters now. Can we
>> enhance a AGGREGATE to allow some mark like IMMUTABLE parameter and
>> probably we should to support ANSI syntax:
>>
>> PERCENTILE_CONT ( expression1 )
>> WITHIN GROUP ( ORDER BY expression2 [ ASC | DESC ] )
>>
>> This syntax allows to divide a muttable and immutable parameters.
>
> If this is only a syntax sugar for mutable/immutable parameter, then I
> guess it's time to take it serious to implement in our syntax,
> although I'm not sure if it affects more execution model than
> interface.
>
> Regards,
>
>
>
> --
> Hitoshi Harada
>

In response to

pgsql-hackers by date

Next:From: Vitalii TymchyshynDate: 2010-10-13 10:41:52
Subject: Re: Slow count(*) again...
Previous:From: Neil WhelchelDate: 2010-10-13 10:16:11
Subject: Re: Slow count(*) again...

pgsql-rrreviewers by date

Next:From: Peter EisentrautDate: 2010-10-13 10:48:17
Subject: Re: wip: functions median and percentile
Previous:From: Pavel StehuleDate: 2010-10-12 08:44:20
Subject: Re: wip: functions median and percentile

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group