Re: BUG #4916: wish: more statistical functions (median, percentiles etc)

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Richard Neill <rn214(at)cam(dot)ac(dot)uk>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #4916: wish: more statistical functions (median, percentiles etc)
Date: 2009-07-11 06:34:12
Message-ID: 162867790907102334r71db0227jfa0e4bd96f48b8e4@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs pgsql-general

2009/7/11 Richard Neill <rn214(at)cam(dot)ac(dot)uk>:
>
> The following bug has been logged online:
>
> Bug reference:      4916
> Logged by:          Richard Neill
> Email address:      rn214(at)cam(dot)ac(dot)uk
> PostgreSQL version: 8.4
> Operating system:   Linux
> Description:        wish: more statistical functions (median, percentiles
> etc)
> Details:
>
> In addition to the existing aggregate functions (avg, stddev etc),  it would
> be nice if postgres could return further information. For example, the
> quartiles, percentiles, and median.
>
> [mode would also be useful, as an explicit function, though we can get it
> easily enough using count(1) order by count desc].
>
> According to google, this has been a wish since at least year 2000 for
> various people, but doesn't seem to be implemented.
>
> Thanks - Richard

hello

you can use following tricks:

create or replace function nth_percentil(anyarray, int)
returns anyelement as $$
select $1[$2/100.0 * array_upper($1,1) + 1];
$$ language sql immutable strict;

pagila=# select nth_percentil(array(select length from film order by 1),90);
nth_percentil
---------------
173
(1 row)

pagila=# select count(case when length < 173 then 1 end)::float /
count(*) * 100.0
from film;
?column?
----------
89.6
(1 row)

create or replace function median(anyarray)
returns float as $$
select ($1[round(array_upper($1,1)/2.0)] + $1[array_upper($1,1) -
round(array_upper($1,1)/2.0) + 1]) / 2.0::float;
$$ language sql immutable strict;

pagila=# select median(array[1,2]), median(array[1,2,3]),
median(array[1,2,3,4]);
median | median | median
--------+--------+--------
1.5 | 2 | 2.5
(1 row)

pagila=# select median(array(select length from film order by 1));
median
--------
114
(1 row)

tested on pagila database

regards
Pavel Stehule
>
> --
> Sent via pgsql-bugs mailing list (pgsql-bugs(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-bugs
>

In response to

Browse pgsql-bugs by date

  From Date Subject
Next Message Jasen Betts 2009-07-11 14:12:42 Re: BUG #4916: wish: more statistical functions (median, percentiles etc)
Previous Message Joshua Tolley 2009-07-11 01:21:00 Re: BUG #4916: wish: more statistical functions (median, percentiles etc)

Browse pgsql-general by date

  From Date Subject
Next Message Dennis Gearon 2009-07-11 07:21:39 Date Time Arithmetic Speed
Previous Message Peter Hunsberger 2009-07-11 02:56:09 Re: Using Postgres to store genetic data