Re: median for postgresql 8.3

From: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: maarten <maarten(dot)foque(at)edchq(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: median for postgresql 8.3
Date: 2010-11-17 15:31:35
Message-ID: AANLkTimcqphjdaOJvVwQzbjVF7J0Q_OwgwU4pNSN1iQL@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On 16 November 2010 17:37, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
> Hello
>
> see http://okbob.blogspot.com/2009/11/aggregate-function-median-in-postgresql.html
>

An 8.3-compatible way of doing it is:

SELECT CASE WHEN c % 2 = 0 AND c > 1 THEN (a[1]+a[2])/2 ELSE a[1] END
FROM
(
SELECT ARRAY(SELECT a FROM milrows ORDER BY a OFFSET (c-1)/2 LIMIT 2) AS a, c
FROM (SELECT count(*) AS c FROM milrows) AS count
OFFSET 0
)
AS midrows;

In my tests this is faster than the analytic and array-based methods,
but not by a huge amount.

Regards,
Dean

> Regards
>
> Pavel Stehule
>
> 2010/11/16 maarten <maarten(dot)foque(at)edchq(dot)com>:
>> Hello everyone,
>>
>> I was doing some analysis of data to find average delays between some
>> timestamp values etc...
>> When the number of rows the average is computed over is small, this can
>> give distorted values.  So I've obviously added a count column to see if
>> the average represents much data.
>> However, I would also like to add the median value to give me a pretty
>> good idea of whats happening even for smaller counts.
>>
>> I couldn't find such an aggregate function in the manual (version 8.3)
>> and some websearching didn't uncover it either.
>>
>> I was thinking about
>> SELECT max(id) FROM test ORDER BY id ASC LIMIT
>>        (SELECT count(*)/2 FROM test)
>>
>> But two things are wrong with that:
>> Limit can't use subqueries :(
>> And ORDER BY gives me the error: 'must be used in aggregate function
>> etc...) but I can probably work around this by using an ordered subquery
>> in stead of the table directly.
>>
>> Furthermore, I need the median for a timestamp column, which would
>> probably complicate things more than when it is a number column.
>>
>> I'd like to be able to do this using only the database. (So no
>> programming functions, special addons etc...)
>>
>> Any ideas anyone?
>>
>> regards,
>> Maarten
>>
>>
>> --
>> Sent via pgsql-general mailing list (pgsql-general(at)postgresql(dot)org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-general
>>
>
> --
> Sent via pgsql-general mailing list (pgsql-general(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Derrick Rice 2010-11-17 15:36:52 Re: Survey on backing up unlogged tables: help us with PostgreSQL development!
Previous Message Tom Lane 2010-11-17 15:22:29 Re: where is pg_stat_activity (and others) in the documentation?