Skip site navigation (1) Skip section navigation (2)

Re: wip: functions median and percentile

From: Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, David Fetter <david(at)fetter(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: wip: functions median and percentile
Date: 2010-10-01 15:00:35
Message-ID: AANLkTi=fkJo59xfHYqe530O1Bdv0n01FQ8cuxAgOJxOt@mail.gmail.com (view raw or flat)
Thread:
Lists: pgsql-hackerspgsql-rrreviewers
2010/10/1 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
> Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com> writes:
>> 2010/10/1 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
>>> If this patch tries to force the entire sort to happen in memory,
>>> it is not committable.
>
>> What about array_agg()? Doesn't it exceed memory even if the huge data come in?
>
> Yeah, but for array_agg the user should be expecting a result of
> approximately the size of the whole input, so if he overruns memory he
> hasn't got a lot of room to complain.  There is no reason for a user to
> expect that median or percentile will fall over on large input, and
> every reason to expect them to be more robust than that.

So it's particular problem of *median* but not the idea of
on-memory-guaranteed tuplesort.

If this way is not committable, one of alternatives is to implement
median as a window function rather than an aggregate. But the big
problem of this is that it's impossible to have two
same-input-same-name functions of aggregate and window. AFAIK they are
ambiguous at parser stage. So we have to have median() for aggregate
and something like median_w() over (). This is worse idea, I feel.

Another way is to modify nodeWindowAgg in some way, but I cannot wrap
up how to. To call some destructor in the end of partition somehow,
but this is out of the current aggregate system.

The bottom-line is to throw an error from median in window aggregate,
but personally I would like to see median in window aggregate, which
is quite smart.

Another suggestion?

Regards,


-- 
Hitoshi Harada

In response to

Responses

pgsql-hackers by date

Next:From: Dimitri FontaineDate: 2010-10-01 15:06:39
Subject: Re: is sync rep stalled?
Previous:From: Magnus HaganderDate: 2010-10-01 14:59:11
Subject: Re: Patch author name on commitfest page

pgsql-rrreviewers by date

Next:From: Tom LaneDate: 2010-10-01 15:08:00
Subject: Re: wip: functions median and percentile
Previous:From: Robert HaasDate: 2010-10-01 14:44:44
Subject: Re: wip: functions median and percentile

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group