Re: wip: functions median and percentile

From: Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, David Fetter <david(at)fetter(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: wip: functions median and percentile
Date: 2010-10-01 15:00:35
Message-ID: AANLkTi=fkJo59xfHYqe530O1Bdv0n01FQ8cuxAgOJxOt@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-rrreviewers

2010/10/1 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
> Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com> writes:
>> 2010/10/1 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
>>> If this patch tries to force the entire sort to happen in memory,
>>> it is not committable.
>
>> What about array_agg()? Doesn't it exceed memory even if the huge data come in?
>
> Yeah, but for array_agg the user should be expecting a result of
> approximately the size of the whole input, so if he overruns memory he
> hasn't got a lot of room to complain.  There is no reason for a user to
> expect that median or percentile will fall over on large input, and
> every reason to expect them to be more robust than that.

So it's particular problem of *median* but not the idea of
on-memory-guaranteed tuplesort.

If this way is not committable, one of alternatives is to implement
median as a window function rather than an aggregate. But the big
problem of this is that it's impossible to have two
same-input-same-name functions of aggregate and window. AFAIK they are
ambiguous at parser stage. So we have to have median() for aggregate
and something like median_w() over (). This is worse idea, I feel.

Another way is to modify nodeWindowAgg in some way, but I cannot wrap
up how to. To call some destructor in the end of partition somehow,
but this is out of the current aggregate system.

The bottom-line is to throw an error from median in window aggregate,
but personally I would like to see median in window aggregate, which
is quite smart.

Another suggestion?

Regards,

--
Hitoshi Harada

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dimitri Fontaine 2010-10-01 15:06:39 Re: is sync rep stalled?
Previous Message Magnus Hagander 2010-10-01 14:59:11 Re: Patch author name on commitfest page

Browse pgsql-rrreviewers by date

  From Date Subject
Next Message Tom Lane 2010-10-01 15:08:00 Re: wip: functions median and percentile
Previous Message Robert Haas 2010-10-01 14:44:44 Re: wip: functions median and percentile