Skip site navigation (1) Skip section navigation (2)

Re: wip: functions median and percentile

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, David Fetter <david(at)fetter(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: wip: functions median and percentile
Date: 2010-10-01 14:44:44
Message-ID: AANLkTinejbxAwrAsHxVWCOs0pyFdm-s5AXpJZ-GuWwY7@mail.gmail.com (view raw or flat)
Thread:
Lists: pgsql-hackerspgsql-rrreviewers
On Fri, Oct 1, 2010 at 10:11 AM, Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com> wrote:
> 2010/10/1 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
>> Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com> writes:
>>> 2010/9/26 Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>:
>>>> This patch needs a few work - can share a compare functionality with
>>>> tuplesort.c, but I would to verify a concept now.
>>
>>> Sorry for delay. I read the patch and it seems the result is sane. For
>>> window function calls, I agree that the current tuplesort is not
>>> enough to implement median functions and the patch introduces its own
>>> memsort mechanism, although memsort has too much copied from
>>> tuplesort. It looks to me not so difficult to modify the existing
>>> tuplesort to guarantee staying in memory always if an option to do so
>>> is specified from caller. I think that option can be used by other
>>> cases in the core code.
>>
>> If this patch tries to force the entire sort to happen in memory,
>> it is not committable.  What will happen when you get a lot of data?
>> You need to be working on a variant that will work anyway, not working
>> on an unacceptable lobotomization of the main sort code.
>
> What about array_agg()? Doesn't it exceed memory even if the huge data come in?

So, if you have 512MB of RAM in the box and you build and return a 1GB
array, it's going to be a problem.  Period, full stop.  The interim
memory consumption cannot be less than the size of the final result.

If you have 512MB of RAM in the box and you want to aggregate 1GB of
data and return a 4 byte integer, it's only a problem if your
implementation is bad.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

In response to

pgsql-hackers by date

Next:From: Robert HaasDate: 2010-10-01 14:47:11
Subject: Re: I: About "Our CLUSTER implementation is pessimal" patch
Previous:From: Tom LaneDate: 2010-10-01 14:43:47
Subject: Re: wip: functions median and percentile

pgsql-rrreviewers by date

Next:From: Hitoshi HaradaDate: 2010-10-01 15:00:35
Subject: Re: wip: functions median and percentile
Previous:From: Tom LaneDate: 2010-10-01 14:43:47
Subject: Re: wip: functions median and percentile

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group