Skip site navigation (1) Skip section navigation (2)

Re: wip: functions median and percentile

From: Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, David Fetter <david(at)fetter(dot)org>
Subject: Re: wip: functions median and percentile
Date: 2010-09-20 03:47:09
Message-ID: (view raw, whole thread or download thread mbox)
Lists: pgsql-hackerspgsql-rrreviewers
2010/8/19 Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>:
> Hello
> I am sending a prototype implementation of functions median and
> percentile. This implementation is very simple and I moved it to
> contrib for this moment - it is more easy maintainable. Later I'll
> move it to core.

I've reviewed this patch.

* The patch can apply cleanly and make doesn't print any errors nor
warnings. But it doesn't touch contrib/Makefile so I had to make by
changing dir to contrib/median.

* Cosmetic coding style should be more appropriate, including trailing
white spaces and indentation positions.

* Since these two aggregates use tuplesort inside it, there're
possible risk to cause out of memory in normal use case. I don't think
this fact is critical, but at least some notation should be referred
in docs.

* It doesn't contain any document nor regression tests.

* They should be callable in window function context; for example

contrib_regression=# select median(a) over (order by a) from t1;
ERROR:  invalid tuplesort state

or at least user-friend message should be printed.

* The returned type is controversy. median(int) returns float8 as the
patch intended, but avg(int) returns numeric. AFAIK only avg(float8)
returns float8.

* percentile() is more problematic; first, the second argument for the
aggregate takes N of "N%ile" as int, like 50 if you want 50%ile,  but
it doesn't care about negative values or more than 100. In addition,
the second argument is taken at the first non-NULL value of the first
argument, but the second argument is semantically constant. For
example, for (1.. 10) value of a in table t1,

contrib_regression=# select percentile(a, a * 10 order by a) from t1;
(1 row)

contrib_regression=# select percentile(a, a * 10 order by a desc) from t1;
(1 row)

and if the argument comes from the subquery which doesn't contain
ORDER BY clause, you cannot predict the result.

That's all of my quick review up to now.


Hitoshi Harada

In response to


pgsql-hackers by date

Next:From: Bruce MomjianDate: 2010-09-20 03:47:35
Subject: Re: So where did that $Id$ come from?
Previous:From: Josh BerkusDate: 2010-09-20 02:31:52
Subject: Re: Configuring synchronous replication

pgsql-rrreviewers by date

Next:From: David FetterDate: 2010-09-20 05:11:02
Subject: Re: [HACKERS] wip: functions median and percentile
Previous:From: David FetterDate: 2010-09-17 16:53:47
Subject: client authentication hook

Privacy Policy | About PostgreSQL
Copyright © 1996-2018 The PostgreSQL Global Development Group