Re: Use quick select instead of qsort to get median

From: John Naylor <john(dot)naylor(at)enterprisedb(dot)com>
To: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Use quick select instead of qsort to get median
Date: 2021-07-22 14:02:28
Message-ID: CAFBsxsEWrCzKs6yx3+9-1ehWwU2UwewS2D58P2NpZkDA1Tbv_w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Jul 22, 2021 at 8:07 AM houzj(dot)fnst(at)fujitsu(dot)com <
houzj(dot)fnst(at)fujitsu(dot)com> wrote:
>
> Hi,
>
> When I was writing an extension which need to get the median of an array,
I
> tried to find if postgres provide some api that can do that. I found all
the
> places in postgres invoke qsort() and then get the median. I was thinking
can
> we do better by using "quick select" and is it worth it.

> Attach a POC patch about this idea. I did some simple performance tests,
I can
> see about 10% performance gain in this test[2].
>
> Thoughts ?

> 1.
> entry_dealloc
> ...
> /* Record the (approximate) median usage */
> if (i > 0)
> pgss->cur_median_usage = entries[i / 2]->counters.usage;

It might be useful to be more precise here, but it seems it would be
slower, too?

> -----test create index
> drop index quad_box_tbl_idx;
> CREATE INDEX quad_box_tbl_idx ON quad_box_tbl USING spgist(b);
>
> -------test results
> PATCH:
> Time: 2609.664 ms (00:02.610)
>
> HEAD:
> Time: 2903.765 ms (00:02.944)

That index type is pretty rare, as far as I know. That doesn't seem to be
quite enough motivation to change the qsort template. If the goal was to
improve the speed of "create spgist index", would this still be the best
approach? Also, there are other things under consideration that would add
complexity to the qsort template [1], and this would add even more.

Looking in the docs [2], we don't have a MEDIAN aggregate, but we do have
percentile_disc(), and quick select might help there, but I haven't looked.

[1] https://commitfest.postgresql.org/33/3038/
[2] https://www.postgresql.org/docs/14/functions-aggregate.html
--
John Naylor
EDB: http://www.enterprisedb.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2021-07-22 14:24:05 Re: printf %s with NULL pointer (was Re: BUG #17098: Assert failed on composing an error message when adding a type to an extension being dropped)
Previous Message Alvaro Herrera 2021-07-22 13:42:00 Re: Incorrect usage of strtol, atoi for non-numeric junk inputs