Re: qsort again (was Re: [PERFORM] Strange Create Index

From: Ron <rjpeace(at)earthlink(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>,pgsql-performance(at)postgresql(dot)org, pgsql-hackers(at)postgresql(dot)org
Subject: Re: qsort again (was Re: [PERFORM] Strange Create Index
Date: 2006-02-16 00:57:51
Message-ID: 7.0.1.0.2.20060215194635.03b55da0@earthlink.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-performance

This behavior is consistent with the pivot choosing algorithm
assuming certain distribution(s) for the data. For instance,
median-of-three partitioning is known to be pessimal when the data is
geometrically or hyper-geometrically distributed. Also, care must be
taken that sometimes is not when there are many equal values in the
data. Even pseudo random number generator based pivot choosing
algorithms are not immune if the PRNG is flawed in some way.

How are we choosing our pivots?

At 06:28 PM 2/15/2006, Tom Lane wrote:

>I did some experimentation comparing the qsort from Fedora Core 4
>(glibc-2.3.5-10.3) with our src/port/qsort.c. For those who weren't
>following the pgsql-performance thread, the test case is just this
>repeated a lot of times:
>
>create table atest(i int4, r int4);
>insert into atest (i,r) select generate_series(1,100000), 0;
>insert into atest (i,r) select generate_series(1,100000), random()*100000;
>\timing
>create index idx on atest(r);
>\timing
>drop table atest;
>
>I did this 100 times and sorted the reported runtimes. (Investigation
>with trace_sort = on confirms that the runtime is almost entirely spent
>in qsort() called from our performsort --- the Postgres overhead is
>about 100msec on this machine.) Results are below.
>
>It seems clear that our qsort.c is doing a pretty awful job of picking
>qsort pivots, while glibc is mostly managing not to make that mistake.
>I haven't looked at the glibc code yet to see what they are doing
>differently.
>
>I'd say this puts a considerable damper on my enthusiasm for using our
>qsort all the time, as was recently debated in this thread:
>http://archives.postgresql.org/pgsql-hackers/2005-12/msg00610.php
>We need to fix our qsort.c before pushing ahead with that idea.
>
> regards, tom lane
>
>
>100 runtimes for glibc qsort, sorted ascending:
>
>Time: 459.860 ms
><snip>
>Time: 488.852 ms
>Time: 514.639 ms
>Time: 529.287 ms
>Time: 612.185 ms
>Time: 660.748 ms
>Time: 742.227 ms
>Time: 866.814 ms
>Time: 1234.848 ms
>Time: 1267.398 ms
>
>
>100 runtimes for port/qsort.c, sorted ascending:
>
>Time: 418.905 ms
><snip>
>Time: 20865.979 ms
>Time: 21000.907 ms
>Time: 21297.585 ms
>Time: 21714.518 ms
>Time: 25423.235 ms
>Time: 27543.052 ms
>Time: 28314.182 ms
>Time: 29400.278 ms
>Time: 34142.534 ms

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2006-02-16 00:59:44 Re: qsort again (was Re: [PERFORM] Strange Create Index behaviour)
Previous Message Tom Lane 2006-02-16 00:17:00 Re: qsort again (was Re: [PERFORM] Strange Create Index behaviour)

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2006-02-16 00:59:44 Re: qsort again (was Re: [PERFORM] Strange Create Index behaviour)
Previous Message Tom Lane 2006-02-16 00:17:00 Re: qsort again (was Re: [PERFORM] Strange Create Index behaviour)