Re: Which qsort is used

From: "Dann Corbit" <DCorbit(at)connx(dot)com>
To: "Qingqing Zhou" <zhouqq(at)cs(dot)toronto(dot)edu>, "Greg Stark" <gsstark(at)mit(dot)edu>
Cc: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>, "Luke Lonergan" <llonergan(at)greenplum(dot)com>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Neil Conway" <neilc(at)samurai(dot)com>, "Bruce Momjian" <pgman(at)candle(dot)pha(dot)pa(dot)us>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Which qsort is used
Date: 2005-12-16 02:02:55
Message-ID: D425483C2C5C9F49B5B7A41F8944154757D37C@postal.corporate.connx.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> -----Original Message-----
> From: pgsql-hackers-owner(at)postgresql(dot)org [mailto:pgsql-hackers-
> owner(at)postgresql(dot)org] On Behalf Of Qingqing Zhou
> Sent: Thursday, December 15, 2005 3:16 PM
> To: Greg Stark
> Cc: Jim C. Nasby; Luke Lonergan; Tom Lane; Neil Conway; Bruce Momjian;
> pgsql-hackers(at)postgresql(dot)org
> Subject: Re: [HACKERS] Which qsort is used
>
>
>
> On Thu, 15 Dec 2005, Greg Stark wrote:
>
> >
> > I have a related question. qsort is only used in the postgres source
in
> a few
> > places. If postgres used an internal implementation instead of the
> library
> > source it could have implementations that don't use function
pointers.
> This
> > might perform faster for a few reasons. The comparator is much more
> likely to
> > be eligible for inlining for one.
> >
> > It also opens the door to using different sort algorithms for
different
> > applications. There may be some cases where the input is never
sorted
> and the
> > sample size is small so qsort is a good choice, and others where the
> inputs
> > can be large and using a better algorithm with worse overhead like
> mergesort
> > might make more sense.
> >
> > Unfortunately there isn't just a single qsort call though. I count 6
> > comparators in the source tree I have. So perhaps this is a
non-starter.
> > Having 6 qsort implementations around sounds pretty sketchy.
> >
>
> [also with reply to Tom] Both ideas look like doable (or at least
> testable) for me. I agree that the only interesting pot is in
tuplesort.c.
> For the 2nd idea, for smaller range, we may consider radix sort, which
is
> definitely faster - but this may need some work that enable query
> optimizer know the *exact* data range.

Radix sort can also be made completely generic by using a callback
function.

The function gives back n-bits at a time, from the most significant bits
to the least significant.

So, for char string, and a radix of 256, you just return the first char,
then the second char, etc. to get the classical radix sort.

Radix sort is no advantage for long, similar strings. Suppose that you
have a comparison sort that is O(n*log(n)). If n is one billion items,
then log2(n) is 32 and therefore LSD radix 256 sorts of 33 character
fields will be slower than a comparison sort, even for one billion
items.

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Christopher Kings-Lynne 2005-12-16 02:19:10 Re: Improving planning of outer joins
Previous Message Tom Lane 2005-12-16 00:06:39 Re: How much expensive are row level statistics?