Re: Which qsort is used

From: "Dann Corbit" <DCorbit(at)connx(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Qingqing Zhou" <zhouqq(at)cs(dot)toronto(dot)edu>, "Greg Stark" <gsstark(at)mit(dot)edu>, "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>, "Luke Lonergan" <llonergan(at)greenplum(dot)com>, "Neil Conway" <neilc(at)samurai(dot)com>, "Bruce Momjian" <pgman(at)candle(dot)pha(dot)pa(dot)us>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Which qsort is used
Date: 2005-12-16 08:05:41
Message-ID: D425483C2C5C9F49B5B7A41F8944154757D37F@postal.corporate.connx.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> -----Original Message-----
> From: Tom Lane [mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us]
> Sent: Thursday, December 15, 2005 6:24 PM
> To: Dann Corbit
> Cc: Qingqing Zhou; Greg Stark; Jim C. Nasby; Luke Lonergan; Neil
Conway;
> Bruce Momjian; pgsql-hackers(at)postgresql(dot)org
> Subject: Re: [HACKERS] Which qsort is used
>
> "Dann Corbit" <DCorbit(at)connx(dot)com> writes:
> > Radix sort can also be made completely generic by using a callback
> > function.
> > The function gives back n-bits at a time, from the most significant
bits
> > to the least significant.
>
> That's mighty handwavy --- it assumes that the datatype permits a
simple
> breakdown into small pieces that can be sorted lexicographically.

It's not so hard. For fixed length character strings, the mapping is
just the character. For integers the mapping is obvious [msb to lsb or
lsb to msb of the integer, depending on whether you are doing msd or lsd
radix sort]. For intel floating point, the transformation is:

#include <assert.h>
#include "inteltyp.h"

uint32
float2key(float f)
{
uint32 sign,
mant,
mask;

assert(sizeof(float) == sizeof(uint32));
mant = *(uint32 *) & f; /* Load float as array of bits */
sign = mant & SB_MASK32; /* Isolate the leading sign bit */
mant ^= SB_MASK32; /* Invert the sign bit, making + > - */
mask = sign - (sign >> 31); /* Either 0 or 0x7fffffff */
mant ^= mask; /* Invert exp and mant if negative */
return mant;
}

uint64
double2key(double d)
{
uint64 sign,
mant,
mask;

assert(sizeof(double) == sizeof(uint64));
mant = *(uint64 *) & d; /* Load float as array of bits */
sign = mant & SB_MASK64; /* Isolate the leading sign bit */
mant ^= SB_MASK64; /* Invert the sign bit, making + > - */
mask = sign - (sign >> 63); /* Either 0 or 0x7fffffffffffffff */
mant ^= mask; /* Invert exp and mant if negative */
return mant;
}

Where the contents of inteltyp.h are as follows:
/*
** Typdefs for Intel formats.
** See keyxfrm.c for usage.
*/

typedef unsigned long uint32;
#define SB_MASK32 0x80000000UL

#ifdef _MSC_VER
typedef unsigned __int64 uint64;
typedef __int64 sint64;
#define SB_MASK64 0x8000000000000000ui64
#else
typedef unsigned long long uint64;
typedef long long sint64;
#define SB_MASK64 0x8000000000000000ULL
#endif
extern uint32 float2key(float f);
uint64 double2key(double d);

=======================================================
After the above transformation, you just use the same buckets as for
integers.

In general, the creation of the mapping function is no more difficult
than the creation of a comparison function.

> Seems
> to me that's a much stronger requirement than assuming that you can
tell
> which of two whole values is smaller. What's worse, it needs to be
the
> same number of pieces for every value, which makes it hard to deal
with
> variable-length data.

No. The number of pieces is irrelevant. And you can use MSD radix sort
for variable length data.

> > So, for char string, and a radix of 256, you just return the first
char,
> > then the second char, etc. to get the classical radix sort.
>
> Uh, no, you'd need to work right-to-left, after having padded all the
> strings to the same length somehow.

Unless you use MSD radix sort (which is usually better anyway).

> regards, tom lane

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Simon Riggs 2005-12-16 13:17:25 Re: How much expensive are row level statistics?
Previous Message Qingqing Zhou 2005-12-16 05:53:13 Re: Which qsort is used