From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Neil Conway <neilc(at)samurai(dot)com> |
Cc: | pgsql-patches <pgsql-patches(at)postgresql(dot)org> |
Subject: | Re: Hash function for numeric (WIP) |
Date: | 2007-04-27 14:02:25 |
Message-ID: | 14201.1177682545@sss.pgh.pa.us |
Lists: | pgsql-patches |

I wrote:
> I feel uncomfortable about this proposal because it will compute
> different hashes for values that differ only in having different
> numbers of trailing zeroes. Now the numeric.c code is supposed to
> suppress extra trailing zeroes on output, but that's never been a
> correctness property ... are we willing to make it one?

> There are various related cases involving unstripped leading zeroes.

> Another point is that sign = NUMERIC_NAN makes it a NAN regardless
> of any other fields; ignoring the sign does not get the right result
> here.

Something else I just remembered is that ndigits = 0 makes it a zero
regardless of the weight.

Perhaps a sufficiently robust way would be to form the hash as the
XOR of each supplied digit, circular-shifted by say 3 times the
digit's weight. This is insensitive to leading/trailing zeroes:

    if (is NAN)
        return -1;          // or any other fixed value
    hash = 0;
    shift = 3 * weight;
    for (i = 0; i < ndigits; i++)
    {
        // rotate digit[i] left by (shift mod 32) and XOR it into the hash
        thisshift = (shift & 31);
        hash ^= ((uint32) digit[i]) << thisshift;
        if (thisshift > 0)
            hash ^= ((uint32) digit[i]) >> (32 - thisshift);
        shift -= 3;         // the next digit has weight one lower
    }
    return hash;

That might look pretty ugly, but then again hash_any isn't especially
cheap.
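
For illustration, here is a self-contained C sketch of the rotate-and-XOR
idea. It is only a sketch under stated assumptions: SketchNumeric,
rotl32 and sketch_hash_numeric are hypothetical names made up for this
example, not anything in numeric.c, and the struct is a simplified stand-in
for the unpacked numeric fields (NaN flag, weight of the first digit, digit
count, digit array).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified view of a numeric value (not PostgreSQL's struct). */
typedef struct
{
    int            is_nan;   /* nonzero means NaN; all other fields ignored */
    int            weight;   /* weight of the first digit */
    int            ndigits;  /* number of digits; 0 means the value is zero */
    const int16_t *digits;   /* non-negative digits, most significant first */
} SketchNumeric;

/* Rotate a 32-bit word left by s bits (s is reduced modulo 32). */
static uint32_t
rotl32(uint32_t x, uint32_t s)
{
    s &= 31;
    return s ? (x << s) | (x >> (32 - s)) : x;
}

/*
 * XOR together every digit, each rotated by 3 times its own weight.
 * A zero digit contributes nothing, so extra leading or trailing zero
 * digits cannot change the result, and ndigits = 0 hashes to 0 for
 * any weight.
 */
uint32_t
sketch_hash_numeric(const SketchNumeric *n)
{
    uint32_t hash = 0;
    uint32_t shift;
    int      i;

    if (n->is_nan)
        return UINT32_MAX;   /* any fixed value would do */

    assert(n->ndigits >= 0);

    /* Unsigned arithmetic wraps modulo 2^32, which is harmless here
     * because only the low 5 bits of the shift are ever used. */
    shift = 3u * (uint32_t) n->weight;
    for (i = 0; i < n->ndigits; i++)
    {
        hash ^= rotl32((uint32_t) n->digits[i], shift);
        shift -= 3;          /* next digit has weight one lower */
    }
    return hash;
}
```

With this sketch, the digit arrays {1, 5000} at weight 0, {1, 5000, 0} at
weight 0, and {0, 1, 5000} at weight 1 all produce the same hash, since the
extra zero digits XOR in nothing and every nonzero digit keeps its weight.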
regards, tom lane