Re: how to allow integer overflow for calculating hash code of a string?

From: Craig James <cjames(at)emolecules(dot)com>
To: Haifeng Liu <liuhaifeng(at)live(dot)com>
Cc: "pgsql-admin(at)postgresql(dot)org" <pgsql-admin(at)postgresql(dot)org>
Subject: Re: how to allow integer overflow for calculating hash code of a string?
Date: 2012-09-21 15:21:04
Message-ID: CAFwQ8rdvACJpQB=2y7UyUzxTq4XqA7mRQ3Nm4HxtFNFhFkeLrA@mail.gmail.com
Lists: pgsql-admin

On Thu, Sep 20, 2012 at 7:56 PM, Haifeng Liu <liuhaifeng(at)live(dot)com> wrote:

>
> On Sep 20, 2012, at 10:34 PM, Craig James <cjames(at)emolecules(dot)com> wrote:
>
>
>
> On Thu, Sep 20, 2012 at 1:55 AM, Haifeng Liu <liuhaifeng(at)live(dot)com> wrote:
>
>> I want to write a hash function that behaves like String.hashCode() in Java:
>> hash = hash * 31 + s.charAt(i)... but I get an "integer out of range" error.
>> How can I avoid this? I see that Java does not check for int overflow; the
>> result simply wraps around and can become negative.
>>
>>
> Use the bitwise AND operator to mask the hash value with 0x3FFFFFF before
> each iteration:
>
> hash = (hash & 67108863) * 31 + s.charAt(i);
>
> Craig
>
>
> Thank you, I believe your solution is OK for a hash function, but I am
> aiming to create a hash function that is consistent with the one our
> applications use. I know PostgreSQL 9.1 has a hash function called
> hashtext, but I don't know what algorithm it uses, and I have also seen
> that relying on it is not recommended. So I am trying to create a hash
> function which behaves exactly the same as java.lang.String.hashCode().
> The latter may generate negative hash values. My guess is that when the
> number overflows, the bits beyond the 32-bit range are discarded, and if
> the highest remaining bit is 1, the hash value becomes negative.
>
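
The wraparound guessed at in the quoted message can be reproduced without
relying on silent int overflow. The following is only a sketch, not the JDK
source: the class and method names (WrapHashSketch, wrappedHash) are made up
for illustration. It does the arithmetic in a 64-bit long, keeps the low 32
bits after each step, and reinterprets them as a signed value. The same steps
should carry over to any language that has a wider integer type and raises an
error on 32-bit overflow.

```java
class WrapHashSketch {
    // Hypothetical helper: emulates 32-bit two's-complement wraparound
    // using 64-bit arithmetic, so no step depends on int overflow.
    static int wrappedHash(String s) {
        long hash = 0;
        for (int i = 0; i < s.length(); i++) {
            // hash is always within the signed 32-bit range here,
            // so hash * 31 + char fits comfortably in a long.
            hash = hash * 31 + s.charAt(i);
            // Discard everything above the low 32 bits.
            hash &= 0xFFFFFFFFL;
            // If bit 31 is set, reinterpret the value as negative.
            if (hash >= 0x80000000L) {
                hash -= 0x100000000L;
            }
        }
        return (int) hash;
    }

    public static void main(String[] args) {
        String sample = "integer out of range";
        System.out.println(wrappedHash(sample) + " " + sample.hashCode());
    }
}
```

Note that charAt() returns UTF-16 code units, so a second implementation
would have to treat characters outside the Basic Multilingual Plane the same
way to agree on them.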

You are probably doing something where you want the application and the
database to implement the exact same function, but if you stick to the Java
built-in function, you will only have control over one implementation of
that function. What happens if someone working on Java changes how the
Java internals work?

A better solution would be to implement your own hash function in Postgres,
and then once you know exactly how it will work, re-implement it in Java
with your own code. That's the only way you can ensure consistency between
the two.
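
As a rough sketch of what the Java side of such a self-defined function could
look like, here is the masked variant suggested earlier in this thread. The
names MaskedHashSketch and maskedHash are invented for this example. It uses
Math.multiplyExact and Math.addExact, which throw ArithmeticException on
overflow, to show that the masked formula never leaves the 32-bit signed
range: (67108863 * 31) + 65535 is still below 2147483647. That is what lets
the same formula run against a range-checked 32-bit integer type.

```java
class MaskedHashSketch {
    // 0x3FFFFFF == 67108863, the mask from the earlier message.
    static final int MASK = 0x3FFFFFF;

    // Hypothetical helper: same shape as hash * 31 + char, but the running
    // value is masked to 26 bits first so the result always fits in an int.
    static int maskedHash(String s) {
        int hash = 0;
        for (int i = 0; i < s.length(); i++) {
            // The *Exact methods would throw if this ever overflowed.
            hash = Math.addExact(Math.multiplyExact(hash & MASK, 31), s.charAt(i));
        }
        return hash;
    }

    public static void main(String[] args) {
        System.out.println(maskedHash("pgsql-admin"));
    }
}
```

The result is never negative and, for longer strings, will generally differ
from String.hashCode(), so this only helps when both sides use this same
definition.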

Craig
