
Re: Hash Function: MD5 or other?

From:	Peter Fein <pfein(at)pobox(dot)com>
To:	Bruno Wolff III <bruno(at)wolff(dot)to>
Cc:	pgsql-general(at)postgresql(dot)org
Subject:	Re: Hash Function: MD5 or other?
Date:	2005-06-14 20:54:50
Message-ID:	42AF441A.7080804@pobox.com
Lists:	pgsql-general

Bruno Wolff III wrote:
> On Tue, Jun 14, 2005 at 08:33:34 -0500,
> Peter Fein <pfein(at)pobox(dot)com> wrote:
>
>>Knowing the specifics of the data I'm putting in sometext, a halfway
>>decent hash function would make collisions so rare as to make the chance
>>insignificant (and collisions wouldn't break anything anyway). Is this
>>approach reasonable, or should I use a hash index on (group_id,
>>sometext) - does this suffer from the same size limitation as btrees? I
>>thought hash indexes were slow...
>
>
> The hash value should be saved as a separate column. Then it sounds
> like you want a partial btree index of (group_id, hash) where the
> flag is set.

I'm unclear why I'd need to store the hash in a column. I suppose I
could have a hash column populated by a trigger on inserts, but that
seems to give me the same functionality as an index on the expression
while being less obvious.

For the archives, I did:

CREATE UNIQUE INDEX idx_md5_sometext ON mytable USING btree
(group_id, md5(sometext))
WHERE group_representative = true;

Before inserting a new row, I run an equivalent SELECT from the client
(with the client calculating the MD5 itself) to figure out the correct
value for group_representative. This is the only way I use the MD5, so
I don't much care about retrieving it in other contexts.
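
A minimal sketch of what that client-side check might look like, for
illustration only. The client language (Python), the helper names
md5_hex and representative_query, and the %s placeholder style are all
assumptions and not part of the original setup. The digest only matches
the server's md5(sometext) if the client encodes the text the same way
the database does (UTF-8 is assumed here).

```python
import hashlib

# Hypothetical query that mirrors the partial unique index: it uses the same
# expression, md5(sometext), and the same predicate,
# group_representative = true, so the planner is able to use idx_md5_sometext.
REPRESENTATIVE_EXISTS_SQL = """
SELECT EXISTS (
    SELECT 1
      FROM mytable
     WHERE group_id = %s
       AND md5(sometext) = %s
       AND group_representative = true
)
"""


def md5_hex(sometext, encoding="utf-8"):
    """Return the MD5 digest as 32 lowercase hex characters, the same
    format PostgreSQL's md5(text) returns.

    Assumption: `encoding` matches the database encoding.
    """
    return hashlib.md5(sometext.encode(encoding)).hexdigest()


def representative_query(group_id, sometext):
    """Build the (sql, params) pair for the existence check.

    If the query returns false, no representative exists yet for this
    (group_id, md5) pair, so the new row can be inserted with
    group_representative = true; otherwise it should be false.
    """
    return REPRESENTATIVE_EXISTS_SQL, (group_id, md5_hex(sometext))
```

The returned pair would be handed to whatever database driver the client
uses. The check and the insert are separate statements, so two
concurrent clients could both see "no representative yet"; in that case
the unique index still rejects the second insert with a unique
violation, which the client would need to handle.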

--
Peter Fein pfein(at)pobox(dot)com 773-575-0694

Basically, if you're not a utopianist, you're a schmuck. -J. Feldman

In response to

Re: Hash Function: MD5 or other? at 2005-06-14 20:18:15 from Bruno Wolff III

Responses

Re: Hash Function: MD5 or other? at 2005-06-14 21:27:30 from Bruno Wolff III
