Re: BUG #3965: UNIQUE constraint fails on long column values

From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: "Michael Fuhr" <mike(at)fuhr(dot)org>
Cc: "Francisco Olarte Sanz" <folarte(at)peoplecall(dot)com>, <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: BUG #3965: UNIQUE constraint fails on long column values
Date: 2008-02-21 11:07:58
Message-ID: 877igysh2p.fsf@oxford.xeocode.com
Lists: pgsql-bugs pgsql-docs

"Michael Fuhr" <mike(at)fuhr(dot)org> writes:

> On Wed, Feb 20, 2008 at 12:21:03PM +0100, Francisco Olarte Sanz wrote:
>> On Wednesday 20 February 2008, Gregory Stark wrote:
>>
>> > Unless you need cryptographic security I would not suggest using MD5. MD5
>> > is intentionally designed to take a substantial amount of CPU resources to
>> > calculate.
>>
>> I thought it was the exact opposite, quoting from RFC1321:

Hm, ok, strike "intentionally". Nonetheless, MD5 is quite computationally
intensive compared to quick hashes like the ones Postgres uses or CRC hashes
(which we ought to have functions for, but we don't seem to). SHA-1 is even
more computationally intensive, and SHA-256 far more so again.
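
A rough way to check the relative cost on a given machine is to time each
function over the same payload. The sketch below is only an illustration using
Python's standard library; the helper name time_hashes and the payload size are
assumptions, and the ordering it reports can differ between CPUs and library
builds.

```python
import hashlib
import timeit
import zlib


def time_hashes(payload: bytes, repeats: int = 200) -> dict:
    """Return the average seconds per call for a few hash functions.

    time_hashes is a hypothetical helper for this illustration only.
    """
    candidates = {
        "crc32": lambda: zlib.crc32(payload),
        "md5": lambda: hashlib.md5(payload).digest(),
        "sha1": lambda: hashlib.sha1(payload).digest(),
        "sha256": lambda: hashlib.sha256(payload).digest(),
    }
    # timeit.timeit returns total seconds for `number` calls; divide to average.
    return {
        name: timeit.timeit(fn, number=repeats) / repeats
        for name, fn in candidates.items()
    }


if __name__ == "__main__":
    # 1 MB of repeated bytes stands in for a long column value.
    for name, seconds in time_hashes(b"x" * 1_000_000).items():
        print(f"{name:>7}: {seconds * 1e6:.1f} microseconds per call")
```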

For purposes of speeding up access, a simple hash with the possibility of a few
collisions is normally fine. You add an additional clause to the query to
recheck the original condition.
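
A minimal sketch of that pattern, assuming an in-memory SQLite database driven
from Python rather than Postgres, and CRC-32 as the quick hash. The table,
column, and function names are made up for the illustration. The index is on
the short hash, and the lookup compares the full value as well so that a hash
collision cannot produce a false match.

```python
import sqlite3
import zlib


def quick_hash(text: str) -> int:
    """Cheap, non-cryptographic 32-bit hash of the text (CRC-32)."""
    return zlib.crc32(text.encode("utf-8"))


# Hypothetical schema: the long value plus a small indexed hash of it.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (body TEXT NOT NULL, body_hash INTEGER NOT NULL)")
conn.execute("CREATE INDEX docs_body_hash_idx ON docs (body_hash)")


def insert_doc(body: str) -> None:
    conn.execute(
        "INSERT INTO docs (body, body_hash) VALUES (?, ?)",
        (body, quick_hash(body)),
    )


def find_doc(body: str) -> bool:
    # The first clause can use the small index; the second clause rechecks
    # the original value, so colliding hashes are filtered out.
    row = conn.execute(
        "SELECT 1 FROM docs WHERE body_hash = ? AND body = ?",
        (quick_hash(body), body),
    ).fetchone()
    return row is not None
```

Here a collision only costs an extra row comparison; it never changes the
answer, which is why a weak but fast hash is acceptable for lookups.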

For purposes of enforcing uniqueness, I would be leery of depending on any
hash. The decision would depend on the application and the consequences of a
spurious error. The chances of a collision are slim, but it is not impossible.
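
To put a rough number on "slim", the usual birthday approximation can be
evaluated for a given row count and hash width. This is a sketch under the
assumption that hash values are uniformly distributed and inputs are not chosen
by an attacker; the function name is made up for the illustration.

```python
import math


def collision_probability(n_values: int, hash_bits: int) -> float:
    """Approximate chance that at least two of n_values distinct inputs
    share a hash, assuming a uniformly distributed hash of hash_bits bits.

    Uses the birthday approximation p ~= 1 - exp(-n(n-1) / (2 * 2**bits)).
    """
    exponent = n_values * (n_values - 1) / (2 * 2**hash_bits)
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(-exponent)


if __name__ == "__main__":
    for bits in (32, 128, 160, 256):
        p = collision_probability(10**9, bits)
        print(f"{bits:>3}-bit hash, 1e9 rows: p ~= {p:.3e}")
```

With a 32-bit hash a collision becomes likely after only some tens of
thousands of rows, while for the 128-bit and wider hashes the accidental risk
is tiny; whether that is acceptable still depends on what a spurious uniqueness
error would cost the application.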

> And if you *do* need cryptographic security then don't use MD5, and
> consider using SHA-256 instead of SHA-1. See RFC 4270 for discussion.
>
> ftp://ftp.rfc-editor.org/in-notes/rfc4270.txt

One of the factors in deciding between cryptographic algorithms is the
longevity required. MD5 has not been cracked, but some suspicious weaknesses
have been discovered which might lead to a crack sometime in the future, where
an attacker might be able to construct new plaintexts with identical hashes.
If you just need something secure for session keys, then that's not going to be
a concern. If you need to distinguish user-provided documents from other
user-provided documents you're keeping for decades, then it is.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's 24x7 Postgres support!
