| From: | Kevin Brown <kevin(at)sysexperts(dot)com> |
|---|---|
| To: | pgsql-hackers(at)postgresql(dot)org |
| Subject: | Re: Questions about indexes? |
| Date: | 2003-02-17 19:27:35 |
| Message-ID: | 20030217192735.GO1833@filer |
| Lists: | pgsql-hackers |
Curt Sampson wrote:
> On Mon, 16 Feb 2003, Ryan Bradetich wrote:
> > Since my only requirement is that the rows be unique, I have developed a
> > custom MD5 function in C, and created an index on the MD5 hash of the
> > concatenation of all the fields.
>
> Well, that won't guarantee uniqueness, since it's perfectly possible
> to have two different rows hash to the same value. (If that weren't
> possible, your hash would have to contain as much information as the row
> itself, and your space savings wouldn't be nearly so dramatic.)
That's true, but MD5 is a 128-bit hash, so even if he has 4 billion rows
the probability of a duplicate is far smaller than one in 4 billion, and
it's probably a safe enough bet. His application doesn't require
absolute uniqueness, fortunately, so md5 works well enough in this
case.
Otherwise md5 wouldn't be a terribly good hash...
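
As a rough check on the odds discussed above, here is a minimal sketch of the standard birthday approximation, assuming MD5 output behaves like a uniformly random 128-bit value. The function name `collision_probability` and its parameters are made up for illustration; nothing here comes from the original post or from PostgreSQL.

```python
import math


def collision_probability(n_rows: int, hash_bits: int = 128) -> float:
    """Approximate chance that at least two of n_rows hash to the same value.

    Uses the birthday approximation P ~= 1 - exp(-n(n-1) / 2^(bits+1)),
    which assumes the hash output is uniformly distributed (an assumption,
    not a measured property of any particular data set).
    """
    # Number of distinct row pairs divided by the size of the hash space.
    pairs_over_space = n_rows * (n_rows - 1) / 2 / 2**hash_bits
    # -expm1(-x) equals 1 - exp(-x) but stays accurate when x is tiny.
    return -math.expm1(-pairs_over_space)


if __name__ == "__main__":
    rows = 4_000_000_000
    print(f"128-bit hash, {rows} rows: {collision_probability(rows):.3e}")
    print(f" 32-bit hash, {rows} rows: {collision_probability(rows, 32):.3e}")
```

Under that assumption the 128-bit case comes out many orders of magnitude below one in 4 billion, while a 32-bit hash over the same number of rows would collide almost certainly, which is why the width of the hash matters more than the row count here.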
--
Kevin Brown kevin(at)sysexperts(dot)com