Re: [PERFORM] "Hash index" vs. "b-tree index" (PostgreSQL

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Greg Stark <gsstark(at)mit(dot)edu>
Cc: Neil Conway <neilc(at)samurai(dot)com>, "Jim C(dot) Nasby" <decibel(at)decibel(dot)org>, Christopher Petrilli <petrilli(at)gmail(dot)com>, Ying Lu <ying_lu(at)cs(dot)concordia(dot)ca>, pgsql-general(at)postgresql(dot)org, pgsql-performance(at)postgresql(dot)org
Subject: Re: [PERFORM] "Hash index" vs. "b-tree index" (PostgreSQL
Date: 2005-05-10 13:53:18
Message-ID: 9871.1115733198@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-performance

Greg Stark <gsstark(at)mit(dot)edu> writes:
> Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
>> I think that efficient implementation of this would require explicitly
>> storing the hash code for each index entry,

> It seems that means doubling the size of the hash index. That's a pretty big
> i/o to cpu tradeoff.

Hardly. The minimum possible size of a hash entry today is 8 bytes
header plus 4 bytes datum, plus there's a 4-byte line pointer to factor
in. So under the most pessimistic assumptions, storing the hash code
would add 25% to the size. (On MAXALIGN=8 hardware, it might cost you
nothing at all.)

> What if the hash index stored *only* the hash code? That could be useful for
> indexing large datatypes that would otherwise create large indexes.

Hmm, that could be a thought.

regards, tom lane

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Vlad 2005-05-10 13:58:38 Re: Data Modelling Tools
Previous Message Ian Harding 2005-05-10 13:52:25 Re: PostGreSQL -> SQL server

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2005-05-10 14:16:20 Re: Prefetch
Previous Message Matt Olson 2005-05-10 13:52:51 Prefetch