Re: store narrow values in hash indexes?

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: store narrow values in hash indexes?
Date: 2016-09-23 20:16:24
Message-ID: 26868.1474661784@sss.pgh.pa.us
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> Another thought is that hash codes are 32 bits, but a Datum is 64 bits
> wide on most current platforms. So we're wasting 4 bytes per index
> tuple storing nothing.

Datum is not a concept that exists on-disk. What's stored is the 32-bit
hash value. You're right that we waste space if the platform's MAXALIGN
is 8, but that's the fault of the alignment requirement, not the index
definition.
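
To make the padding arithmetic concrete, here is a minimal sketch of how rounding up to MAXALIGN affects a hash index tuple. It assumes an 8-byte index tuple header (a 6-byte heap TID plus a 2-byte info word) followed by the 4-byte hash code; the constant and function names are illustrative, not PostgreSQL identifiers.

```python
# Sketch only: sizes below are assumptions stated in the lead-in,
# and the names are hypothetical, not taken from the PostgreSQL source.

INDEX_TUPLE_HEADER_BYTES = 8  # assumed: 6-byte heap TID + 2-byte info word
HASH_CODE_BYTES = 4           # the 32-bit hash value stored in the index


def maxalign(length: int, alignment: int) -> int:
    """Round length up to the next multiple of alignment (a power of two)."""
    return (length + alignment - 1) & ~(alignment - 1)


def hash_index_tuple_size(alignment: int) -> int:
    """On-disk size of one hash index tuple after alignment padding."""
    return maxalign(INDEX_TUPLE_HEADER_BYTES + HASH_CODE_BYTES, alignment)


def wasted_padding(alignment: int) -> int:
    """Bytes of padding added purely to satisfy the alignment requirement."""
    raw = INDEX_TUPLE_HEADER_BYTES + HASH_CODE_BYTES
    return hash_index_tuple_size(alignment) - raw


if __name__ == "__main__":
    for alignment in (4, 8):
        print(alignment, hash_index_tuple_size(alignment), wasted_padding(alignment))
```

Under these assumptions the stored data is 12 bytes in both cases; the 4 wasted bytes appear only when the alignment is 8, which is why the loss is attributable to MAXALIGN rather than to what the index stores.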

> If we generated 64-bit hash codes we could
> store as many bits of it as a Datum will hold and reduce hash
> collisions.

I think there is considerable merit in trying to move to 64-bit hash
codes (at least for data types that are more than 4 bytes to begin with),
but that's largely in the hope of reducing hash collisions in very large
indexes, not because we'd avoid wasting alignment pad space. If we do
support that, I think we would do it regardless of the platform MAXALIGN.
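
As a rough illustration of why wider hash codes matter for very large indexes, the sketch below estimates the expected number of colliding key pairs using the standard birthday approximation. It assumes an idealized hash function with uniformly distributed output, which real hash functions only approximate; the function name is illustrative.

```python
# Sketch only: assumes hash outputs are uniform and independent,
# which is an idealization of any real hash function.


def expected_colliding_pairs(num_keys: int, hash_bits: int) -> float:
    """Expected number of distinct-key pairs sharing a hash code.

    There are n*(n-1)/2 pairs, and each pair collides with
    probability 1 / 2**hash_bits under the uniformity assumption.
    """
    pairs = num_keys * (num_keys - 1) // 2
    return pairs / 2**hash_bits


if __name__ == "__main__":
    for n in (10**6, 10**8, 10**9):
        print(n,
              expected_colliding_pairs(n, 32),
              expected_colliding_pairs(n, 64))
```

With a billion distinct keys, this estimate gives on the order of 10^8 colliding pairs for 32-bit codes and well under one for 64-bit codes, independent of how much alignment padding the platform adds.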

regards, tom lane
