Re: Multicolumn hash indexes

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Jesper Pedersen <jesper(dot)pedersen(at)redhat(dot)com>, Tomasz Ostrowski <tometzky+pg(at)ato(dot)waw(dot)pl>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Multicolumn hash indexes
Date: 2017-09-27 15:59:13
Message-ID: CA+TgmoYEZAdh8+RDmOWHb_Miq46uMycNrCYdH78its7i6rZr7w@mail.gmail.com
Lists: pgsql-hackers

On Wed, Sep 27, 2017 at 11:57 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> On Wed, Sep 27, 2017 at 9:56 AM, Jesper Pedersen
>> <jesper(dot)pedersen(at)redhat(dot)com> wrote:
>>> Maybe an initial proof-of-concept could store the hash of the first column
>>> (col1) plus the hash of all columns (col1, col2, col3) in the index, and see
>>> what requirements / design decisions would appear from that.
>
>> I thought about that sort of thing yesterday but it's not that simple.
>> The problem is that the hash code isn't just stored; it's used to
>> assign tuples to buckets. If you have two hash codes, you have to
>> pick one or the other to use for assigning the tuple to a bucket. And
>> then if you want to search using the other hash code, you have to
>> search all of the buckets, which will stink.
>
> If we follow GIST's lead that the leading column is "most important",
> the idea could be to require a search constraint on the first column,
> which produces the hash that determines the bucket assignment. Hashes
> for additional columns would just be payload data in the index entry.
> If you have search constraint(s) on low-order column(s), you can check
> for hash matches before visiting the heap, but they don't reduce how
> much of the index you have to search. Even btree works that way for
> many combinations of incomplete index constraints.

I see. That makes sense.
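
A rough sketch of the scheme described above, to make the trade-off concrete. This is illustrative Python, not PostgreSQL code: the names (MultiColumnHashIndex, IndexEntry, col_hash), the fixed bucket count, and the CRC32 stand-in hash are all assumptions invented for the example. The hash of the leading column picks the bucket; hashes of the other columns are carried as payload and only used to skip heap visits.

```python
import zlib
from dataclasses import dataclass

NUM_BUCKETS = 8  # assumption: fixed bucket count; a real hash index grows


def col_hash(value) -> int:
    """Stand-in 32-bit hash (CRC32 of repr); not PostgreSQL's hash function."""
    return zlib.crc32(repr(value).encode())


@dataclass
class IndexEntry:
    hashes: tuple  # one hash per indexed column; hashes[0] chose the bucket
    tid: int       # "heap pointer": here just a position in the heap list


class MultiColumnHashIndex:
    def __init__(self, heap):
        self.heap = heap
        self.buckets = [[] for _ in range(NUM_BUCKETS)]
        for tid, row in enumerate(heap):
            hashes = tuple(col_hash(v) for v in row)
            # Only the leading column's hash decides the bucket.
            bucket = self.buckets[hashes[0] % NUM_BUCKETS]
            bucket.append(IndexEntry(hashes, tid))

    def search(self, first, others=None):
        """Look up rows by the leading column, which is required.

        others maps a column position (1, 2, ...) to a wanted value.
        Returns (matching rows, number of heap rows fetched).
        """
        others = others or {}
        h0 = col_hash(first)
        wanted = {col: col_hash(val) for col, val in others.items()}
        rows, heap_visits = [], 0
        # The scan is always the one bucket chosen by the leading column;
        # constraints on other columns do not shrink it.
        for entry in self.buckets[h0 % NUM_BUCKETS]:
            if entry.hashes[0] != h0:
                continue
            # Payload hashes let us reject an entry before the heap fetch.
            if any(entry.hashes[col] != hv for col, hv in wanted.items()):
                continue
            heap_visits += 1
            row = self.heap[entry.tid]
            # Recheck real values, since different values can share a hash.
            if row[0] == first and all(row[c] == v for c, v in others.items()):
                rows.append(row)
        return rows, heap_visits
```

With a four-row heap in which three rows have the same leading value, a search on the leading column alone fetches all three heap rows, while adding a constraint on the second column walks the same bucket but fetches only the rows whose payload hash matches. A search with no constraint on the leading column has no bucket to start from, which is the "search all of the buckets" case from upthread.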

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
