Re: store narrow values in hash indexes?

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: store narrow values in hash indexes?
Date: 2016-09-26 01:51:51
Message-ID: CA+TgmoaZ4cWuZX=GWiWPfGmpLSX6GCay6e2G4dmBp656jqu4rw@mail.gmail.com
Lists: pgsql-hackers

On Sat, Sep 24, 2016 at 1:03 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> On Sat, Sep 24, 2016 at 1:02 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> Currently, hash indexes always store the hash code in the index, but
>> not the actual Datum. It's recently been noted that this can make a
>> hash index smaller than the corresponding btree index would be if the
>> column is wide. However, if the index is being built on a fixed-width
>> column with a typlen <= sizeof(Datum), we could store the original
>> value in the hash index rather than the hash code without using any
>> more space. That would complicate the code, but I bet it would be
>> faster: we wouldn't need to set xs_recheck, we could rule out hash
>> collisions without visiting the heap, and we could support index-only
>> scans in such cases.
>
> What exactly do you mean by Datum? Is it for data types that fit into
> 64 bits, like integer?

Yeah, I mean whatever is small enough to fit into the space currently
being used to store the hashcode, along with any accompanying padding
bytes that we can also use.
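
To make the idea concrete, here is a rough, self-contained C sketch. It
is not PostgreSQL code: SketchDatum, sketch_can_store_value, sketch_hash
and the two probe functions are hypothetical names, the hash function is
a toy that collides on purpose, and the real number of usable bytes
(hash code plus padding) depends on the index tuple layout, so it is
passed in as an assumed parameter.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for Datum: a pointer-sized integer. */
typedef uintptr_t SketchDatum;

/*
 * Could the index keep the original value instead of the hash code?
 * typlen follows the pg_type convention: positive for fixed-width types,
 * negative for variable-length ones. slot_bytes is an assumption: the
 * space available for the hash code plus any usable padding.
 */
bool
sketch_can_store_value(int16_t typlen, size_t slot_bytes)
{
    return typlen > 0 && (size_t) typlen <= slot_bytes;
}

/* Toy hash that collides deliberately (1 and 17 both map to 1). */
uint32_t
sketch_hash(int32_t key)
{
    return (uint32_t) key % 16u;
}

typedef struct SketchMatch
{
    bool        matches;        /* does the index entry look like a hit? */
    bool        needs_recheck;  /* must the heap tuple be compared too? */
} SketchMatch;

/* Entry stores only the hash code: equal hashes do not prove equal keys. */
SketchMatch
sketch_probe_hash(uint32_t stored_hash, int32_t probe)
{
    SketchMatch m;

    m.matches = (stored_hash == sketch_hash(probe));
    m.needs_recheck = m.matches;
    return m;
}

/* Entry stores the original value: the comparison is exact. */
SketchMatch
sketch_probe_value(int32_t stored_value, int32_t probe)
{
    SketchMatch m;

    m.matches = (stored_value == probe);
    m.needs_recheck = false;
    return m;
}
```

With these definitions, sketch_probe_hash(sketch_hash(1), 17) reports a
match that still needs a recheck against the heap, while
sketch_probe_value(1, 17) rules the entry out from the index alone.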

> I think if we are able to support index-only scans
> for hash indexes for some data types, that will be a huge plus.
> Surely, there is some benefit without index-only scans as well, namely
> that we can avoid the recheck, but I'm not sure that alone can give us
> any big performance boost. As you say, it might lead to some
> complication in the code, but I think it is worth trying.

Yeah, the recheck is probably not that expensive if we have to
retrieve the heap page anyway.

> Won't it add some requirements for pg_upgrade as well?

I have nothing to add to what Bruce already said.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
