Re: Next Steps with Hash Indexes

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Simon Riggs <simon(dot)riggs(at)enterprisedb(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Next Steps with Hash Indexes
Date: 2021-08-10 12:44:21
Message-ID: CAFiTN-tJ6vbN3oxrtv4Ak9mo8yvQjHMt4cwXFNCjwHQqOyN9wA@mail.gmail.com
Lists: pgsql-hackers

On Fri, Jul 23, 2021 at 6:16 PM Simon Riggs
<simon(dot)riggs(at)enterprisedb(dot)com> wrote:
>
> On Thu, 22 Jul 2021 at 06:10, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:

> Complete patch for hash_multicol.v3.patch attached, slightly updated
> from earlier patch.
> Docs, tests, passes make check.

I was looking into hash_multicol.v3.patch, and I have a question:

<para>
- Hash indexes support only single-column indexes and do not allow
- uniqueness checking.
+ Hash indexes support uniqueness checking.
+ Hash indexes support multi-column indexes, but only store the hash value
+ for the first column, so multiple columns are useful only for uniquness
+ checking.
</para>

The above says that we store the hash value only for the first
column. My question is: why don't we store it for the other columns as
well? We can still locate the bucket using the first column's hash, but
the hashes of the other columns could be stored as payload data and
used to match the remaining key columns before accessing the heap, as
discussed here[1]. IMHO, this would further reduce heap accesses.

[1] https://www.postgresql.org/message-id/7192.1506527843%40sss.pgh.pa.us
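
To make the idea concrete, here is a toy Python sketch of what I mean.
It is only an illustration, not how the patch or the hash AM works:
the class, the fixed bucket count, and the h() helper are all made up
for this example. The bucket is chosen from the first column's hash
alone, while the hashes of the remaining columns are kept as payload
and compared before the (simulated) heap is touched.

```python
# Toy model only: none of these names exist in PostgreSQL.
from collections import defaultdict

NUM_BUCKETS = 8  # assumed fixed bucket count, for illustration


def h(value):
    # Stand-in for a per-datatype hash function, truncated to 32 bits.
    return hash(value) & 0xFFFFFFFF


class ToyMultiColHashIndex:
    def __init__(self, heap):
        self.heap = heap                  # list of row tuples; position acts as the TID
        self.buckets = defaultdict(list)  # bucket number -> index entries
        self.heap_fetches = 0             # counts simulated heap accesses

    def insert(self, tid):
        row = self.heap[tid]
        first_hash = h(row[0])
        other_hashes = tuple(h(v) for v in row[1:])  # payload, not used for bucketing
        bucket = first_hash % NUM_BUCKETS
        self.buckets[bucket].append((first_hash, other_hashes, tid))

    def lookup(self, key, use_payload=True):
        first_hash = h(key[0])
        other_hashes = tuple(h(v) for v in key[1:])
        matches = []
        for entry_first, entry_others, tid in self.buckets[first_hash % NUM_BUCKETS]:
            if entry_first != first_hash:
                continue
            if use_payload and entry_others != other_hashes:
                continue  # rejected using the index entry alone, no heap access
            self.heap_fetches += 1
            if self.heap[tid] == key:  # hashes can collide, so recheck the real row
                matches.append(tid)
        return matches


# 100 rows that all share the same first-column value.
heap = [(1, i) for i in range(100)]
index = ToyMultiColHashIndex(heap)
for tid in range(len(heap)):
    index.insert(tid)
```

In this toy setup every row lands in the same bucket, so
lookup((1, 42), use_payload=False) has to visit every heap row with a
matching first-column hash, whereas with use_payload=True the payload
hashes filter out the non-matching entries first.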

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
