Re: Next Steps with Hash Indexes

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: Simon Riggs <simon(dot)riggs(at)enterprisedb(dot)com>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Next Steps with Hash Indexes
Date: 2021-08-11 11:04:42
Message-ID: CAA4eK1JD1=nPDi0kDPGLC+JDGEYP8DgTanobvgve++KniQ68TA@mail.gmail.com
Lists: pgsql-hackers

On Tue, Aug 10, 2021 at 6:14 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
>
> On Fri, Jul 23, 2021 at 6:16 PM Simon Riggs
> <simon(dot)riggs(at)enterprisedb(dot)com> wrote:
> >
> > On Thu, 22 Jul 2021 at 06:10, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> > Complete patch for hash_multicol.v3.patch attached, slightly updated
> > from earlier patch.
> > Docs, tests, passes make check.
>
> I was looking into hash_multicol.v3.patch, and I have a question:
>
> <para>
> - Hash indexes support only single-column indexes and do not allow
> - uniqueness checking.
> + Hash indexes support uniqueness checking.
> + Hash indexes support multi-column indexes, but only store the hash value
> + for the first column, so multiple columns are useful only for uniqueness
> + checking.
> </para>
>
> The above comments say that we store hash value only for the first
> column, my question is why don't we store for other columns as well?
> I mean we can search the bucket based on the first column hash but the
> hashes for the other column could be payload data and we can use that
> to match the hash value for other key columns before accessing the
> heap, as discussed here[1]. IMHO, this will further reduce the heap
> access.
>

True. Another idea is to store in the payload the value obtained by
'combining multi-column hashes into one hash value'. That would let us
efficiently satisfy queries that search on all columns of the index,
provided the planner doesn't remove some of the clauses; if it does,
we need to do more work.
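
To make the combined-hash idea concrete, here is a minimal sketch, not
code from the patch. The entry layout, every sketch_* name, and the
Boost-style mixing step are assumptions for illustration only. The
bucket is still chosen from the first column's hash, and the combined
value is kept as payload to filter candidates before visiting the heap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical index entry; NOT PostgreSQL's actual hash index tuple layout. */
typedef struct SketchHashEntry
{
    uint32_t first_col_hash;  /* selects the bucket, as with a single column */
    uint32_t combined_hash;   /* payload: all column hashes folded together */
} SketchHashEntry;

/* Boost-style mixing step; an assumed choice of combiner for this sketch. */
static uint32_t
sketch_hash_combine(uint32_t a, uint32_t b)
{
    a ^= b + 0x9e3779b9u + (a << 6) + (a >> 2);
    return a;
}

/* Build an entry from the per-column hash values of one row. */
static SketchHashEntry
sketch_make_entry(const uint32_t *col_hashes, size_t ncols)
{
    SketchHashEntry entry;
    size_t i;

    assert(ncols > 0);
    entry.first_col_hash = col_hashes[0];
    entry.combined_hash = col_hashes[0];
    for (i = 1; i < ncols; i++)
        entry.combined_hash = sketch_hash_combine(entry.combined_hash,
                                                  col_hashes[i]);
    return entry;
}

/*
 * Returns false when the stored entry cannot match the probe, so the heap
 * visit can be skipped. A true result still needs a recheck against the heap
 * tuple, because different keys can share the same hash values.
 */
static bool
sketch_entry_may_match(const SketchHashEntry *entry,
                       const SketchHashEntry *probe)
{
    return entry->first_col_hash == probe->first_col_hash &&
           entry->combined_hash == probe->combined_hash;
}
```

Note that the probe's combined value can only be computed when the scan
supplies all key columns. If some clauses are missing, only
first_col_hash is usable, which is the "more work" case mentioned above.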

One more thing we need to consider is 'hashm_procid', which is stored
in the meta page. Currently it works for a single-column index, but
for a multi-column index we might want to set it to InvalidOid.
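
As a rough sketch of that suggestion (the types and the sketch_*
function below are hypothetical stand-ins, not the real meta page code;
the only fact borrowed from PostgreSQL is that InvalidOid is zero):

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for PostgreSQL's Oid and InvalidOid, redefined so the sketch
 * compiles on its own. */
typedef uint32_t SketchOid;
#define SKETCH_INVALID_OID ((SketchOid) 0)

/*
 * Pick the value to record as the meta page's hash procedure id.
 * Assumption of this sketch: with more than one key column no single
 * procedure describes the index, so nothing meaningful can be recorded.
 */
static SketchOid
sketch_choose_procid(int nkeycols, SketchOid first_col_hash_proc)
{
    assert(nkeycols >= 1);
    return (nkeycols == 1) ? first_col_hash_proc : SKETCH_INVALID_OID;
}
```

Any code that reads the stored value would then have to treat the
invalid value as "not applicable" rather than as a mismatch.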

--
With Regards,
Amit Kapila.
