Re: Hash Indexes

From: Kenneth Marshall <ktm(at)rice(dot)edu>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Mark Kirkwood <mark(dot)kirkwood(at)catalyst(dot)net(dot)nz>, Andres Freund <andres(at)anarazel(dot)de>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Hash Indexes
Date: 2016-09-19 15:16:21
Message-ID: 20160919151621.GB12277@aart.rice.edu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Sep 19, 2016 at 12:14:26PM +0530, Amit Kapila wrote:
> On Mon, Sep 19, 2016 at 11:20 AM, Mark Kirkwood
> <mark(dot)kirkwood(at)catalyst(dot)net(dot)nz> wrote:
> >
> >
> > On 17/09/16 06:38, Andres Freund wrote:
> >>
> >> On 2016-09-16 09:12:22 -0700, Jeff Janes wrote:
> >>>
> >>> On Thu, Sep 15, 2016 at 7:23 AM, Andres Freund <andres(at)anarazel(dot)de>
> >>> wrote:
> >>>>
> >>>> One earlier question about this is whether that is actually a worthwhile
> >>>> goal. Are the speed and space benefits big enough in the general case?
> >>>> Could those benefits not be achieved in a more maintainable manner by
> >>>> adding a layer that uses a btree over hash(columns), and adds
> >>>> appropriate rechecks after heap scans?
> >>>>
> >>>> Note that I'm not saying that hash indexes are not worthwhile, I'm just
> >>>> doubtful that question has been explored sufficiently.
> >>>
> >>> I think that exploring it well requires good code. If the code is good,
> >>> why not commit it?
> >>
> >> Because getting there requires a lot of effort, debugging it afterwards
> >> would take effort, and maintaining it would also takes a fair amount?
> >> Adding code isn't free.
> >>
> >> I'm rather unenthused about having a hash index implementation that's
> >> mildly better in some corner cases, but otherwise doesn't have much
> >> benefit. That'll mean we'll have to step up our user education a lot,
> >> and we'll have to maintain something for little benefit.
> >>
> >
> > While I see the point of what you are saying here, I recall all previous
> > discussions about has indexes tended to go a bit like this:
> >
> > - until WAL logging of hash indexes is written it is not worthwhile trying
> > to make improvements to them
> > - WAL logging will be a lot of work, patches 1st please
> >
> > Now someone has done that work, and we seem to be objecting that because
> > they are not improved then the patches are (maybe) not worthwhile.
> >
>
> I think saying hash indexes are not improved after proposed set of
> patches is an understatement. The read performance has improved by
> more than 80% as compare to HEAD [1] (refer data in Mithun's mail).
> Also, tests by Mithun and Jesper has indicated that in multiple
> workloads, they are better than BTREE by 30~60% (in fact Jesper
> mentioned that he is seeing 40~60% benefit on production database,
> Jesper correct me if I am wrong.). I agree that when index column is
> updated they are much worse than btree as of now, but no work has been
> done improve it and I am sure that it can be improved for those cases
> as well.
>
> In general, I thought the tests done till now are sufficient to prove
> the importance of work, but if still Andres and others have doubt and
> they want to test some specific cases, then sure we can do more
> performance benchmarking.
>
> Mark, thanks for supporting the case for improving Hash Indexes.
>
>
> [1] - https://www.postgresql.org/message-id/CAD__OugX0aOa7qopz3d-nbBAoVmvSmdFJOX4mv5tFRpijqH47A%40mail.gmail.com
> --
> With Regards,
> Amit Kapila.
> EnterpriseDB: http://www.enterprisedb.com
>

+1 Throughout the years, I have seen benchmarks that demonstrated the
performance advantages of even the initial hash index (without WAL)
over the btree of a hash variant. It is pretty hard to dismiss the
O(1) versus O(log(n)) difference. There are classes of problems for
which a hash index is the best solution. Lack of WAL has hamstrung
development in those areas for years.

Regards,
Ken

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Jesper Pedersen 2016-09-19 16:11:07 Re: pageinspect: Hash index support
Previous Message Mithun Cy 2016-09-19 15:14:37 Re: "Some tests to cover hash_index"