Re: [HACKERS] GSoC on WAL-logging hash indexes

From: "ktm(at)rice(dot)edu" <ktm(at)rice(dot)edu>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Tan Tran <tankimtran(at)gmail(dot)com>, Greg Stark <stark(at)mit(dot)edu>, pgsql-advocacy <pgsql-advocacy(at)postgresql(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] GSoC on WAL-logging hash indexes
Date: 2014-04-30 12:55:44
Message-ID: 20140430125544.GG5746@aart.rice.edu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-advocacy pgsql-hackers pgsql-students

On Wed, Apr 30, 2014 at 12:26:20AM -0700, Peter Geoghegan wrote:
> On Mon, Mar 3, 2014 at 8:12 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> >> As a GSoC student, I will implement WAL recovery of hash indexes using the
> >> other index types' WAL code as a guide.
>
> Frankly, I'm skeptical of the idea that hash indexes will ever really
> be useful. I realize that that's a counter-intuitive conclusion, but
> there are many things we could do to improve B-Tree CPU costs to make
> them closer to those of hash indexes, without making them any less
> flexible. I myself would much rather work on that, and intend to.
>
> The O(1) cost seems attractive when you consider that that only
> requires that we read one index page from disk to service any given
> index scan, but in fact B-Trees almost always only require the same.
> They are of course also much more flexible. The concurrency
> characteristics B-Trees are a lot better understood. I sincerely
> suggest that we forget about conventional hash table type indexes. I
> fear they're a lost cause.
>
> --
> Peter Geoghegan
>
Hi Peter,

I do not think that CPU costs matter as much as the O(1) probe to
get a result value specifically for very large indexes/tables where
even caching the upper levels of a B-tree index would kill your
working set in memory. I know, I know, everyone has so much memory
and can just buy more... but this does matter. I also think that
development of hash indexes has been stalled waiting for WAL
logging. For example, hash indexes can almost trivially become
more space efficient as they grow in size by utilizing the page
number to represent the prefix bits of the hash value for a bucket.

My 2 cents.
Ken

In response to

Responses

Browse pgsql-advocacy by date

  From Date Subject
Next Message Peter Geoghegan 2014-04-30 16:54:36 Re: [HACKERS] GSoC on WAL-logging hash indexes
Previous Message vincent elschot 2014-04-30 07:29:18 Re: Let's start talking features and "theme" for 9.4

Browse pgsql-hackers by date

  From Date Subject
Next Message Hannu Krosing 2014-04-30 13:05:50 Re: "Considerer Harmful Considered Harmful" categorized as Mostly Harmless
Previous Message Andrew Dunstan 2014-04-30 12:53:19 Re: Considerer Harmful Considered Harmful

Browse pgsql-students by date

  From Date Subject
Next Message Peter Geoghegan 2014-04-30 16:54:36 Re: [HACKERS] GSoC on WAL-logging hash indexes
Previous Message Peter Geoghegan 2014-04-30 07:26:20 Re: GSoC on WAL-logging hash indexes