Re: Deprecating Hash Indexes

From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Deprecating Hash Indexes
Date: 2012-10-14 14:37:01
Message-ID: 507ACE0D.1060805@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


On 10/14/2012 09:45 AM, Simon Riggs wrote:
> As discussed on other threads, Hash Indexes are currently a broken
> feature and bring us into disrepute.
>
> I describe them as broken because they are neither crash safe, nor
> could they properly be called unlogged indexes (or if so, why just
> hash?). And also because if somebody suggested implementing them the
> way they are now, they would be told they were insane because silent
> data corruption is not something we tolerate anymore. We know why they
> are like that, but its time to remove the rest of the historical
> research legacy. It's hard to say "We respect your data [except if you
> press here]" and be taken seriously.
>
> Suggested actions are
>
> * Put WARNINGs in the docs against the use of hash indexes, backpatch
> to 8.3. CREATE INDEX gives no warning currently, though Index Types
> does mention a caution.
>
> * Mention in the current docs that hash indexes are likely to be
> deprecated completely in future releases. Should anybody ever make
> them work, we can change that advice quickly but I don't think we're
> going to.
>
> Personally, I would like to see them removed into a contrib module to
> allow people to re-add them if they understand the risks. ISTM better
> to confiscate all foot-guns before they go off and then re-issue them
> to marksmen, at the risk of annoying the people that use them with
> full knowledge but that's likely a contentious issue.
>
> Alternate views?

I think we'd be better off to start by implementing unlogged indexes
generally. I have a client who is using hash indexes for the performance
gain on a very heavy insert load precisely because they are unlogged,
and who can get away with it because all their index lookup requirements
are equality. They are aware of the consequences of a crash, and can
manage the risk accordingly.

But there is a lot of attraction in the idea of unlogged btree indexes
for example. And the danger of an unlogged index is substantially less
than that of an unlogged table. After all, your data is still there,
quite crash-safe, and you can thus just rebuild the index after a crash.

cheers

andrew

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Gilles Darold 2012-10-14 14:47:40 pg_dump restore error
Previous Message Fujii Masao 2012-10-14 14:25:59 Re: pg_stat_lwlocks view - lwlocks statistics, round 2