Re: bitmap AM design

From: pgsql(at)mohawksoft(dot)com
To: "Neil Conway" <neilc(at)samurai(dot)com>
Cc: "Pailloncy Jean-Gerard" <jg(at)rilk(dot)com>, "Hannu Krosing" <hannu(at)tm(dot)ee>, pgsql(at)mohawksoft(dot)com, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Pg Hackers" <pgsql-hackers(at)postgresql(dot)org>, "Victor Y(dot) Yegorov" <viy(at)mits(dot)lv>
Subject: Re: bitmap AM design
Date: 2005-03-04 12:29:33
Message-ID: 16807.24.91.171.78.1109939373.squirrel@mail.mohawksoft.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> Pailloncy Jean-Gerard wrote:
>> You should have a look to this thread
>> http://archives.postgresql.org/pgsql-hackers/2005-02/msg00263.php
>>
>> Take a look at this paper about "lock-free parallel hash table"
>> http://www.cs.rug.nl/~wim/mechver/hashtable/
>
> Is this relevant? Hash indexes are on-disk data structures, so ISTM
> lock-free algorithms aren't really applicable.
>
> (BTW, is poor concurrency really the biggest issue with hash indexes? If
> so, there is some low-hanging fruit that I noticed a few years ago, but
> never got around to fixing: _hash_doinsert() write-locks the hash
> metapage on every insertion merely to increment a tuple counter. This
> could be improved by just acquiring the lock with probability 1/k, and
> incrementing the counter k times -- or some similar statistical
> approximation. IMHO there are bigger issues with hash indexes, like the
> lack of WAL safety, the very slow index build times, and their
> relatively poor performance -- i.e. no better than btree for any
> workload I've seen. If someone wants to step up to the plate and fix
> some of that, I'll improve hash index concurrency -- any takers? :-) )
>

As always, I'd love to have the time to do this stuff, but as always, I'm
not in the position to spend any time on it. It's frustrating.

Anyway, IMHO, hash indexes would be dramatically improved if you could
specify your own hashing function and declare initial table size. If you
could do that, and work on an assumption that the hashing index was for
fairly static data, it could handle many needs. As it stands, hash indexes
are virtually useless.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Magnus Hagander 2005-03-04 13:46:21 Re: refactoring fork() and EXEC_BACKEND
Previous Message Neil Conway 2005-03-04 09:47:03 Re: bitmap AM design