From: | Mark Kirkwood <mark(dot)kirkwood(at)catalyst(dot)net(dot)nz> |
---|---|
To: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de> |
Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Hash Indexes |
Date: | 2016-09-16 06:56:53 |
Message-ID: | 0e0c6adb-2fd8-d0d5-6b87-71c7478bc753@catalyst.net.nz |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 16/09/16 18:35, Amit Kapila wrote:
> On Thu, Sep 15, 2016 at 7:53 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:
>> Hi,
>>
>> On 2016-05-10 17:39:22 +0530, Amit Kapila wrote:
>>> For making hash indexes usable in production systems, we need to improve
>>> its concurrency and make them crash-safe by WAL logging them.
>> One earlier question about this is whether that is actually a worthwhile
>> goal. Are the speed and space benefits big enough in the general case?
>>
> I think there will surely by speed benefits w.r.t reads for larger
> indexes. All our measurements till now have shown that there is a
> benefit varying from 30~60% (for reads) with hash index as compare to
> btree, and I think it could be even more if we further increase the
> size of index. On space front, I have not done any detailed study, so
> it is not right to conclude anything, but it appears to me that if the
> index is on char/varchar column where size of key is 10 or 20 bytes,
> hash indexes should be beneficial as they store just hash-key.
>
>> Could those benefits not be achieved in a more maintainable manner by
>> adding a layer that uses a btree over hash(columns), and adds
>> appropriate rechecks after heap scans?
>>
> I don't think it can be faster for reads than using real hash index,
> but surely one can have that as a workaround.
>
>> Note that I'm not saying that hash indexes are not worthwhile, I'm just
>> doubtful that question has been explored sufficiently.
>>
> I think theoretically hash indexes are faster than btree considering
> logarithmic complexity (O(1) vs. O(logn)), also the results after
> recent optimizations indicate that hash indexes are faster than btree
> for equal to searches. I am not saying after the recent set of
> patches proposed for hash indexes they will be better in all kind of
> cases. It could be beneficial for cases where indexed columns are not
> updated heavily.
>
> I think one can definitely argue that we can some optimizations in
> btree and make them equivalent or better than hash indexes, but I am
> not sure if it is possible for all-kind of use-cases.
>
I think having the choice for a more equality optimized index design is
desirable. Now that they are wal logged they are first class citizens so
to speak. I suspect that there are a lot of further speed optimizations
that can be considered to tease out the best performance - now that the
basics of reliability have been sorted. I think this patch/set of
patches is/are important!
regards
Mark
From | Date | Subject | |
---|---|---|---|
Next Message | Vladimir Gordiychuk | 2016-09-16 07:11:48 | Re: Stopping logical replication protocol |
Previous Message | Amit Kapila | 2016-09-16 06:42:40 | Re: Write Ahead Logging for Hash Indexes |