Re: [PERFORM] "Hash index" vs. "b-tree index" (PostgreSQL

From: "Jim C(dot) Nasby" <decibel(at)decibel(dot)org>
To: Neil Conway <neilc(at)samurai(dot)com>
Cc: Christopher Petrilli <petrilli(at)gmail(dot)com>, Ying Lu <ying_lu(at)cs(dot)concordia(dot)ca>, pgsql-general(at)postgresql(dot)org, pgsql-performance(at)postgresql(dot)org
Subject: Re: [PERFORM] "Hash index" vs. "b-tree index" (PostgreSQL
Date: 2005-05-10 15:32:45
Message-ID: 20050510153245.GA31103@decibel.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-performance

On Tue, May 10, 2005 at 10:14:11AM +1000, Neil Conway wrote:
> Jim C. Nasby wrote:
> >> No, hash joins and hash indexes are unrelated.
> >I know they are now, but does that have to be the case?
>
> I mean, the algorithms are fundamentally unrelated. They share a bit of
> code such as the hash functions themselves, but they are really solving
> two different problems (disk based indexing with (hopefully) good
> concurrency and WAL logging vs. in-memory joins via hashing with spill
> to disk if needed).

Well, in a hash-join right now you normally end up feeding at least one
side of the join with a seqscan. Wouldn't it speed things up
considerably if you could look up hashes in the hash index instead? That
way you can eliminate going to the heap for any hashes that match. Of
course, if limited tuple visibility info was added to hash indexes
(similar to what I think is currently happening to B-tree's), many of
the heap scans could be eliminated as well. A similar method could also
be used for hash aggregates, assuming they use the same hash.
--
Jim C. Nasby, Database Consultant decibel(at)decibel(dot)org
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Douglas McNaught 2005-05-10 15:35:18 Re: pg_dump fails on 7.4 Postgres
Previous Message Michael Glaesemann 2005-05-10 15:24:49 Loading a list of SQL scripts with relative paths

Browse pgsql-performance by date

  From Date Subject
Next Message Jim C. Nasby 2005-05-10 15:39:27 Re: [PERFORM] "Hash index" vs. "b-tree index" (PostgreSQL
Previous Message Alex Stapleton 2005-05-10 15:24:23 Re: Partitioning / Clustering