Re: [PERFORM] "Hash index" vs. "b-tree index" (PostgreSQL

From: Mark Lewis <mark(dot)lewis(at)mir3(dot)com>
To: Mischa Sandberg <mischa(dot)sandberg(at)telus(dot)net>
Cc: pgsql-general(at)postgresql(dot)org, pgsql-performance(at)postgresql(dot)org
Subject: Re: [PERFORM] "Hash index" vs. "b-tree index" (PostgreSQL
Date: 2005-05-10 22:46:04
Message-ID: 1115765164.13684.25.camel@amateljan.mirlogic.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-performance

If the original paper was published in 1984, then it's been more than 20
years. Any potential patents would already have expired, no?

-- Mark Lewis

On Tue, 2005-05-10 at 14:35, Mischa Sandberg wrote:
> Quoting "Jim C. Nasby" <decibel(at)decibel(dot)org>:
>
> > Well, in a hash-join right now you normally end up feeding at least
> > one
> > side of the join with a seqscan. Wouldn't it speed things up
> > considerably if you could look up hashes in the hash index instead?
>
> You might want to google on "grace hash" and "hybrid hash".
>
> The PG hash join is the simplest possible: build a hash table in memory,
> and match an input stream against it.
>
> *Hybrid hash* is where you spill the hash to disk in a well-designed
> way. Instead of thinking of it as building a hash table in memory, think
> of it as partitioning one input; if some or all of it fits in memory,
> all the better. The boundary condition is the same.
>
> The real wizard of hybrid hash has to be Goetz Graefe, who sadly has now
> joined the MS Borg. He demonstrated that for entire-table joins, hybrid
> hash completely dominates sort-merge. MSSQL now uses what he developed
> as an academic, but I don't know what the patent state is.
>
> "Grace hash" is the original implementation of hybrid hash:
> Kitsuregawa, M., Tanaka, H., and Moto-oka, T. (1984).
> Architecture and Performance of Relational Algebra Machine Grace.
>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: if posting/reading through Usenet, please send an appropriate
> subscribe-nomail command to majordomo(at)postgresql(dot)org so that your
> message can get through to the mailing list cleanly

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Tom Lane 2005-05-10 22:56:21 Re: [PERFORM] "Hash index" vs. "b-tree index" (PostgreSQL
Previous Message Martijn van Oosterhout 2005-05-10 22:08:38 Re: Trigger that spawns forked process

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2005-05-10 22:56:21 Re: [PERFORM] "Hash index" vs. "b-tree index" (PostgreSQL
Previous Message Jim C. Nasby 2005-05-10 22:15:54 Re: Partitioning / Clustering