Skip site navigation (1) Skip section navigation (2)

Re: [PERFORM] "Hash index" vs. "b-tree index" (PostgreSQL

From: Mark Lewis <mark(dot)lewis(at)mir3(dot)com>
To: Mischa Sandberg <mischa(dot)sandberg(at)telus(dot)net>
Cc: pgsql-general(at)postgresql(dot)org, pgsql-performance(at)postgresql(dot)org
Subject: Re: [PERFORM] "Hash index" vs. "b-tree index" (PostgreSQL
Date: 2005-05-10 22:46:04
Message-ID: (view raw, whole thread or download thread mbox)
Lists: pgsql-generalpgsql-performance
If the original paper was published in 1984, then it's been more than 20
years.  Any potential patents would already have expired, no?

-- Mark Lewis

On Tue, 2005-05-10 at 14:35, Mischa Sandberg wrote:
> Quoting "Jim C. Nasby" <decibel(at)decibel(dot)org>:
> > Well, in a hash-join right now you normally end up feeding at least
> > one
> > side of the join with a seqscan. Wouldn't it speed things up
> > considerably if you could look up hashes in the hash index instead?
> You might want to google on "grace hash" and "hybrid hash".
> The PG hash join is the simplest possible: build a hash table in memory,
> and match an input stream against it.
> *Hybrid hash* is where you spill the hash to disk in a well-designed
> way. Instead of thinking of it as building a hash table in memory, think
> of it as partitioning one input; if some or all of it fits in memory,
> all the better. The boundary condition is the same. 
> The real wizard of hybrid hash has to be Goetz Graefe, who sadly has now
> joined the MS Borg. He demonstrated that for entire-table joins, hybrid
> hash completely dominates sort-merge. MSSQL now uses what he developed
> as an academic, but I don't know what the patent state is.
> "Grace hash" is the original implementation of hybrid hash:
>   Kitsuregawa, M., Tanaka, H., and Moto-oka, T. (1984).
>   Architecture and Performance of Relational Algebra Machine Grace. 
> ---------------------------(end of broadcast)---------------------------
> TIP 3: if posting/reading through Usenet, please send an appropriate
>       subscribe-nomail command to majordomo(at)postgresql(dot)org so that your
>       message can get through to the mailing list cleanly

In response to


pgsql-performance by date

Next:From: Tom LaneDate: 2005-05-10 22:56:21
Subject: Re: [PERFORM] "Hash index" vs. "b-tree index" (PostgreSQL
Previous:From: Jim C. NasbyDate: 2005-05-10 22:15:54
Subject: Re: Partitioning / Clustering

pgsql-general by date

Next:From: Tom LaneDate: 2005-05-10 22:56:21
Subject: Re: [PERFORM] "Hash index" vs. "b-tree index" (PostgreSQL
Previous:From: Martijn van OosterhoutDate: 2005-05-10 22:08:38
Subject: Re: Trigger that spawns forked process

Privacy Policy | About PostgreSQL
Copyright © 1996-2017 The PostgreSQL Global Development Group