Re: a few crazy ideas about hash joins

From:	"Lawrence, Ramon" <ramon(dot)lawrence(at)ubc(dot)ca>
To:	"Greg Stark" <stark(at)enterprisedb(dot)com>, "Simon Riggs" <simon(at)2ndquadrant(dot)com>
Cc:	"Robert Haas" <robertmhaas(at)gmail(dot)com>, "PostgreSQL-development" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: a few crazy ideas about hash joins
Date:	2009-04-03 18:44:50
Message-ID:	6EEA43D22289484890D119821101B1DF05190DEF@exchange20.mercury.ad.ubc.ca
Lists:	pgsql-hackers

> > I would be especially interested in using a shared memory hash table
> > that *all* backends can use - if the table is mostly read-only, as
> > dimension tables often are in data warehouse applications. That would
> > give zero startup cost and significantly reduced memory.
>
> I think that's a non-starter due to visibility issues and handling
> inserts and updates. Even just reusing a hash from one execution in a
> later execution of the same plan would be tricky since we would have
> to expire it if the snapshot changes.

If your data set is nearly read-only, materialized views would be a
better way to go and would require no hash join changes.

The idea of perfect hash functions for dimension tables is very
interesting. If the data set is nearly static, a perfect hash function
for a million-tuple table can be computed once, in a few minutes, and
then reused until the data changes. Research has shown this is possible,
but I do not know whether anyone has actually implemented it in a real
DBMS. An implementation could be something to try if there is interest.
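
To make the idea concrete, here is a minimal sketch of one way to build a
perfect hash over a static key set, using a simple "hash and displace"
scheme. It is an illustration only, not PostgreSQL code and not
necessarily the construction the research uses. The names seeded_hash,
build_perfect_hash, probe and MAX_SEED_TRIES are hypothetical, SHA-256 is
chosen only because it is convenient, and keys are assumed to be unique
and to have distinct string forms.

```python
import hashlib

MAX_SEED_TRIES = 1_000_000  # arbitrary safety cap chosen for this sketch


def seeded_hash(seed, key):
    """Hash str(key) under an integer seed (SHA-256 used only for simplicity)."""
    digest = hashlib.sha256(f"{seed}:{key}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def build_perfect_hash(rows):
    """Build (seeds, slots) for a static dict of unique key -> payload."""
    n = len(rows)
    buckets = [[] for _ in range(n)]
    for key in rows:
        buckets[seeded_hash(0, key) % n].append(key)

    # Place the largest buckets first, while the table is still mostly empty.
    order = sorted(range(n), key=lambda b: len(buckets[b]), reverse=True)
    seeds = [0] * n
    slots = [None] * n

    for b in order:
        bucket = buckets[b]
        if len(bucket) <= 1:
            break  # remaining buckets hold at most one key
        seed = 1
        while True:
            placed = {}
            for key in bucket:
                s = seeded_hash(seed, key) % n
                if slots[s] is not None or s in placed:
                    break  # collision: try the next seed
                placed[s] = key
            else:
                break  # every key in the bucket found a free, distinct slot
            seed += 1
            if seed > MAX_SEED_TRIES:
                raise RuntimeError("no collision-free seed found")
        seeds[b] = seed
        for s, key in placed.items():
            slots[s] = (key, rows[key])

    # Single-key buckets take the leftover slots; a negative seed encodes
    # the slot number directly.
    free = [i for i in range(n) if slots[i] is None]
    for b in order:
        if len(buckets[b]) == 1:
            key = buckets[b][0]
            s = free.pop()
            seeds[b] = -s - 1
            slots[s] = (key, rows[key])
    return seeds, slots


def probe(seeds, slots, key):
    """Return the payload for key, or None if key is not in the table."""
    n = len(slots)
    if n == 0:
        return None
    seed = seeds[seeded_hash(0, key) % n]
    s = -seed - 1 if seed < 0 else seeded_hash(seed, key) % n
    entry = slots[s]
    # Keys outside the build set still land on some slot, so the stored
    # key must be compared before the payload is trusted.
    if entry is not None and entry[0] == key:
        return entry[1]
    return None
```

A probe then costs at most two hash evaluations and one key comparison,
with no collision chains to walk. The build step is the expensive part
and would have to be redone whenever the dimension table changes.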

--
Ramon Lawrence

In response to

Re: a few crazy ideas about hash joins at 2009-04-03 17:03:05 from Greg Stark

Responses

Re: a few crazy ideas about hash joins at 2009-04-03 21:18:40 from Grzegorz Jaskiewicz