Re: a few crazy ideas about hash joins

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: a few crazy ideas about hash joins
Date: 2009-04-03 16:41:59
Message-ID: 1238776919.5444.201.camel@ebony.2ndQuadrant
Lists: pgsql-hackers


On Thu, 2009-04-02 at 22:08 -0400, Robert Haas wrote:

> 3. Avoid building the exact same hash table twice in the same query.
> This happens more often than you'd think. For example, a table may have
> two columns creator_id and last_updater_id which both reference person
> (id). If you're considering a hash join between paths A and B, you
> could conceivably check whether what is essentially a duplicate of B
> has already been hashed somewhere within path A. If so, you can reuse
> that same hash table at zero startup-cost.
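To make that concrete, a query of the following shape (table and column
names are purely illustrative, following the example above) would today
build two separate but identical hash tables on person, assuming the
planner picks hash joins for both:

  -- ticket carries two foreign keys into person; if both joins are
  -- executed as hash joins with person on the inner side, the same
  -- hash table on person(id) gets built twice.
  SELECT t.id, c.name AS created_by, u.name AS last_updated_by
  FROM ticket t
  JOIN person c ON c.id = t.creator_id
  JOIN person u ON u.id = t.last_updater_id;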

This is also interesting because there is potential to save memory with
that approach, which would allow us to set work_mem higher and avoid
multi-batch hash joins altogether.

I would be especially interested in a shared memory hash table that
*all* backends can use - if the table is mostly read-only, as
dimension tables often are in data warehouse applications. That would
give zero startup cost and significantly reduced memory use.
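For illustration only (schema and names are hypothetical), the workload
I have in mind is many backends concurrently running star-schema queries
like this one, where the dimension tables are small and rarely updated:

  -- Each backend currently builds its own private hash tables on
  -- dim_date and dim_product; a shared, read-mostly hash table would
  -- be built once and reused by every backend.
  SELECT d.month, p.category, sum(f.amount)
  FROM fact_sales f
  JOIN dim_date d ON d.date_id = f.date_id
  JOIN dim_product p ON p.product_id = f.product_id
  GROUP BY d.month, p.category;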

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support
