Re: WIP: bloom filter in Hash Joins with batches

From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP: bloom filter in Hash Joins with batches
Date: 2016-01-10 21:38:22
Message-ID: CAKJS1f9hCxfYrvxpEWROahi8jhmg47nFO0jdbd=aAupLkQ2JoQ@mail.gmail.com
Lists: pgsql-hackers

On 11 January 2016 at 09:30, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
wrote:

> Hi,
>
> On 01/10/2016 04:03 AM, Peter Geoghegan wrote:
>
>> On Sat, Jan 9, 2016 at 4:08 PM, Peter Geoghegan <pg(at)heroku(dot)com> wrote:
>>
>> Also, are you aware of this?
>>
>>
>> http://www.nus.edu.sg/nurop/2010/Proceedings/SoC/NUROP_Congress_Cheng%20Bin.pdf
>>
>> It talks about bloom filters for hash joins in PostgreSQL
>> specifically. Interestingly, they talk about specific TPC-H queries.
>>
>
> Interesting. The way that paper uses bloom filters is very different from
> what I do in the patch. They build the bloom filters and then propagate
> them into the scan nodes to eliminate the tuples early.
>
>
That does sound interesting, but unless I'm mistaken, doing that would mean
abandoning the cheaper approach in the current patch, where the bloom filter
hashes the already-computed hash value. Instead you'd have to hash the
complete value in the scan node, then hash it again for every tuple that
makes it into the hash join node. That does not sound like it would be a win
when hashing longer varlena values.
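
To make the cost difference concrete, here is a rough sketch, not code from the
patch. All names (Bloom, mix32, hash_bytes, bloom_add_hash,
bloom_maybe_contains_hash) and the filter size are made up for illustration.
The point is that once a 32-bit hash value exists, deriving the bloom filter
bit positions from it is constant-time, whereas hashing the complete value
costs time proportional to its length, and a scan-node filter would need that
full hash before the hash join computes its own.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOOM_BITS    1024u   /* hypothetical filter size, power of two */
#define BLOOM_NHASHES 2       /* hypothetical number of probe positions */

typedef struct Bloom
{
    uint8_t bits[BLOOM_BITS / 8];
} Bloom;

/* Cheap 32-bit mixer (the MurmurHash3 finalizer); O(1) per call. */
static uint32_t
mix32(uint32_t h)
{
    h ^= h >> 16;
    h *= 0x85ebca6bu;
    h ^= h >> 13;
    h *= 0xc2b2ae35u;
    h ^= h >> 16;
    return h;
}

/* Full-value hash (FNV-1a); cost grows with len, as for long varlena values. */
static uint32_t
hash_bytes(const unsigned char *key, size_t len)
{
    uint32_t h = 2166136261u;

    for (size_t i = 0; i < len; i++)
    {
        h ^= key[i];
        h *= 16777619u;
    }
    return h;
}

/* Add an entry using only the existing hash value; no access to the key. */
static void
bloom_add_hash(Bloom *f, uint32_t hashvalue)
{
    for (uint32_t i = 0; i < BLOOM_NHASHES; i++)
    {
        uint32_t pos = mix32(hashvalue + i * 0x9e3779b9u) % BLOOM_BITS;

        f->bits[pos >> 3] |= (uint8_t) (1u << (pos & 7));
    }
}

/* Returns 0 if definitely absent, 1 if possibly present. */
static int
bloom_maybe_contains_hash(const Bloom *f, uint32_t hashvalue)
{
    for (uint32_t i = 0; i < BLOOM_NHASHES; i++)
    {
        uint32_t pos = mix32(hashvalue + i * 0x9e3779b9u) % BLOOM_BITS;

        if ((f->bits[pos >> 3] & (1u << (pos & 7))) == 0)
            return 0;
    }
    return 1;
}
```

In this sketch the hash join side would call hash_bytes() once per tuple and
reuse the result for both the hash table and the filter. A filter probed in the
scan node would have to call something like hash_bytes() there too, and the
tuples that pass would be hashed again in the join unless the value were
carried along somehow.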

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
