Re: WIP: bloom filter in Hash Joins with batches

From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: "Shulgin, Oleksandr" <oleksandr(dot)shulgin(at)zalando(dot)de>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP: bloom filter in Hash Joins with batches
Date: 2015-12-28 02:15:53
Message-ID: CAKJS1f8qTxf2nqFLQ-koXeUWNxJX0b63ceaRcDz-F68hNzy9dA@mail.gmail.com
Lists: pgsql-hackers

On 18 December 2015 at 04:34, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
wrote:

> I think ultimately we'll need to measure the false positive rate, so that
> we can use it to dynamically disable the bloom filter if it gets
> inefficient. Also maybe put some of that into EXPLAIN ANALYZE.
>

I'm not so convinced that will be a good idea. What if the filter does not
help much to start with, so we disable it, and then later in the scan we get
different probe values which the bloom filter would have helped to
eliminate?

Maybe it would be better to, once the filter is built, simply count the
number of 1 bits and only use the filter if the number of 1 bits is below
<threshold>, relative to the size of the filter in bits. There's
functionality in bms_num_members() to do this, and there's also
__builtin_popcount() in newer versions of GCC, which we could have some
wrapper around, perhaps.
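
To make the idea concrete, here is a rough sketch of what such a check might look like. It is not taken from the patch: the layout of the filter as an array of 64-bit words, the function names popcount64(), filter_count_set_bits() and filter_is_worth_using(), and the max_fill parameter are all made up for illustration. The only real call is the GCC/Clang builtin __builtin_popcountll(), with a portable loop as a fallback.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical wrapper: use the compiler builtin where available,
 * otherwise fall back to a portable bit-clearing loop.
 */
static inline int
popcount64(uint64_t word)
{
#if defined(__GNUC__) || defined(__clang__)
	return __builtin_popcountll(word);
#else
	int			count = 0;

	while (word != 0)
	{
		word &= word - 1;		/* clear the lowest set bit */
		count++;
	}
	return count;
#endif
}

/* Count the 1 bits in a filter stored as an array of 64-bit words. */
static size_t
filter_count_set_bits(const uint64_t *words, size_t nwords)
{
	size_t		nset = 0;

	for (size_t i = 0; i < nwords; i++)
		nset += (size_t) popcount64(words[i]);
	return nset;
}

/*
 * Decide once, after the build phase: use the filter only if the
 * fraction of set bits is below max_fill (an assumed tunable, e.g. 0.5).
 */
static bool
filter_is_worth_using(const uint64_t *words, size_t nwords, double max_fill)
{
	size_t		nbits = nwords * 64;

	if (nbits == 0)
		return false;
	return (double) filter_count_set_bits(words, nwords) <
		max_fill * (double) nbits;
}
```

The decision here depends only on how full the filter is after the build, not on which probe values happen to arrive first, so it would not suffer from the problem described above.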

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
