Re: Hash Joins vs. Bloom Filters / take 2

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: Claudio Freire <klaussfreire(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Hash Joins vs. Bloom Filters / take 2
Date: 2018-02-22 21:29:12
Message-ID: CAH2-Wzko6NU=4yBVGaJgBvR+fSOiRjJRyk8GWtow-S5n++FxBQ@mail.gmail.com
Lists: pgsql-hackers

On Thu, Feb 22, 2018 at 1:14 PM, Tomas Vondra
<tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
> OK, thanks for reminding me about SBF and for the discussion.
>
> At this point I'll probably focus on the other parts though -
> determining selectivity of the join, etc. Which I think is crucial, and
> we need to get that right even for accurate estimates. It's good to know
> that we have a solution for that part, though.

+1

There are probably significant opportunities to improve the Bloom
filter. That isn't that interesting right now, though. Figuring out
how scalable Bloom filters might save hash join from being reliant on
the accuracy of the initial estimate of set cardinality seems
premature at best, since we haven't established how sensitive this
use-case is to misestimations. My sense is that it's actually
naturally very insensitive, but there is no need to spend too much
time on it just yet.
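
As a rough illustration of what "sensitive to misestimations" means for a plain (non-scalable) Bloom filter, the sketch below sizes a filter for an estimated cardinality using the textbook formulas, then evaluates the expected false-positive rate when the actual number of inserted elements is some multiple of that estimate. The function names, the 2% target rate, and the 1,000,000-element estimate are illustrative assumptions, not anything taken from the patch. It also says nothing about how much a degraded filter actually costs a hash join, which is the open question here.

```python
import math


def optimal_bits(n_est, target_fpr):
    """Bits needed for n_est elements at target_fpr (standard Bloom filter sizing)."""
    return math.ceil(-n_est * math.log(target_fpr) / (math.log(2) ** 2))


def optimal_hashes(m, n_est):
    """Number of hash functions that minimizes the false-positive rate for m bits."""
    return max(1, round(m / n_est * math.log(2)))


def false_positive_rate(m, k, n_actual):
    """Expected false-positive rate after inserting n_actual elements."""
    return (1.0 - math.exp(-k * n_actual / m)) ** k


def fpr_under_misestimate(n_est, target_fpr, factors=(1, 2, 4, 8)):
    """Size the filter for n_est, then report the rate if the true
    cardinality is n_est * factor for each (hypothetical) factor."""
    m = optimal_bits(n_est, target_fpr)
    k = optimal_hashes(m, n_est)
    return {f: false_positive_rate(m, k, n_est * f) for f in factors}


if __name__ == "__main__":
    # Assumed inputs: estimate of 1M build-side rows, 2% target rate.
    for factor, fpr in fpr_under_misestimate(1_000_000, 0.02).items():
        print(f"actual = {factor}x estimate -> false-positive rate ~ {fpr:.3f}")
```

The filter's false-positive rate climbs quickly once the estimate is too low, but even a filter that only rejects a fraction of non-matching probes may still pay for itself, so these numbers alone do not settle how sensitive the hash join use-case is.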

It just makes sense, as a matter of procedure, to focus on the hash
join code, and drill down from there. Personally, I'm tired of talking
about the nitty-gritty details of Bloom filters rather than the actual
problem at hand.

--
Peter Geoghegan
