Re: hashjoins vs. Bloom filters (yet again)

From: Tomas Vondra <tomas(at)vondra(dot)me>
To: Matheus Alcantara <matheusssilv97(at)gmail(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: hashjoins vs. Bloom filters (yet again)
Date: 2026-07-03 10:10:33
Message-ID: ff89ad50-4bad-43d2-9c06-4a433551f35d@vondra.me
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 7/2/26 22:31, Matheus Alcantara wrote:
> On Wed Jul 1, 2026 at 10:25 AM -03, Tomas Vondra wrote:
>> (d) At planning / execution this works similarly to v1, except that all
>> the decisions were already done. The code in createplan.c and executor
>> is more for book-keeping and allowing lookup of filters.
>>
>
> I'm wondering if the adaptive probing can mess the execution of the
> choose plan. Let's say that a plan was chosen with bloom filters to
> pushdown because it would reduce e.g 50% of the rows, what if at
> runtime the bloom filter is proved to not be effective and it get
> disabled making the scan node to produce 100% of the rows to the node
> above that is not expecting? Do we end up with the same issue "expected
> x actual rows"?
>
> I think that if we keep using the unnefective bloom filter the scan node
> will still produce more rows than expected, but perhaps this is easier
> to understand?
>

Yes. If we pick a filter expecting it to eliminate 99% of tuples, but
then find it does not eliminate any tuples, that'll affect the counts in
explain. But that's simply how misestimates work, it's not specific to
filter pushdown. It may happen for WHERE clauses etc.

The estimate would hit us at some point anyway - it's the selectivity of
the join, so even without the pushdown the cardinality would be off
above the join.

Also, what else could we do? I don't think we can do planning assuming
the estimates are off - we have to assume the estimates are OK. We might
consider how "safe" a given estimate is, and then maybe not push down
filters for "risky" ones, but we don't have any such capability.

>> I haven't incorporated the two patches posted by Andrew:
>>
>> 1) making it work with CustomScans
>>
>> 2) supporting per-key filters
>>
>> 3) allow eager creation of filters (disable delayed Hash build)
>>
>> I agree those seem like a worthwhile improvements, and the patches
>> seemed to be OK too, but I was focusing on reworking the planning. Based
>> on some off-list discussion, Andrew (or one of his colleagues) should be
>> able to adjust those for this v3 patch.
>>
>
> I'm attaching a new v4 patchset incorporating Andrew patches with test
> cases. 0001 and 0002 are your v3 untouched, 0003 is some tests added to
> exercice the CustomScan path and 0004 is the Andrew changes with a few
> changes required from v3 version:
>
> Unlike the v1 PoC that pushed filters down in create_hashjoin_plan
> (where it could simply walk the finished plan tree and accept any scan
> node), the filters are now decided during bottom-up path construction,
> so a scan only receives a filter if a filter-bearing path was generated
> for its base relation. So the main change is teaching path generation
> about the custom scan.
>

Thanks, I'll take a look early next week. My plan is to create a branch
carrying all the pieces, and then gradually refine that.

One thing I'm a bit concerned about is the CustomScan support. We don't
have a single module in the tree, so how would we test the changes? I
think it might be necessary to introduce a minimal CustomScan module, so
that we can test the new pieces.

>> In fact, I'm thinking it might be better to abstract the "filter" and
>> stop thinking just about Bloom filters. There probably are other kinds
>> of interesting filters, so maybe we should not assume all filters are
>> Bloom filters. Those filters probably won't be that different, it's more
>> about not putting "Bloom" in naming etc.
>>
>
> I was also thinking about this when reading v1. I was checking other
> databases and I realize that Trino [1] and Duckdb (I didn't find any
> documentation from Duckdb but you can see this information on explain
> analyse output) have "Dynamic Filtering" and IIUC is a similar
> optimization that this patch is about.
>
> I'm wondering if we could name as "Dynamic Filtering" or "Runtime filter
> pushdown" or something else instead of Bloom Filters to make it more
> generic.
>
> On explain(analyze) output we can show as as "Runtime filter" and put
> the columns (perhaps the values?) used on the filter.
>
> I think that this can abstract and let the implementation decide if it
> will use a bloom filter or not. It's a rough idea, but what do you
> think?
>

Yeah. I don't think the name is that important, it's more about what
kind of flexibility is needed.

> I'm still thinking about your other points and about the XXX comments.
> I'll share more soon.
>

Thanks!

--
Tomas Vondra

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Yugo Nagata 2026-07-03 10:11:16 Re: Incremental View Maintenance, take 2 (design considerations)
Previous Message Alexandre Felipe 2026-07-03 10:02:30 Re: SLOPE - Planner optimizations on monotonic expressions.