Re: Improve hash join's handling of tuples with null join keys

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>
Cc:	pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject:	Re: Improve hash join's handling of tuples with null join keys
Date:	2025-08-15 16:52:06
Message-ID:	616751.1755276726@sss.pgh.pa.us
Lists:	pgsql-hackers

Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com> writes:
> With this patch, “isnull” now becomes true because of the change to the strict op, so an outer tuple with a null join key must be stored in a tuplestore. When the outer table contains many such tuples, the tuplestore could grow very large; in that case, it is hard to say that this patch is really a benefit.

What's your point? If we don't divert those tuples into the
tuplestore, then they will end up in the main hash table instead,
and the consequences of bloat there are far worse.
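
To illustrate the point above, here is a minimal Python sketch of a full outer hash join that diverts null-key tuples into side lists instead of inserting them into the hash table. It is a toy model and not PostgreSQL's C implementation: the function name, the list-based "tuplestores", and the use of `None` for SQL NULL are all assumptions made for illustration. It assumes a strict equality join operator, so a null key can never match anything.

```python
from collections import defaultdict


def full_outer_hash_join(outer, inner):
    """Toy FULL OUTER hash join on column 0; None stands in for SQL NULL.

    Hypothetical sketch: plain lists play the role of side tuplestores.
    Returns (joined rows, number of distinct keys in the hash table).
    """
    table = defaultdict(list)   # main hash table: non-null inner keys only
    inner_null_rows = []        # side store for inner rows with a null key
    for row in inner:
        if row[0] is None:
            # A null key cannot match under a strict operator, so keeping
            # it out of the hash table avoids bloating the table.
            inner_null_rows.append(row)
        else:
            table[row[0]].append([row, False])  # [row, matched flag]

    result = []
    outer_null_rows = []        # side store for outer rows with a null key
    for orow in outer:
        if orow[0] is None:
            outer_null_rows.append(orow)        # never probes the table
            continue
        entries = table.get(orow[0], [])
        if not entries:
            result.append((orow, None))         # unmatched outer row
        for entry in entries:
            entry[1] = True
            result.append((orow, entry[0]))

    # Emit the diverted and unmatched rows, null-extended on the other side.
    result.extend((orow, None) for orow in outer_null_rows)
    for entries in table.values():
        result.extend((None, row) for row, matched in entries if not matched)
    result.extend((None, row) for row in inner_null_rows)
    return result, len(table)


outer = [(1, "a"), (None, "b"), (3, "c")]
inner = [(1, "x"), (None, "y"), (None, "z"), (4, "w")]
rows, distinct_keys = full_outer_hash_join(outer, inner)
```

In this sketch the null-key rows cost only sequential appends to a list, while the hash table holds just the keys that can actually produce matches.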

> Based on this patch, if we are doing a left join and the outer table is empty, then all tuples from the inner table should be returned. In that case, we can skip building a hash table and instead put all inner-table tuples into hashtable.innerNullTupleStore. Building a tuplestore should be cheaper than building a hash table, so this would give a small additional performance improvement.

I think that would make the logic completely unintelligible. Also,
a totally-empty input relation is not a common situation. We try to
optimize such cases when it's simple to do so, but we shouldn't let
that drive the fundamental design.

regards, tom lane

In response to

Re: Improve hash join's handling of tuples with null join keys at 2025-08-14 02:36:06 from Chao Li

Responses

Re: Improve hash join's handling of tuples with null join keys at 2025-08-18 02:48:08 from Chao Li
