| From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
|---|---|
| To: | Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com> |
| Cc: | pgsql-hackers(at)lists(dot)postgresql(dot)org |
| Subject: | Re: Improve hash join's handling of tuples with null join keys |
| Date: | 2026-03-03 20:58:16 |
| Message-ID: | 1290278.1772571496@sss.pgh.pa.us |
| Lists: | pgsql-hackers |

I wrote:
> Bug #19030 [1] seems to be a fresh report of the problem this patch
> aims to solve. While answering that, I realized that the v2 patch
> causes null-keyed inner rows to not be included in EXPLAIN ANALYZE's
> report of the number of rows output by the Hash node. Now on the
> one hand, what it's reporting is an accurate reflection of the
> number of rows in the hash table, which perhaps is useful. On the
> other hand, it's almost surely going to confuse users, and it's
> different from the number we produced before. Should we try to
> preserve the old behavior here? (I've not looked at what code
> changes would be needed for that.)

I got around to looking at that finally. It's not terribly difficult
to fix, but while figuring out which counters were used for what,
I noticed a pre-existing bug: when ExecHashRemoveNextSkewBucket moves
tuples into the main hash table from the skew hash table, it fails to
adjust hashtable->skewTuples, meaning that subsequent executions of
ExecHashTableInsert will have the wrong idea of how many tuples are in
the main table. The error is probably not very large because the
skew table is not supposed to be big relative to the main table,
but still, it's wrong. So I tried to clean that up here.
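
The counter drift described above can be illustrated with a toy model in C. This is only a sketch, not the code in the attached patches: `ToyHashTable`, `toy_insert`, `toy_main_tuples`, and `toy_remove_skew_bucket` are hypothetical names, and the model assumes that the main-table count is derived as total tuples minus skew tuples and that every evicted tuple stays in memory rather than being spilled to a later batch.

```c
#include <assert.h>

/*
 * Toy stand-in for the two counters discussed above.
 * Hypothetical type; this is not PostgreSQL's hash table struct.
 */
typedef struct ToyHashTable
{
    double total_tuples;    /* every tuple loaded so far, skew plus main */
    double skew_tuples;     /* tuples currently held in skew buckets */
} ToyHashTable;

/* Number of tuples the main table is believed to hold (assumed derivation). */
static double
toy_main_tuples(const ToyHashTable *ht)
{
    return ht->total_tuples - ht->skew_tuples;
}

/* Load one tuple, either into a skew bucket or into the main table. */
static void
toy_insert(ToyHashTable *ht, int into_skew)
{
    ht->total_tuples += 1;
    if (into_skew)
        ht->skew_tuples += 1;
}

/*
 * Evict a skew bucket holding ntuples and move them to the main table.
 * With adjust_counter == 0 this mimics the stale-counter behavior:
 * the tuples now live in the main table but are still counted as skew.
 */
static void
toy_remove_skew_bucket(ToyHashTable *ht, double ntuples, int adjust_counter)
{
    assert(ntuples <= ht->skew_tuples);
    if (adjust_counter)
        ht->skew_tuples -= ntuples;
    /* total_tuples is unchanged: the tuples only changed location */
}
```

In this model, after loading 10 main-table tuples and 4 skew tuples and then evicting the skew bucket, the unadjusted variant still reports 10 main-table tuples while 14 are actually there; the adjusted variant reports 14.
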
0001 attached is the same patch as before (brought up to HEAD, but
only line numbers change). 0002 is the new code to fix these
tuple-counting issues.

regards, tom lane

| Attachment | Content-Type | Size |
|---|---|---|
| v3-0001-Improve-hash-join-s-handling-of-tuples-with-null-.patch | text/x-diff | 36.2 KB |
| v3-0002-Fix-tuple-counting-issues-in-hash-joins.patch | text/x-diff | 7.5 KB |