| From: | "Joel Jacobson" <joel(at)compiler(dot)org> |
|---|---|
| To: | "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Chao Li" <li(dot)evan(dot)chao(at)gmail(dot)com> |
| Cc: | pgsql-hackers(at)lists(dot)postgresql(dot)org |
| Subject: | Re: Improve hash join's handling of tuples with null join keys |
| Date: | 2026-03-04 07:04:38 |
| Message-ID: | 3ccec807-d2ff-4150-9585-74cc52926646@app.fastmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On Tue, Mar 3, 2026, at 21:58, Tom Lane wrote:
> I wrote:
>> Bug #19030 [1] seems to be a fresh report of the problem this patch
>> aims to solve. While answering that, I realized that the v2 patch
>> causes null-keyed inner rows to not be included in EXPLAIN ANALYZE's
>> report of the number of rows output by the Hash node. Now on the
>> one hand, what it's reporting is an accurate reflection of the
>> number of rows in the hash table, which perhaps is useful. On the
>> other hand, it's almost surely going to confuse users, and it's
>> different from the number we produced before. Should we try to
>> preserve the old behavior here? (I've not looked at what code
>> changes would be needed for that.)
>
> I got around to looking at that finally. It's not terribly difficult
> to fix, but while figuring out which counters were used for what,
> I noticed a pre-existing bug: when ExecHashRemoveNextSkewBucket moves
> tuples into the main hash table from the skew hash table, it fails to
> adjust hashtable->skewTuples, meaning that subsequent executions of
> ExecHashTableInsert will have the wrong idea of how many tuples are in
> the main table. The error is probably not very large because the
> skew table is not supposed to be big relative to the main table,
> but still, it's wrong. So I tried to clean that up here.
>
> 0001 attached is the same patch as before (brought up to HEAD, but
> only line numbers change). 0002 is the new code to fix these
> tuple-counting issues.
>
> regards, tom lane
I've tested v3-0001 and v3-0002 and can confirm the bug introduced
in v3-0001 is fixed in v3-0002:
% cat explain-analyze-problem.sql
CREATE TABLE ea_hash (id int);
INSERT INTO ea_hash SELECT g FROM generate_series(1, 10) g;
INSERT INTO ea_hash SELECT NULL FROM generate_series(1, 90);
ANALYZE ea_hash;
CREATE TABLE ea_probe (id int);
INSERT INTO ea_probe SELECT (g % 10) + 1 FROM generate_series(1, 10000) g;
ANALYZE ea_probe;
SET enable_nestloop = off;
SET enable_mergejoin = off;
EXPLAIN (COSTS OFF, ANALYZE, TIMING OFF, BUFFERS OFF, SUMMARY OFF)
SELECT count(*) FROM ea_probe FULL OUTER JOIN ea_hash ON ea_probe.id = ea_hash.id;
EXPLAIN (COSTS OFF, ANALYZE, TIMING OFF, BUFFERS OFF, SUMMARY OFF)
SELECT count(*) FROM ea_probe RIGHT OUTER JOIN ea_hash ON ea_probe.id = ea_hash.id;
% git diff --no-index master.out v3-0001.out
diff --git a/master.out b/v3-0001.out
index 1e05e7e39a6..54210c49757 100644
--- a/master.out
+++ b/v3-0001.out
@@ -17,8 +17,8 @@ SET
-> Hash Full Join (actual rows=10090.00 loops=1)
Hash Cond: (ea_probe.id = ea_hash.id)
-> Seq Scan on ea_probe (actual rows=10000.00 loops=1)
- -> Hash (actual rows=100.00 loops=1)
- Buckets: 1024 Batches: 1 Memory Usage: 12kB
+ -> Hash (actual rows=10.00 loops=1)
+ Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Seq Scan on ea_hash (actual rows=100.00 loops=1)
(7 rows)
@@ -29,8 +29,8 @@ SET
-> Hash Right Join (actual rows=10090.00 loops=1)
Hash Cond: (ea_probe.id = ea_hash.id)
-> Seq Scan on ea_probe (actual rows=10000.00 loops=1)
- -> Hash (actual rows=100.00 loops=1)
- Buckets: 1024 Batches: 1 Memory Usage: 12kB
+ -> Hash (actual rows=10.00 loops=1)
+ Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Seq Scan on ea_hash (actual rows=100.00 loops=1)
(7 rows)
% git diff --no-index v3-0001.out v3-0002.out
diff --git a/v3-0001.out b/v3-0002.out
index 54210c49757..17dfe335b0b 100644
--- a/v3-0001.out
+++ b/v3-0002.out
@@ -17,7 +17,7 @@ SET
-> Hash Full Join (actual rows=10090.00 loops=1)
Hash Cond: (ea_probe.id = ea_hash.id)
-> Seq Scan on ea_probe (actual rows=10000.00 loops=1)
- -> Hash (actual rows=10.00 loops=1)
+ -> Hash (actual rows=100.00 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Seq Scan on ea_hash (actual rows=100.00 loops=1)
(7 rows)
@@ -29,7 +29,7 @@ SET
-> Hash Right Join (actual rows=10090.00 loops=1)
Hash Cond: (ea_probe.id = ea_hash.id)
-> Seq Scan on ea_probe (actual rows=10000.00 loops=1)
- -> Hash (actual rows=10.00 loops=1)
+ -> Hash (actual rows=100.00 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Seq Scan on ea_hash (actual rows=100.00 loops=1)
(7 rows)
% git diff --no-index master.out v3-0002.out
diff --git a/master.out b/v3-0002.out
index 1e05e7e39a6..17dfe335b0b 100644
--- a/master.out
+++ b/v3-0002.out
@@ -18,7 +18,7 @@ SET
Hash Cond: (ea_probe.id = ea_hash.id)
-> Seq Scan on ea_probe (actual rows=10000.00 loops=1)
-> Hash (actual rows=100.00 loops=1)
- Buckets: 1024 Batches: 1 Memory Usage: 12kB
+ Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Seq Scan on ea_hash (actual rows=100.00 loops=1)
(7 rows)
@@ -30,7 +30,7 @@ SET
Hash Cond: (ea_probe.id = ea_hash.id)
-> Seq Scan on ea_probe (actual rows=10000.00 loops=1)
-> Hash (actual rows=100.00 loops=1)
- Buckets: 1024 Batches: 1 Memory Usage: 12kB
+ Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Seq Scan on ea_hash (actual rows=100.00 loops=1)
(7 rows)
/Joel
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Hayato Kuroda (Fujitsu) | 2026-03-04 07:07:51 | RE: BUG: Former primary node might stuck when started as a standby |
| Previous Message | Daniil Davydov | 2026-03-04 06:58:49 | Re: POC: Parallel processing of indexes in autovacuum |