I've attached 'v01_fix_memory_leak_in_hashed_subplan_node.patch' to address this.
--
Thanks,
Haiyang Li
------------------------------------------------------------------
From: PG Bug reporting form <noreply(at)postgresql(dot)org>
Sent: Wednesday, September 3, 2025, 01:27
To: "pgsql-bugs" <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Cc: "李海洋(陌痕)" <mohen(dot)lhy(at)alibaba-inc(dot)com>
Subject: BUG #19040: Memory leak in hashed subplan node due to missing hashtempcxt reset
The following bug has been logged on the website:
Bug reference: 19040
Logged by: haiyang li
Email address: mohen(dot)lhy(at)alibaba-inc(dot)com
PostgreSQL version: 18beta3
Operating system: centos7 5.10.84 x86_64
Description:
Hello, all!
I found a query that consumes a lot of memory and triggers the OOM killer.
The memory leak occurs in the hashed subplan node.
I was able to create a reproducible test case on a machine with the default
config and PostgreSQL 18beta3:
CREATE TABLE test1 (
    a numeric,
    b int
);

INSERT INTO test1
SELECT i, i
FROM generate_series(1, 30000000) i;  -- make the running time longer

EXPLAIN ANALYZE
SELECT *
FROM test1
WHERE a NOT IN (
    SELECT i
    FROM generate_series(1, 10000) i
);
plan:
                                                              QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------
 Seq Scan on test1  (cost=125.00..612432.24 rows=15000108 width=10) (actual time=135.191..25832.808 rows=29990000 loops=1)
   Filter: (NOT (hashed SubPlan 1))
   Rows Removed by Filter: 10000
   SubPlan 1
     ->  Function Scan on generate_series i  (cost=0.00..100.00 rows=10000 width=4) (actual time=36.999..38.296 rows=10000 loops=1)
 Planning Time: 0.280 ms
 JIT:
   Functions: 15
   Options: Inlining true, Optimization true, Expressions true, Deforming true
   Timing: Generation 1.155 ms, Inlining 25.929 ms, Optimization 60.700 ms, Emission 23.018 ms, Total 110.802 ms
 Execution Time: 28217.026 ms
(11 rows)
Using 'top -p <pid>', I observed that the process's RES (resident memory)
increased rapidly during SQL execution.
Furthermore, while the query was still running, I ran 'select
pg_log_backend_memory_contexts(<pid>)' to print memory context statistics
(a rough recipe is shown after the log excerpt below). The context with
abnormally high memory usage was "Subplan HashTable Temp Context". The key
part of the log is as follows:
...
LOG: level: 5; Subplan HashTable Temp Context: 514834432 total in 62849 blocks; 973712 free (60695 chunks); 513860720 used
LOG: level: 5; Subplan HashTable Context: 524288 total in 7 blocks; 123808 free (5 chunks); 400480 used
...
Grand total: 518275344 bytes in 63116 blocks; 2025560 free (60976 chunks); 516249784 used
...
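For reference, the observation can be repeated from a second session roughly
like this; the WHERE clause used to locate the backend is only an example,
and 12345 stands for whatever pid pg_stat_activity reports:

-- in a second session, find the backend running the test query
SELECT pid, query FROM pg_stat_activity WHERE query LIKE '%NOT IN%';

-- dump its memory context statistics into the server log
-- (replace 12345 with the pid returned above)
SELECT pg_log_backend_memory_contexts(12345);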
If I change the query from "a NOT IN" to "b NOT IN" and repeat the same
steps, I do not observe abnormally high memory usage. The corresponding part
of the log is as follows:
...
LOG: level: 5; Subplan HashTable Temp Context: 1024 total in 1 blocks; 784 free (0 chunks); 240 used
LOG: level: 5; Subplan HashTable Context: 524288 total in 7 blocks; 123808 free (5 chunks); 400480 used
...
Grand total: 3441936 bytes in 268 blocks; 1050520 free (281 chunks); 2391416 used
...
While analyzing the source code, I found that the hashed subplan node fails
to reset the 'hashtempcxt' context after probing the hash table for each
outer slot. When variable-length datatypes (e.g., numeric) are processed,
probing can trigger calls to 'detoast_attr', which allocate memory in
'hashtempcxt'. Without a reset, this memory is not reclaimed until the
context itself is destroyed, resulting in a memory leak when a large number
of slots is processed.
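To make the idea concrete, here is a rough sketch (not the attached patch
itself, and with most of the existing logic elided) of where such a reset
could go in ExecHashSubPlan() in src/backend/executor/nodeSubplan.c.
ExecHashSubPlan, SubPlanState, and hashtempcxt are existing names; the
placement and surrounding comments are only illustrative:

static Datum
ExecHashSubPlan(SubPlanState *node, ExprContext *econtext, bool *isNull)
{
	TupleTableSlot *slot;
	Datum		result;

	/* ... build node->hashtable / node->hashnulls on the first call ... */

	/* Project the lefthand (outer) comparison values into a slot. */
	slot = ExecProject(node->projLeft);

	/*
	 * ... probe the hash table(s).  Hashing/comparing a varlena value
	 * such as numeric may call detoast_attr(), and those detoasted
	 * copies are allocated in node->hashtempcxt ...
	 */

	/*
	 * Idea of the fix: discard the per-probe allocations before
	 * returning, so they do not accumulate over millions of outer
	 * tuples.
	 */
	MemoryContextReset(node->hashtempcxt);

	/* ... set *isNull and return the lookup result ... */
	return result;
}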
A patch implementing this fix will be included in the follow-up email.
--
Thanks,
Haiyang Li