Re: 回复:BUG #19040: Memory leak in hashed subplan node due to missing hashtempcxt reset

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: 李海洋(陌痕) <mohen(dot)lhy(at)alibaba-inc(dot)com>
Cc: "feichanghong" <feichanghong(at)qq(dot)com>, "ocean_li_996" <ocean_li_996(at)163(dot)com>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: 回复:BUG #19040: Memory leak in hashed subplan node due to missing hashtempcxt reset
Date: 2025-09-09 22:24:22
Message-ID: 1239241.1757456662@sss.pgh.pa.us
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

"=?UTF-8?B?5p2O5rW35rSLKOmZjOeXlSk=?=" <mohen(dot)lhy(at)alibaba-inc(dot)com> writes:
> For v04-0002, I checked the commit history, and before commit 133924e1,
> buildSubPlanHash was using ecxt_per_tuple_memory. It seems that the
> "Subplan HashTable Temp Context" was introduced in order to fix a certain bug.

You're quite right that 0002 looks suspiciously like it's reverting
133924e1. However, it doesn't cause the test case added by that
commit to fail (even under Valgrind). I think the reason is what
you say next:

> However, the changed behavior of TupleHashTableMatch introduced by commit
> bf6c614a (noted in [1]) may make the condition:
> ```
> However, the hashtable routines feel free to reset their temp context at any time,
> which'd lead to destroying input data that was still needed.
> ```
> no longer holds true. Then, the lifespan of tempcxt in buildHashtable is similar
> to that of innercontext->ecxt_per_tuple_memory, so it makes sense to merge the two,
> I think.

Looking around in 133924e1^, the only place in execGrouping.c that
would reset the hashtable's tempcxt was execTuplesMatch (which was
called by TupleHashTableMatch). But bf6c614a removed execTuplesMatch.
The modern execGrouping.c code resets hashtable->tempcxt nowhere.
Instead, TupleHashTableMatch applies ExecQualAndReset to
hashtable->exprcontext, so that what is reset is the
ecxt_per_tuple_memory of that econtext --- but that's manufactured by
CreateStandaloneExprContext in BuildTupleHashTable, and has no
connection at all to any context that nodeSubplan.c will touch.
It is certainly not the same ecxt_per_tuple_memory that pre-133924e1
was resetting.

So basically I'm saying that bf6c614a made it okay to revert
133924e1.

> BTW, I ran the test case supported in commit 133924e1 on version not contained commit
> 133924e1 (tag REL8_0_26). I did not find any problems. But i can not find more information
> about this issue.

Digging in the archives, I found the discussion leading to 133924e1 at
https://www.postgresql.org/message-id/flat/i2jnbo%241lcj%241%40news.hub.org

As for trying it on 8.0.26, the commit message for 133924e1 says
specifically that the problem isn't there pre-8.1. I did try to
duplicate your test using 133924e1^, and it didn't fail for me either.
But I'm not too excited about that, because building PG versions this
old on modern platforms is really hard. I had to compile with -O0
to get a build that worked at all, and that's already a significant
difference from the code we would have been testing back in 2010.
It may be that the reason for non-reproduction is buried somewhere in
that, or in the different compiler toolchain. I'm not sure it's worth
a lot of effort to dig deeply into the details.

regards, tom lane

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message 李海洋 (陌痕) 2025-09-10 01:55:52 回复:回复:BUG #19040: Memory leak in hashed subplan node due to missing hashtempcxt reset
Previous Message PG Bug reporting form 2025-09-09 15:13:41 BUG #19045: Applying custom collation rules appears to erase existing rules