Re: memory leak in trigger handling (since PG12)

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: memory leak in trigger handling (since PG12)
Date: 2023-05-24 08:49:14
Message-ID: a0f903db-292d-2ec3-79ea-0ab0d63ff3bc@enterprisedb.com
Lists: pgsql-hackers

On 5/23/23 23:39, Tom Lane wrote:
> Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com> writes:
>> The really hard thing was determining what causes the memory leak - the
>> simple instrumentation doesn't help with that at all. It tells you there
>> might be a leak, but you don't know where the allocations came from.
>
>> What I ended up doing is a simple gdb script that sets breakpoints on
>> all palloc/pfree variants, and prints info (including the backtrace) for
>> each call on ExecutorState. And then a script that aggregates those to
>> identify which backtraces allocated the most chunks that were not freed.
>
> FWIW, I've had some success localizing palloc memory leaks with valgrind's
> leak detection mode. The trick is to ask it for a report before the
> context gets destroyed. Beats writing your own infrastructure ...
>

I haven't tried valgrind, so can't compare.

Would it be possible to filter which memory contexts to track? Say we
know the leak is in ExecutorState, so we don't need to track allocations
in other contexts. That was a huge speedup for me - maybe it'd help
valgrind too.

Also, how did you ask for the report before the context gets destroyed?
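For reference, the aggregation step I described is conceptually simple. A
minimal sketch in Python (the log format here is hypothetical, not the
actual output of my gdb script): keep the backtrace for each allocated
chunk, drop it on free, and count what survives per backtrace.

```python
from collections import Counter

def aggregate_leaks(log_lines):
    """Count unfreed chunks per allocation backtrace.

    Each log line is assumed to look like (hypothetical format):
        alloc <ptr> <backtrace>
        free <ptr>
    """
    live = {}  # ptr -> backtrace of the allocation still live
    for line in log_lines:
        parts = line.split(maxsplit=2)
        if parts[0] == "alloc":
            ptr, backtrace = parts[1], parts[2]
            live[ptr] = backtrace
        elif parts[0] == "free":
            live.pop(parts[1], None)  # ignore frees of untracked chunks
    # Backtraces with the most surviving chunks are the leak suspects.
    return Counter(live.values()).most_common()

log = [
    "alloc 0x1 ExecEvalExpr;ExecInterpExpr",
    "alloc 0x2 AfterTriggerSaveEvent;ExecARInsertTriggers",
    "alloc 0x3 AfterTriggerSaveEvent;ExecARInsertTriggers",
    "free 0x1",
]
print(aggregate_leaks(log))
# -> [('AfterTriggerSaveEvent;ExecARInsertTriggers', 2)]
```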

>> Would require testing with more data, though. I doubt we'd find much
>> with our regression tests, which have tiny data sets.
>
> Yeah, it's not clear whether we could make the still-hypothetical check
> sensitive enough to find leaks using small test cases without getting an
> unworkable number of false positives. Still, might be worth trying.

I'm not against experimenting with that. Were you thinking about
something that'd be cheap enough to just be enabled always/everywhere,
or something we'd enable during testing?

This reminded me of a Strange Loop talk [1][2] about the Scalene memory
profiler from UMass. That's for Python, but they did some smart tricks
to reduce the cost of profiling - maybe we could do something similar,
possibly by extending the memory contexts a bit.

[1] https://youtu.be/vVUnCXKuNOg?t=1405
[2] https://youtu.be/vVUnCXKuNOg?t=1706
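To illustrate the kind of trick I have in mind (a toy model, not how
Scalene actually works): instead of recording every allocation, record
only every Nth one and scale the counts back up, trading a bit of
accuracy for much lower overhead on the hot path.

```python
from collections import Counter

def sampled_profile(allocations, sample_every=100):
    """Attribute allocation counts to call sites, recording only
    every Nth event to keep the profiling overhead low.

    `allocations` is a sequence of call-site labels; in a real
    profiler these would come from an allocator hook.
    """
    counts = Counter()
    for i, site in enumerate(allocations):
        if i % sample_every == 0:          # cheap check on the hot path
            counts[site] += sample_every   # scale the estimate back up
    return counts

# Synthetic workload: one site allocates 10x more than another.
events = ["ExecutorState"] * 100000 + ["TupleSort"] * 10000
estimate = sampled_profile(events)
print(estimate)  # the estimate preserves the relative weights
```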

> It might be an acceptable tradeoff to have stricter rules for what can
> be allocated in ExecutorState in order to make this sort of problem
> more easily detectable.
>

Would these rules be just customary, or defined/enforced in the code
somehow? I can't quite imagine how that would work, TBH.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
