Re: memory leak in trigger handling (since PG12)

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Jakub Wartak <jakub(dot)wartak(at)enterprisedb(dot)com>, Andres Freund <andres(at)anarazel(dot)de>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: memory leak in trigger handling (since PG12)
Date: 2023-05-24 13:17:52
Message-ID: 9574e4b6-3334-777b-4287-29d81151963a@enterprisedb.com
Lists: pgsql-hackers

On 5/24/23 10:19, Jakub Wartak wrote:
> Hi, just two cents:
>
> On Tue, May 23, 2023 at 8:01 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
>>
>> Hi,
>>
>> On 2023-05-23 13:28:30 -0400, Tom Lane wrote:
>>> Andres Freund <andres(at)anarazel(dot)de> writes:
>>>> Could it help to have a mode where the executor shutdown hook checks how much
>>>> memory is allocated in ExecutorState and warns if it's too much?
>>>
>>> It'd be very hard to set a limit for what's "too much", since the amount
>>> of stuff created initially will depend on the plan size.
>>
>> I was thinking of some limit that should really never be reached outside of a
>> leak or work_mem based allocations, say 2GB or so.
>
> RE: the instrumentation subthread:
> if that helps, then the technique below can work reasonably well on
> normal binaries for end users (given that debug symbols are installed),
> so maybe we don't need that much added infrastructure to see the hot
> code path:
>
> # palloc's size argument (arg0) is passed in RDI on x86_64
> perf probe -x /path/to/postgres 'palloc' 'size=%di:u64'
> # cannot run longer, huge overhead (~3s = ~2GB)
> perf record -avg --call-graph dwarf -e probe_postgres:palloc -aR -p <pid> sleep 3
>
> it produces:
> 50.27% (563d0e380670) size=24
> |
> ---palloc
> bms_copy
> ExecUpdateLockMode
> ExecBRUpdateTriggers
> ExecUpdate
> [..]
>
> 49.73% (563d0e380670) size=16
> |
> ---palloc
> bms_copy
> RelationGetIndexAttrBitmap
> ExecUpdateLockMode
> ExecBRUpdateTriggers
> ExecUpdate
> [..]
>
> Now we know that those small palloc() calls are the culprit, but Tomas
> and I didn't know that at the time. The problem here is that palloc()
> does not reveal - via its own arguments - which MemoryContext the
> allocation is going to land in. This is problematic for perf, because
> on RHEL8 I was not able to create an uprobe capable of reading a
> global variable (CurrentMemoryContext) at that time.
>

I think there are a couple even bigger issues:

(a) There are other functions that allocate memory - repalloc, palloc0,
MemoryContextAlloc, and so on. But presumably we can trace all of them
at once? We'd have to trace the context resets/deletes too.

(b) This doesn't say if/when the allocated chunks get freed - even with
a fix, we'll still do exactly the same number of allocations, except
that we'll free the memory shortly after that. I wonder if we could
print a bit more info for each probe, to match the alloc/free requests.
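
As for why the probe can't see the target context: palloc() only gets
the size, and picks up the context from the global CurrentMemoryContext.
Roughly, as a simplified sketch of the mcxt.c code (sanity checks and
error handling elided, backend headers assumed):

void *
palloc(Size size)
{
    /* the target context is a global, not an argument */
    MemoryContext context = CurrentMemoryContext;

    /* ... size / context validity checks elided ... */

    context->isReset = false;

    return context->methods->alloc(context, size);
}

So a probe that wants to attribute an allocation to a context has to
read that global (or also trace the context create/reset/delete calls).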

> Additionally, what was even more frustrating when diagnosing that case
> on the customer's system was that such OOMs were crashing other
> PostgreSQL clusters on the same OS. Even knowing the exact guilty
> statement, it was impossible to limit the RSS memory usage of that
> particular backend. So, what you are proposing makes a lot of sense.
> It also got me thinking about implementing a safety-net GUC,
> debug_query_limit_backend_memory=X MB, that would inject
> setrlimit(RLIMIT_DATA) through an external extension via hook(s) and
> unset it later, but the man page states it covers mmap() only since
> Linux 4.7+, so it is future-proof but won't work e.g. on RHEL7 - maybe
> that's still good enough? Or maybe try to hack palloc() a little, but
> that probably has too much overhead, right? (just thinking out loud)
>

Not sure about setting a hard limit - that seems pretty tricky and may
easily backfire. It's already possible to set such a memory limit using
e.g. cgroups, or even better to use VMs to isolate the instances.
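
(For completeness, the setrlimit() call itself would be simple enough -
something like the hypothetical helper below, which touches only the
soft limit so it can be relaxed again later. It's the policy around it
that seems tricky.)

#include <sys/resource.h>

/* hypothetical helper - cap the backend's data segment; note that
 * RLIMIT_DATA only constrains mmap() allocations on Linux >= 4.7 */
static bool
set_backend_memory_limit(rlim_t limit_bytes)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_DATA, &rl) != 0)
        return false;
    rl.rlim_cur = limit_bytes;      /* soft limit only, keep rlim_max */
    return setrlimit(RLIMIT_DATA, &rl) == 0;
}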

I certainly agree it's annoying that when OOM hits, we end up with no
information about what used the memory. Maybe we could have a threshold
triggering a call to MemoryContextStats? So that we have at least some
memory usage info in the log in case the OOM killer intervenes.
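
A minimal sketch of that idea (the threshold, the names and where to
call it are made up - it would have to run somewhere cheap, and it
needs the usual backend headers such as utils/memutils.h):

/* hypothetical once-per-backend threshold check */
static Size log_memory_threshold = (Size) 2 * 1024 * 1024 * 1024;  /* 2GB */
static bool memory_stats_logged = false;

static void
MaybeLogMemoryContextStats(void)
{
    Size    allocated = MemoryContextMemAllocated(TopMemoryContext, true);

    if (!memory_stats_logged && allocated > log_memory_threshold)
    {
        elog(LOG, "memory allocated (%zu bytes) exceeded threshold",
             allocated);
        MemoryContextStats(TopMemoryContext);   /* dumps stats to the log */
        memory_stats_logged = true;             /* report only once */
    }
}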

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
