Re: Memory leak from ExecutorState context?

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Jehan-Guillaume de Rorthais <jgdr(at)dalibo(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Memory leak from ExecutorState context?
Date: 2023-02-28 19:51:02
Message-ID: 45d453c8-b2d3-b477-36eb-32fdf4455f3c@enterprisedb.com
Lists: pgsql-hackers

On 2/28/23 19:06, Jehan-Guillaume de Rorthais wrote:
> Hello all,
>
> A customer is facing an out-of-memory error on a query, in a situation which looks
> similar to this one:
>
> https://www.postgresql.org/message-id/flat/12064.1555298699%40sss.pgh.pa.us#eb519865575bbc549007878a5fb7219b
>
> The PostgreSQL version is 11.18. Some settings:
>
> * shared_buffers: 8GB
> * work_mem: 64MB
> * effective_cache_size: 24GB
> * random/seq_page_cost are by default
> * physical memory: 32GB
>
> The query is really large and actually updates a kind of materialized view.
>
> The customer records the plans of this query on a regular basis. The explain
> analyze of this query before running out of memory was:
>
> https://explain.depesz.com/s/sGOH
>
> The customer is aware they should rewrite this query to optimize it, but that is
> a long process they cannot start immediately. To make the query run in the
> meantime, they moved the top CTE to a dedicated table. In their experience,
> it's not the first time they have had to split a query this way to make it
> work.
>
> I've been able to run this query on a standby myself. During one run I executed
> "call MemoryContextStats(TopMemoryContext)" every 10s, see the parsed data
> (best viewed with "less -S") and the associated graph in attachment. They
> show:
>
> * HashBatchContext goes up to 1441MB after 240s, then stays flat until the end
> (400s as the last record)

That's interesting. We're using HashBatchContext for very few things, so
what could be consuming so much memory? And e.g. the number of buckets
should be limited by work_mem, so how could it get to 1.4GB?

Can you break at ExecHashIncreaseNumBatches/ExecHashIncreaseNumBuckets
and print how many batches/buckets there are?
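
Untested, but something like this should do it (the counts live in
HashJoinTableData, see hashjoin.h):

# print how the batch/bucket counts grow (nbatch doubles on each call;
# the new bucket count is taken from nbuckets_optimal)
break ExecHashIncreaseNumBatches
commands
  print hashtable->nbatch
  cont
end
break ExecHashIncreaseNumBuckets
commands
  print hashtable->nbuckets_optimal
  cont
end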

> * ALL other contexts are stable before 240s, except ExecutorState
> * ExecutorState keeps rising up to 13GB with no interruption until the memory
> exhaustion
>
> I did another run with an interactive gdb session (see the messy session log in
> attachment, for what it's worth). Looking at some backtraces during the memory
> inflation close to the end of the query, all of them had these frames in
> common:
>
> [...]
> #6 0x0000000000621ffc in ExecHashJoinImpl (parallel=false, pstate=0x31a3378)
>    at nodeHashjoin.c:398
> [...]
>
> ...which is not really helpful, but at least it seems to come from a hash join
> node or some other hash-related code. See the gdb session log for more details.
> After the out of memory, pmap of this process shows:
>
> 430: postgres: postgres <dbname> [local] EXPLAIN
> Address Kbytes RSS Dirty Mode Mapping
> [...]
> 0000000002c5e000 13719620 8062376 8062376 rw--- [ anon ]
> [...]
>
> Is it usual for a backend to request such a large amount of memory (13GB) and
> actually use less than 60% of it (7.7GB of RSS)?
>

No idea. Interpreting this info is pretty tricky, in my experience. It
might mean the memory is no longer used but sbrk couldn't return it to
the OS yet, or something like that.
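
If the backend uses glibc malloc, one thing that might help to interpret
this is calling malloc_stats() from gdb: "system bytes" is what glibc got
from the kernel, "in use bytes" what is actually allocated, and a large
gap means freed-but-not-returned memory (the cast is needed because glibc
usually has no debug info for the return type):

call (void) malloc_stats()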

> Sadly, the database is 1.5TB large and I cannot test on a newer major version.
> I did not try to check how large a data set would be required to reproduce
> this, but the query moves tens of millions of rows from multiple tables anyway...
>
> Any idea? How could I help to get a better idea of whether a leak is actually
> occurring, and where exactly?
>

Investigating memory leaks is tough, especially for generic memory
contexts like ExecutorState :-( Even more so when you can't reproduce it
on a machine with custom builds.

What I'd try is this:

1) Attach breakpoints to all returns in AllocSetAlloc(), printing the
pointer and size for the ExecutorState context, so something like

# stop only for ExecutorState allocations; adjust aset.c:783 to the return
# statement(s) of AllocSetAlloc in your build, and note that
# MemoryChunkGetPointer() is a macro (needs a binary compiled with -g3)
break aset.c:783 if strcmp(context->name, "ExecutorState") == 0
commands
  print MemoryChunkGetPointer(chunk)
  print size
  cont
end

2) Do the same for AllocSetFree() (see the sketch after this list).

3) Match the palloc/pfree calls (using the pointer address) to determine
which ones are not freed, and do some stats on the sizes. Usually there
are only a couple of distinct sizes that account for most of the leaked
memory.

4) Break on AllocSetAlloc for those leaked sizes, to determine where the
calls come from (also sketched below).
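
For steps 2 and 4, minimal untested sketches, assuming the v11 signatures
AllocSetFree(MemoryContext context, void *pointer) and
AllocSetAlloc(MemoryContext context, Size size); the 1024-byte size is
just a placeholder for whatever sizes step 3 identifies:

# step 2: log frees from the ExecutorState context
break AllocSetFree if strcmp(context->name, "ExecutorState") == 0
commands
  print pointer
  cont
end

# route the breakpoint output to a file, which makes the matching in
# step 3 easier
set logging file alloc-trace.log
set logging on

# step 4: once a suspicious size is known, grab backtraces for it
break AllocSetAlloc if size == 1024 && strcmp(context->name, "ExecutorState") == 0
commands
  bt
  cont
end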

This usually gives enough info about the leak, or at least allows focusing
the investigation on a particular area of code.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
