Re: Memory leak from ExecutorState context?

From: Jehan-Guillaume de Rorthais <jgdr(at)dalibo(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>
Cc: Justin Pryzby <pryzby(at)telsasoft(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Memory leak from ExecutorState context?
Date: 2023-03-27 21:13:23
Message-ID: 20230327231323.08277083@karst
Lists: pgsql-hackers

Hi,

On Mon, 20 Mar 2023 15:12:34 +0100
Jehan-Guillaume de Rorthais <jgdr(at)dalibo(dot)com> wrote:

> On Mon, 20 Mar 2023 09:32:17 +0100
> Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
>
> > >> * Patch 1 could be rebased/applied/backpatched
> > >
> > > Would it help if I rebase Patch 1 ("move BufFile stuff into separate
> > > context")?
> >
> > Yeah, I think this is something we'd want to do. It doesn't change the
> > behavior, but it makes it easier to track the memory consumption etc.
>
> Will do this week.

Please find attached a patch to allocate BufFiles in a dedicated memory context.
I picked up your patch, backpatched it, went through it and made some minor
changes. I have some comments/questions though.

1. I'm not sure why we must allocate the new "HashBatchFiles" context
under ExecutorState rather than under hashtable->hashCxt?

The only references I could find was in hashjoin.h:30:

/* [...]
* [...] (Exception: data associated with the temp files lives in the
* per-query context too, since we always call buffile.c in that context.)

And in nodeHashjoin.c:1243, in ExecHashJoinSaveTuple() (I reworded this
original comment in the patch):

/* [...]
* Note: it is important always to call this in the regular executor
* context, not in a shorter-lived context; else the temp file buffers
* will get messed up.

But these comments do not explain why BufFile-related allocations must live in
a per-query context.
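To make the question concrete, here is a rough sketch of the two parenting
options in ExecHashTableCreate(). AllocSetContextCreate() and
ALLOCSET_DEFAULT_SIZES are the real APIs, but the exact hunk may differ from
the attached patch:

```c
/* Current approach: parent the BufFile context under the per-query
 * context (ExecHashTableCreate runs in the per-query context, so
 * CurrentMemoryContext is ExecutorState here): */
hashtable->fileCxt = AllocSetContextCreate(CurrentMemoryContext,
                                           "HashBatchFiles",
                                           ALLOCSET_DEFAULT_SIZES);

/* Alternative: parent it under the hashtable's own context, so it is
 * destroyed automatically along with the hash table: */
hashtable->fileCxt = AllocSetContextCreate(hashtable->hashCxt,
                                           "HashBatchFiles",
                                           ALLOCSET_DEFAULT_SIZES);
```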

2. Wrapping each call of ExecHashJoinSaveTuple() in a memory context switch
seems fragile, as it could be forgotten in future code paths/changes. So I
added an Assert() in the function to make sure the current memory context is
"HashBatchFiles", as expected.
Another way to tie this down would be to pass the memory context as an
argument to the function.
... Or maybe I'm being over-cautious.
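For illustration, the two options might look like the following. This is a
hedged sketch, not the exact code from the attachment; in particular, option A
assumes the expected context can be identified from inside
ExecHashJoinSaveTuple() (here by name), and option B assumes a hypothetical
signature change:

```c
/* Option A: keep the callers' MemoryContextSwitchTo() calls, but assert
 * inside ExecHashJoinSaveTuple() that we were called in the right
 * context: */
Assert(strcmp(CurrentMemoryContext->name, "HashBatchFiles") == 0);

/* Option B: pass the context explicitly and switch inside the function,
 * so callers cannot forget it: */
void
ExecHashJoinSaveTuple(MinimalTuple tuple, uint32 hashvalue,
                      BufFile **fileptr, MemoryContext fileCxt)
{
    MemoryContext oldcxt = MemoryContextSwitchTo(fileCxt);

    /* ... existing BufFile create/write logic ... */

    MemoryContextSwitchTo(oldcxt);
}
```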

3. You wrote:

>> A separate BufFile memory context helps, although people won't see it
>> unless they attach a debugger, I think. Better than nothing, but I was
>> wondering if we could maybe print some warnings when the number of batch
>> files gets too high ...

So I added a WARNING for when the batches' BufFiles exhaust the allowed
memory:

+ if (hashtable->fileCxt->mem_allocated > hashtable->spaceAllowed)
+ elog(WARNING, "Growing number of hash batch is exhausting memory");

This warning is repeated on each call of ExecHashIncreaseNumBatches when the
BufFiles overflow the memory budget. I realize now I should probably include
the memory limit, the current number of batches and their memory consumption
in the message.
The message is probably too cryptic for a user. It could probably be
reworded, but some doc or an additional hint around this message might help.
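A reworked message along those lines might look like this. This is only a
sketch using the real ereport()/errmsg()/errdetail()/errhint() machinery; the
field names follow the snippet above, and the hint text is an assumption on my
part:

```c
/* Hypothetical richer warning: report actual vs. allowed memory and
 * the current batch count, plus a hint for the user. */
ereport(WARNING,
        (errmsg("hash batch files are using %zu bytes, exceeding the limit of %zu bytes",
                hashtable->fileCxt->mem_allocated,
                hashtable->spaceAllowed),
         errdetail("The hash join has grown to %d batches.",
                   hashtable->nbatch),
         errhint("Consider increasing work_mem.")));
```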

4. I left the debug messages in place for some more review rounds.

Regards,

Attachment Content-Type Size
0001-Allocate-hash-batches-related-BufFile-in-a-dedicated.patch text/x-patch 6.7 KB
