Re: Memory leak from ExecutorState context?

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Melanie Plageman <melanieplageman(at)gmail(dot)com>, Jehan-Guillaume de Rorthais <jgdr(at)dalibo(dot)com>
Cc: Justin Pryzby <pryzby(at)telsasoft(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Subject: Re: Memory leak from ExecutorState context?
Date: 2023-03-23 18:49:43
Message-ID: 1fc1f116-b235-f016-a1f8-f124a9840a5c@enterprisedb.com
Lists: pgsql-hackers

On 3/23/23 13:07, Melanie Plageman wrote:
> On Fri, Mar 10, 2023 at 1:51 PM Jehan-Guillaume de Rorthais
> <jgdr(at)dalibo(dot)com> wrote:
>>> So I guess the best thing would be to go through these threads, see what
>>> the status is, restart the discussion and propose what to do. If you do
>>> that, I'm happy to rebase the patches, and maybe see if I could improve
>>> them in some way.
>>
>> OK! It took me some time, but I did it. I'll try to sum up the situation as
>> simply as possible.
>
> Wow, so many memories!
>
> I'm excited that someone looked at this old work (though it is sad that
> a customer faced this issue). And, Jehan, I really appreciate your great
> summarization of all these threads. This will be a useful reference.
> >> 1. "move BufFile stuff into separate context"
>> last found patch: 2019-04-21
>> https://www.postgresql.org/message-id/20190421114618.z3mpgmimc3rmubi4%40development
>> https://www.postgresql.org/message-id/attachment/100822/0001-move-BufFile-stuff-into-separate-context.patch
>>
>> This patch helps with observability/debug by allocating the bufFiles in the
>> appropriate context instead of the "ExecutorState" one.
>>
>> I suppose this simple one has been forgotten in the fog of all the other
>> discussions. Also, this is probably worth backpatching.
>
> I agree with Jehan-Guillaume and Tomas that this seems fine to commit
> alone.
>

+1 to that. I think the separate memory contexts would be
non-controversial for backpatching too.
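
To illustrate why a dedicated context helps, here is a minimal Python
sketch (emphatically not PostgreSQL source; the context names just mirror
the patch's intent) of the kind of per-context accounting a
MemoryContextStats-style report gives you. With a separate child context,
the ~8kB-per-batch-file buffers show up on their own line instead of being
lumped into ExecutorState:

```python
# Hypothetical sketch, not PostgreSQL code: per-context memory
# accounting makes a batch-file explosion visible at a glance.

class MemoryContext:
    def __init__(self, name, parent=None):
        self.name = name
        self.allocated = 0
        self.children = []
        if parent:
            parent.children.append(self)

    def alloc(self, nbytes):
        self.allocated += nbytes

    def stats(self, depth=0):
        # Walk the context tree and report per-context totals,
        # roughly what MemoryContextStats prints to stderr.
        lines = [f"{'  ' * depth}{self.name}: {self.allocated} bytes"]
        for child in self.children:
            lines.extend(child.stats(depth + 1))
        return lines

executor = MemoryContext("ExecutorState")
hash_ctx = MemoryContext("HashTableContext", parent=executor)
# The dedicated child context for batch files (name is illustrative).
buffile_ctx = MemoryContext("HashBatchFiles", parent=hash_ctx)

BLCKSZ = 8192
nbatch = 16384                   # a pathological batch explosion
for _ in range(2 * nbatch):      # one inner + one outer file per batch
    buffile_ctx.alloc(BLCKSZ)    # each open BufFile keeps a BLCKSZ buffer

for line in buffile_ctx.stats():
    print(line)                  # HashBatchFiles: 268435456 bytes
```

At 16k batches the dedicated context reports 256MB on its own line, which
is exactly the signal a debugging session (or a future warning) needs.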

> Tomas:
>> Yeah, I think this is something we'd want to do. It doesn't change the
>> behavior, but it makes it easier to track the memory consumption etc.
>
>> 2. "account for size of BatchFile structure in hashJoin"
>> last found patch: 2019-04-22
>> https://www.postgresql.org/message-id/20190428141901.5dsbge2ka3rxmpk6%40development
>> https://www.postgresql.org/message-id/attachment/100951/v2-simple-rebalance.patch
>>
>> This patch seems like a good first step:
>>
>> * it definitely helps older versions where other patches discussed are way
>> too invasive to be backpatched
>> * it doesn't get in the way of the other discussed patches
>>
>> While looking at the discussions around this patch, I was wondering if the
>> planner considers the memory allocation of bufFiles. But of course, Melanie
>> already noticed that long before I was aware of this problem and discussion:
>>
>> 2019-07-10: «I do think that accounting for Buffile overhead when estimating
>> the size of the hashtable during ExecChooseHashTableSize() so it can be
>> used during planning is a worthwhile patch by itself (though I know it
>> is not even part of this patch).»
>> https://www.postgresql.org/message-id/CAAKRu_Yiam-%3D06L%2BR8FR%2BVaceb-ozQzzMqRiY2pDYku1VdZ%3DEw%40mail.gmail.com
>>
>> Tomas Vondra agreed with this in his answer, but no new version of the patch
>> was produced.
>>
>> Finally, Melanie was pushing the idea to commit this patch no matter other
>> pending patches/ideas:
>>
>> 2019-09-05: «If Tomas or someone else has time to pick up and modify BufFile
>> accounting patch, committing that still seems like the next logical
>> step.»
>> https://www.postgresql.org/message-id/CAAKRu_b6%2BjC93WP%2BpWxqK5KAZJC5Rmxm8uquKtEf-KQ%2B%2B1Li6Q%40mail.gmail.com
>
> I think I would have to see a modern version of a patch which does this
> to assess if it makes sense. But, I probably still agree with 2019
> Melanie :)
>
> Overall, I think anything that makes it faster to identify customer
> cases of this bug is good (which, I would think granular memory contexts
> and accounting would do). Even if it doesn't fix it, we can determine
> more easily how often customers are hitting this issue, which helps
> justify an admittedly large design change to hash join.
>

Agreed. This issue is quite rare (we only get a report once every year or
two), just enough to forget about it and then have to rediscover it's there.
So having something that clearly identifies it would be good.

A separate BufFile memory context helps, although people won't see it
unless they attach a debugger, I think. Better than nothing, but I was
wondering if we could maybe print some warnings when the number of batch
files gets too high ...
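
For a back-of-the-envelope feel for the trade-off the rebalance patch
reasons about, here is a rough Python sketch (BLCKSZ = 8kB and two temp
files per batch are assumptions, and the helper names are illustrative,
not from the patch): doubling nbatch roughly halves the in-memory hash
table but doubles the BufFile overhead, so past some point another
doubling makes total memory use worse, not better:

```python
# Rough accounting sketch (not the actual patch logic).
BLCKSZ = 8192  # assumed per-BufFile buffer size

def buffile_overhead(nbatch):
    # Each batch keeps an inner and an outer temp file, and every
    # open BufFile holds at least one BLCKSZ buffer.
    return 2 * nbatch * BLCKSZ

def should_double_nbatch(hashtable_bytes, nbatch):
    # Doubling nbatch roughly halves the in-memory hash table but
    # doubles the BufFile overhead. Only double while that trade
    # actually reduces total memory use.
    current = hashtable_bytes + buffile_overhead(nbatch)
    doubled = hashtable_bytes / 2 + buffile_overhead(2 * nbatch)
    return doubled < current

# Early on, doubling clearly pays off ...
print(should_double_nbatch(64 * 1024 * 1024, 32))      # True
# ... but once BufFile buffers dominate, it only adds memory.
print(should_double_nbatch(1024 * 1024, 65536))        # False
```

With 64k batches the BufFile buffers alone already consume ~1GB, far
beyond any plausible work_mem, which is the pathology reported upthread.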

> On Mon, Mar 20, 2023 at 10:12 AM Jehan-Guillaume de Rorthais
> <jgdr(at)dalibo(dot)com> wrote:
>> BNLJ and/or other considerations are for 17 or even later. In the meantime,
>> Melanie, who authored BNLJ, +1'd the balancing patch as it can coexist with
>> the other discussed solutions. No one has downvoted it since then. Melanie,
>> what is your opinion on this patch today? Did you change your mind as you
>> worked on BNLJ for many months since then?
>
> So, in order to avoid deadlock, my design of adaptive hash join/block
> nested loop hash join required a new parallelism concept not yet in
> Postgres at the time -- the idea of a lone worker remaining around to do
> work when others have left.
>
> See: BarrierArriveAndDetachExceptLast()
> introduced in 7888b09994
>
> Thomas Munro had suggested we needed to battle test this concept in a
> more straightforward feature first, so I implemented parallel full outer
> hash join and parallel right outer hash join with it.
>
> https://commitfest.postgresql.org/42/2903/
>
> This has been stalled ready-for-committer for two years. It happened to
> change timing such that it made an existing rarely hit parallel hash
> join bug more likely to be hit. Thomas recently committed our fix for
> this in 8d578b9b2e37a4d (last week). It is my great hope that parallel
> full outer hash join goes in before the 16 feature freeze.
>

Good to hear this is moving forward.
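
As a rough illustration of the "lone worker" idea quoted above, here is a
Python analogue (not the C barrier API; the real
BarrierArriveAndDetachExceptLast also waits for the other participants to
detach first): every worker but the last detaches and leaves, and exactly
one stays behind to finish the remaining serial work:

```python
# Simplified sketch of "all but the last worker detach" (hypothetical
# class name; the real primitive lives in PostgreSQL's barrier.c).
import threading

class DetachExceptLast:
    def __init__(self, nworkers):
        self._lock = threading.Lock()
        self._remaining = nworkers

    def arrive_and_detach_except_last(self):
        # Returns True for exactly one caller: the last to arrive,
        # which stays attached to do the leftover serial work.
        with self._lock:
            self._remaining -= 1
            return self._remaining == 0

results = []
barrier = DetachExceptLast(4)

def worker():
    if barrier.arrive_and_detach_except_last():
        # Only the lone remaining worker runs this branch.
        results.append("lone worker finishes the batch")

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(results)   # ['lone worker finishes the batch']
```

The point of the primitive is avoiding deadlock: workers that might
otherwise wait on each other simply leave, and one survivor completes the
work alone.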

> If it does, I think it could make sense to try and find committable
> smaller pieces of the adaptive hash join work. As it is today, parallel
> hash join does not respect work_mem, and, in some sense, is a bit broken.
>
> I would be happy to work on this feature again, or, if you were
> interested in picking it up, to provide review and any help I can if for
> you to work on it.
>

I'm no parallel hashjoin expert, but I'm willing to take a look at a
rebased patch. I'd however recommend breaking the patch into smaller
pieces - the last version I see in the thread is ~250kB, which is rather
daunting and difficult to review without interruption. (I don't remember
the details of the patch, so maybe that's not possible for some reason.)

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
