Re: Memory leak from ExecutorState context?

From: Jehan-Guillaume de Rorthais <jgdr(at)dalibo(dot)com>
To: Melanie Plageman <melanieplageman(at)gmail(dot)com>
Cc: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Konstantin Knizhnik <knizhnik(at)garret(dot)ru>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Subject: Re: Memory leak from ExecutorState context?
Date: 2023-05-17 22:35:29
Message-ID: 20230518003529.7b439d00@karst
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, 17 May 2023 13:46:35 -0400
Melanie Plageman <melanieplageman(at)gmail(dot)com> wrote:

> On Wed, May 17, 2023 at 07:10:08PM +0200, Jehan-Guillaume de Rorthais wrote:
> > On Tue, 16 May 2023 16:00:52 -0400
> > Melanie Plageman <melanieplageman(at)gmail(dot)com> wrote:
> > > ...
> > > There are some existing indentation issues in these files, but you can
> > > leave those or put them in a separate commit.
> >
> > Reindented in v9.
> >
> > I put existing indentation issues in a separate commit to keep the actual
> > patches clean from distractions.
>
> It is a matter of opinion, but I tend to prefer the commit with the fix
> for the existing indentation issues to be first in the patch set.

OK. moved in v10 patch set.

> ...
> > > > void
> > > > ExecHashJoinSaveTuple(MinimalTuple tuple, uint32 hashvalue,
> > > > - BufFile **fileptr)
> > > > + BufFile **fileptr,
> > > > MemoryContext cxt) {
> >
> > Note that I changed this to:
> >
> > ExecHashJoinSaveTuple(MinimalTuple tuple, uint32 hashvalue,
> > BufFile **fileptr, HashJoinTable hashtable) {
> >
> > As this function must allocate BufFile buffer in spillCxt, I suppose
> > we should force it explicitly in the function code.
> >
> > Plus, having hashtable locally could be useful for later patch to eg. fine
> > track allocated memory in spaceUsed.
>
> I think if you want to pass the hashtable instead of the memory context,
> I think you'll need to explain why you still pass the buffile pointer
> (because ExecHashJoinSaveTuple() is called for inner and outer batch
> files) instead of getting it from the hashtable's arrays of buffile
> pointers.

Comment added in v10

> > @@ -1310,22 +1311,38 @@ ExecParallelHashJoinNewBatch(HashJoinState *hjstate)
> ...
>
> Suggested small edit to the second paragraph:
>
> During the build phase, buffered files are created for inner batches.
> Each batch's buffered file is closed (and its buffer freed) after the
> batch is loaded into memory during the outer side scan. Therefore, it
> is necessary to allocate the batch file buffer in a memory context
> which outlives the batch itself.

Changed.

> I'd also mention the reason for passing the buffile pointer above the
> function.

Added.

Regards,

Attachment Content-Type Size
v10-0001-Run-pgindent-on-nodeHash.c-and-nodeHashjoin.c.patch text/x-patch 2.8 KB
v10-0002-Describe-hash-join-implementation.patch text/x-patch 3.0 KB
v10-0003-Dedicated-memory-context-for-hash-join-spill-buf.patch text/x-patch 13.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2023-05-17 22:48:26 Re: Introduce WAIT_EVENT_EXTENSION and WAIT_EVENT_BUFFER_PIN
Previous Message Kirk Wolak 2023-05-17 22:18:05 Re: Should CSV parsing be stricter about mid-field quotes?