Re: Memory leak from ExecutorState context?

From: Melanie Plageman <melanieplageman(at)gmail(dot)com>
To: Konstantin Knizhnik <knizhnik(at)garret(dot)ru>
Cc: Jehan-Guillaume de Rorthais <jgdr(at)dalibo(dot)com>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Memory leak from ExecutorState context?
Date: 2023-04-20 22:51:19
Message-ID: CAAKRu_ZqOGePK6AqMisyQmUyc8vC1kz5NQHe6B4jKg0NpB_Wvg@mail.gmail.com
Lists: pgsql-hackers

On Thu, Apr 20, 2023 at 12:42 PM Konstantin Knizhnik <knizhnik(at)garret(dot)ru> wrote:
> On 11.04.2023 8:14 PM, Jehan-Guillaume de Rorthais wrote:
> > On Sat, 8 Apr 2023 02:01:19 +0200
> > Jehan-Guillaume de Rorthais <jgdr(at)dalibo(dot)com> wrote:
> >
> >> On Fri, 31 Mar 2023 14:06:11 +0200
> >> Jehan-Guillaume de Rorthais <jgdr(at)dalibo(dot)com> wrote:
> >>
> >> [...]
> >>
> >> After rebasing Tomas' memory balancing patch, I took some memory measurements
> >> to answer some of my questions. Please find attached the resulting charts
> >> "HJ-HEAD.png" and "balancing-v3.png", which compare memory consumption
> >> between HEAD and Tomas' patch. They show the alternation of numbers
> >> before/after calling ExecHashIncreaseNumBatches (see the debug patch). I
> >> didn't try to find the exact last total peak of memory consumption during the
> >> join phase and before all the BufFiles are destroyed, so the last number
> >> might be underestimated.
> > I did some more analysis of the total memory consumption in the filecxt of
> > HEAD and the v3 and v4 patches. My previous debug numbers only printed memory
> > metrics during batch increments or hash table destruction. That means:
> >
> > * for HEAD: we miss the batches consumed during the outer scan
> > * for v3: adds twice nbatch in spaceUsed, which is a rough estimation
> > * for v4: batches are tracked in spaceUsed, so they are reflected in spacePeak
> >
> > Using a breakpoint in ExecHashJoinSaveTuple to print "filecxt->mem_allocated"
> > from there, here is the maximum memory allocated for the bufFile context in
> > each branch:
> >
> >           batches   max bufFiles    total   spaceAllowed rise
> >   HEAD     16384     199966960     ~194MB
> >   v3        4096      65419456      ~78MB
> >   v4(*3)    2048      34273280       48MB   nbatch*sizeof(PGAlignedBlock)*3
> >   v4(*4)    1024      17170160     60.6MB   nbatch*sizeof(PGAlignedBlock)*4
> >   v4(*5)    2048      34273280     42.5MB   nbatch*sizeof(PGAlignedBlock)*5
> >
> > It seems accounting for bufFiles in spaceUsed allows better memory balancing
> > and management. The precise factor by which to raise spaceAllowed is yet to be
> > defined. *3 or *4 looks good, but this is based on a single artificial test
> > case.
> >
> > Also, note that HEAD currently reports ~4MB of memory usage, which is far
> > from reality. So even if we don't commit the memory balancing patch in v16,
> > maybe we could account for the filecxt in spaceUsed as a bugfix?
> >
> > Regards,
>
> Thank you for the patch.
> I faced the same problem (OOM caused by a hash join) and tried to create the
> simplest test reproducing it:
>
> create table t(pk int, val int);
> insert into t values (generate_series(1,100000000),0);
> set work_mem='64kB';
> explain (analyze,buffers) select count(*) from t t1 join t t2 on
> (t1.pk=t2.pk);
>
>
> There are three workers and the size of each exceeds 1.3GB.
>
> The plan is the following:
>
>  Finalize Aggregate  (cost=355905977972.87..355905977972.88 rows=1 width=8) (actual time=212961.033..226097.513 rows=1 loops=1)
>    Buffers: shared hit=32644 read=852474 dirtied=437947 written=426374, temp read=944407 written=1130380
>    ->  Gather  (cost=355905977972.65..355905977972.86 rows=2 width=8) (actual time=212943.505..226097.497 rows=3 loops=1)
>          Workers Planned: 2
>          Workers Launched: 2
>          Buffers: shared hit=32644 read=852474 dirtied=437947 written=426374, temp read=944407 written=1130380
>          ->  Partial Aggregate  (cost=355905976972.65..355905976972.66 rows=1 width=8) (actual time=212938.410..212940.035 rows=1 loops=3)
>                Buffers: shared hit=32644 read=852474 dirtied=437947 written=426374, temp read=944407 written=1130380
>                ->  Parallel Hash Join  (cost=1542739.26..303822614472.65 rows=20833345000002 width=0) (actual time=163268.274..207829.524 rows=33333333 loops=3)
>                      Hash Cond: (t1.pk = t2.pk)
>                      Buffers: shared hit=32644 read=852474 dirtied=437947 written=426374, temp read=944407 written=1130380
>                      ->  Parallel Seq Scan on t t1  (cost=0.00..859144.78 rows=41666678 width=4) (actual time=0.045..30828.051 rows=33333333 loops=3)
>                            Buffers: shared hit=16389 read=426089 written=87
>                      ->  Parallel Hash  (cost=859144.78..859144.78 rows=41666678 width=4) (actual time=82202.445..82202.447 rows=33333333 loops=3)
>                            Buckets: 4096 (originally 4096)  Batches: 32768 (originally 8192)  Memory Usage: 192kB
>                            Buffers: shared hit=16095 read=426383 dirtied=437947 written=426287, temp read=267898 written=737164
>                            ->  Parallel Seq Scan on t t2  (cost=0.00..859144.78 rows=41666678 width=4) (actual time=0.054..12647.534 rows=33333333 loops=3)
>                                  Buffers: shared hit=16095 read=426383 dirtied=437947 written=426287
>  Planning:
>    Buffers: shared hit=69 read=38
>  Planning Time: 2.819 ms
>  Execution Time: 226113.292 ms
> (22 rows)
>
>
>
> -----------------------------
>
> So the number of batches was increased to 32k.
> I applied your patches 0001-0004, but unfortunately they have not reduced
> memory consumption - the size of each backend is still more than 1.3GB.

Is this EXPLAIN ANALYZE run on an instance with Jehan-Guillaume's
patchset applied or without?

I'm asking because the fourth patch in the series updates spaceUsed with
the size of the BufFile->buffer, but I notice in your EXPLAIN ANALYZE,
Memory Usage for the hashtable is reported as 192 kB, which, while
larger than the 64kB work_mem you set, isn't as large as I might expect.
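
For clarity, the kind of accounting I mean is roughly the sketch below. This
is only an illustration, not the actual patch: the function name is made up,
and the real patchset may charge the memory at a different point (e.g. where
ExecHashJoinSaveTuple first creates the batch file).

    #include "postgres.h"
    #include "executor/hashjoin.h"

    /*
     * Hypothetical helper, for illustration only: charge one batch file's
     * buffer against the hash table's space accounting.  Each open BufFile
     * keeps a BLCKSZ-sized buffer (a PGAlignedBlock).
     */
    static void
    ExecHashChargeBatchFileBuffer(HashJoinTable hashtable)
    {
        hashtable->spaceUsed += sizeof(PGAlignedBlock);
        if (hashtable->spaceUsed > hashtable->spacePeak)
            hashtable->spacePeak = hashtable->spaceUsed;
    }

Since sizeof(PGAlignedBlock) is BLCKSZ (8kB by default), counting the buffers
this way would make the reported Memory Usage grow with the number of batch
files, which is why 192kB looks smaller than I would expect if they were
being counted.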

- Melanie
