Re: Memory consumed by child SpecialJoinInfo in partitionwise join planning

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Andrei Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
Cc: vignesh C <vignesh21(at)gmail(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, Richard Guo <guofenglinux(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Memory consumed by child SpecialJoinInfo in partitionwise join planning
Date: 2024-02-18 17:25:10
Message-ID: 73c30ca6-1bd9-4a0b-9776-b7d440c0d7cc@enterprisedb.com
Lists: pgsql-hackers

Hi,

I took a quick look at this patch today. I certainly agree with the
intent to reduce the amount of memory during planning, assuming it's not
overly disruptive. And I think this patch is fairly localized and looks
sensible.

That being said, I'm not a big fan of using a local variable on the stack and
filling it. I'd probably go with the usual palloc/pfree, because that
makes it much easier to use - the callers would not be responsible for
allocating the SpecialJoinInfo struct. Sure, it's a little bit of
overhead, but with the AllocSet caching I doubt it's measurable.
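
To make the two calling conventions concrete, here is a minimal standalone
sketch, not the patch's code. DemoJoinInfo and all function names are
hypothetical stand-ins (the real SpecialJoinInfo has many more fields), and
malloc/free are used in place of palloc/pfree so that it compiles outside
the PostgreSQL tree.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for SpecialJoinInfo; not the real struct. */
typedef struct DemoJoinInfo
{
    int min_lefthand;
    int min_righthand;
} DemoJoinInfo;

/*
 * Style 1: the caller owns the storage (for example a local variable on
 * the stack) and this function only fills it in.
 */
void
fill_child_info(DemoJoinInfo *child, const DemoJoinInfo *parent, int offset)
{
    assert(child != NULL && parent != NULL);
    child->min_lefthand = parent->min_lefthand + offset;
    child->min_righthand = parent->min_righthand + offset;
}

/*
 * Style 2: the callee allocates and returns the struct, so callers do not
 * need to know how to allocate it. malloc stands in for palloc here.
 */
DemoJoinInfo *
build_child_info(const DemoJoinInfo *parent, int offset)
{
    DemoJoinInfo *child = malloc(sizeof(DemoJoinInfo));

    if (child == NULL)
        return NULL;
    fill_child_info(child, parent, offset);
    return child;
}

/* Counterpart of build_child_info; free stands in for pfree. */
void
free_child_info(DemoJoinInfo *child)
{
    free(child);
}
```

With style 2 a caller only pairs build_child_info() with free_child_info(),
at the cost of one allocation per child.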

I did put this through check-world on amd64/arm64, with valgrind,
without any issue. I also tried the scripts shared by Ashutosh in his
initial message (with some minor fixes, adding MEMORY to EXPLAIN, etc.).

The results with the 20240130 patches are like this:

 tables    master (MB)    patched (MB)
---------------------------------------
      2           40.8            39.9
      3          151.7           142.6
      4          464.0           418.5
      5         1663.9          1419.5

That's certainly a nice improvement, and it even reduces the amount of
time for planning (the 5-table join goes from 18s to 17s on my laptop).
That's nice, although 17 seconds for planning is not ... great.

That being said, the amount of remaining memory needed by planning is
still pretty high - we save ~240MB for a join of 5 tables, but we still
need ~1.4GB. Yes, this is a bit of an extreme example, and it probably is not
very common to join 5 tables with 1000 partitions each ...
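
The savings quoted above follow directly from the measurements. As a small
sketch of the arithmetic (the function names are made up for illustration,
and the inputs are assumed to be in MB, as the ~240MB and ~1.4GB figures
imply):

```c
#include <assert.h>

/* Absolute planning memory saved, in the same unit as the inputs. */
double
planning_mem_saved(double master, double patched)
{
    return master - patched;
}

/* Saving relative to master, as a percentage. */
double
planning_mem_saved_pct(double master, double patched)
{
    assert(master > 0.0);
    return 100.0 * (master - patched) / master;
}
```

For the 5-table case, planning_mem_saved(1663.9, 1419.5) gives 244.4 (the
"~240MB" above), which is a bit under 15% of the master figure.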

Do we know what other places are consuming the 1.4GB of memory?
Considering my recent thread about scalability, where malloc() turned
out to be one of the main culprits, I wonder if maybe there's a lot to
gain by reducing the memory usage ... Our attitude to memory usage is
that it doesn't really matter if we keep it allocated for a bit, because
we'll free it shortly. And that may be true for "modest" memory usage,
but with 1.4GB that doesn't seem great, and the malloc overhead can be
pretty bad.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
