Re: Optimize planner memory consumption for huge arrays

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Lepikhov Andrei <a(dot)lepikhov(at)postgrespro(dot)ru>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Евгений Бредня <e(dot)brednya(at)postgrespro(dot)ru>
Subject: Re: Optimize planner memory consumption for huge arrays
Date: 2024-02-19 17:37:49
Message-ID: 4b9bf4c8-b2a4-4e69-b793-7253dffc8baa@enterprisedb.com
Lists: pgsql-hackers

On 2/19/24 16:45, Tom Lane wrote:
> Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com> writes:
>> Considering there are now multiple patches improving memory usage during
>> planning with partitions, perhaps it's time to take a step back and
>> think about how we manage (or rather not manage) memory during query
>> planning, and see if we could improve that instead of an infinite
>> sequence of ad hoc patches?
>
> +1, I've been getting an itchy feeling about that too. I don't have
> any concrete proposals ATM, but I quite like your idea here:
>
>> For example, I don't think we expect selectivity functions to allocate
>> long-lived objects, right? So maybe we could run them in a dedicated
>> memory context, and reset it aggressively (after each call).
>
> That could eliminate a whole lot of potential leaks. I'm not sure
> though how much it moves the needle in terms of overall planner
> memory consumption.

I'm not sure about that either - maybe not much. For example, it would
not help with the two other memory usage patches (which are related to
SpecialJoinInfo and RestrictInfo, outside selectivity functions).

It was an ad hoc thought, inspired by the issue at hand. Maybe it would
be possible to find similar "boundaries" in other parts of the planner.

I keep thinking about how compilers/optimizers typically have separate
optimization passes; maybe that's something we might leverage ...

> I've always supposed that the big problem was data structures
> associated with rejected Paths, but I might be wrong. Is there some
> simple way we could get a handle on where the most memory goes while
> planning?
>

I suspect this might have changed thanks to partitioning - it's not a
coincidence that most of the recent memory usage improvements address
cases with many partitions.

As for how to analyze the memory usage - maybe there's a simpler way,
but what I did recently was add simple instrumentation into the memory
contexts, recording pointer/size/backtrace for each request, plus a
script that aggregated that into a "currently allocated" report with
information about the "source" of each allocation.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
