Re: BUG #15923: Prepared statements take way too much memory.

From: Andres Freund <andres(at)anarazel(dot)de>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Daniel Migowski <dmigowski(at)ikoffice(dot)de>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: BUG #15923: Prepared statements take way too much memory.
Date: 2019-07-26 02:16:36
Message-ID: 20190726021636.us353amvlci4vlpn@alap3.anarazel.de
Lists: pgsql-bugs

Hi,

On 2019-07-26 10:55:35 +1200, Thomas Munro wrote:
> IIUC the only advantage of a
> special compacting copyObject() would be that it could presumably get
> precisely the right sized single piece of memory, but is that worth
> the complexity of a two-phase design (IIUC you'd want recursive
> tell-me-how-much-memory-you'll-need, then allocate all at once)? A
> chain of smaller chunks should be nearly as good (just a small amount
> of wasted space in the final chunk).

I was wondering if we somehow could have a 'header' for Node trees,
which'd keep track of their size when built incrementally. But it seems
like the amount of changes that'd require would be far too
extensive. We, I think, would have to pass down additional pointers in
too many places, not to even talk about the difficulty of tracking the
size precisely during incremental changes.

Although *if* we had it, we could probably avoid a lot of unnecessary
tree copying, by having copy-on-write + reference counting logic.
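
As a rough illustration of the copy-on-write + reference counting idea
(a sketch only: TreeBuf and the treebuf_* functions are hypothetical
names, nothing like them exists in the tree today, and the node tree is
treated as an opaque blob whose size the header already knows):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical header in front of a flattened node tree. */
typedef struct TreeBuf
{
    int           refcount;   /* number of holders sharing this buffer */
    size_t        size;       /* size of the tree data in bytes */
    unsigned char data[];     /* the tree itself, opaque here */
} TreeBuf;

/* Create a new buffer holding a private copy of src. */
static TreeBuf *
treebuf_create(const void *src, size_t size)
{
    TreeBuf *t = malloc(sizeof(TreeBuf) + size);

    assert(t != NULL);
    t->refcount = 1;
    t->size = size;
    memcpy(t->data, src, size);
    return t;
}

/* "Copy" a tree by sharing it: no allocation, just a refcount bump. */
static TreeBuf *
treebuf_share(TreeBuf *t)
{
    t->refcount++;
    return t;
}

/* Drop one reference, freeing the buffer when the last one goes away. */
static void
treebuf_release(TreeBuf *t)
{
    assert(t->refcount > 0);
    if (--t->refcount == 0)
        free(t);
}

/*
 * Return a buffer the caller may modify. Only pays for a real copy
 * when somebody else still holds a reference to the same buffer.
 */
static TreeBuf *
treebuf_make_writable(TreeBuf *t)
{
    TreeBuf *copy;

    if (t->refcount == 1)
        return t;
    copy = treebuf_create(t->data, t->size);
    t->refcount--;
    return copy;
}
```

The point is just that a reader that never modifies the tree only ever
pays for treebuf_share(), and the full copy is deferred until somebody
actually calls treebuf_make_writable() on a shared buffer.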

This kind of thing would be a lot easier to solve with C++ than C (or
well, just any modern-ish language)... We could have "smart" references
to points in the query tree that keep track both of where they are, and
what the head of the whole tree is. And for the immutable flattened tree
they could just use offsets instead of pointers, and resolve those
offsets within the accessor.
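
Even in plain C the offset part can be sketched roughly like this.
FlatHeader, FlatNode and the flat_* functions are made-up names for
illustration, not existing code, and the toy node has just a value and
two children:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical header at offset 0 of one contiguous buffer. */
typedef struct FlatHeader
{
    uint32_t capacity;   /* bytes allocated for the whole buffer */
    uint32_t used;       /* bytes in use, header included */
    uint32_t root_off;   /* offset of the root node, 0 if empty */
} FlatHeader;

/* Children are byte offsets from the buffer start; 0 means "none". */
typedef struct FlatNode
{
    int32_t  value;
    uint32_t left_off;
    uint32_t right_off;
} FlatNode;

static void *
flat_create(uint32_t capacity)
{
    FlatHeader *h;

    assert(capacity >= sizeof(FlatHeader));
    h = malloc(capacity);
    assert(h != NULL);
    h->capacity = capacity;
    h->used = sizeof(FlatHeader);
    h->root_off = 0;
    return h;
}

/* Append a node (children first); the last node appended is the root. */
static uint32_t
flat_append(void *base, int32_t value, uint32_t left, uint32_t right)
{
    FlatHeader *h = base;
    uint32_t    off = h->used;
    FlatNode   *n = (FlatNode *) ((char *) base + off);

    assert(off + sizeof(FlatNode) <= h->capacity);
    n->value = value;
    n->left_off = left;
    n->right_off = right;
    h->used += sizeof(FlatNode);
    h->root_off = off;
    return off;
}

/* The accessor: turn an offset back into a pointer, relative to base. */
static const FlatNode *
flat_node(const void *base, uint32_t off)
{
    const FlatHeader *h = base;

    if (off == 0)
        return NULL;
    assert(off + sizeof(FlatNode) <= h->used);
    return (const FlatNode *) ((const char *) base + off);
}

/* Example traversal that only ever goes through the accessor. */
static int
flat_sum(const void *base, uint32_t off)
{
    const FlatNode *n = flat_node(base, off);

    if (n == NULL)
        return 0;
    return n->value + flat_sum(base, n->left_off)
                    + flat_sum(base, n->right_off);
}
```

Because nothing in the buffer is an absolute pointer, copying such a
tree is a single memcpy() of "used" bytes, and the copy is valid at
whatever address it lands.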

> (Auto-generated read/write/copy/eq functions would be a good idea
> anyway.)

Indeed.

> >> References to data like table and datatype definitions that are copied
> >> into the plan but are also copied multiple times. Doesn't matter for
> >> ...
> > TBH that sounds like a dead end to me. Detecting duplicate subtrees would
> > be astonishingly expensive, and I doubt that there will be enough of them
> > to really repay the effort in typical queries.
>
> Yeah, deduplicating common subplans may be too tricky and expensive.

It'd sure be interesting to write a function that computes stats about
which types of nodes consume how much memory. Shouldn't be too hard.
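
Something along these lines, as a toy sketch. ToyNode, ToyTag and
toy_collect_stats() are hypothetical stand-ins, not the real Node
infrastructure, and each node carries its own size here, whereas a real
version would have to derive it from the node type:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node tags; TOY_NTAGS is just the array bound. */
typedef enum ToyTag
{
    TOY_VAR,
    TOY_CONST,
    TOY_OPEXPR,
    TOY_NTAGS
} ToyTag;

/* Toy binary node that records how many bytes it occupies. */
typedef struct ToyNode
{
    ToyTag          tag;
    size_t          size;
    struct ToyNode *left;
    struct ToyNode *right;
} ToyNode;

/* Per-tag totals accumulated by the walker. */
typedef struct ToyStats
{
    size_t count[TOY_NTAGS];
    size_t bytes[TOY_NTAGS];
} ToyStats;

/*
 * Walk the tree and add each node's size to the bucket for its tag.
 * A subtree reachable through two parents is counted twice, which is
 * what a copy of the tree would cost.
 */
static void
toy_collect_stats(const ToyNode *node, ToyStats *stats)
{
    if (node == NULL)
        return;
    assert((int) node->tag >= 0 && (int) node->tag < TOY_NTAGS);
    stats->count[node->tag]++;
    stats->bytes[node->tag] += node->size;
    toy_collect_stats(node->left, stats);
    toy_collect_stats(node->right, stats);
}
```

Sorting the resulting buckets by bytes would show directly which node
types dominate for a given query.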

I'm mildly suspecting that a surprisingly large overhead in complex
query trees comes from target lists that are the same in many parts of
the tree.

Greetings,

Andres Freund
