Re: AW: BUG #15923: Prepared statements take way too much memory.

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Daniel Migowski <dmigowski(at)ikoffice(dot)de>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: AW: BUG #15923: Prepared statements take way too much memory.
Date: 2019-07-25 22:55:35
Message-ID: CA+hUKGJjRVsDS6x1vcvjzjC3xsswzOWJSFOmWUgbHwSJeYoRdw@mail.gmail.com
Lists: pgsql-bugs

On Fri, Jul 26, 2019 at 10:13 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> FWIW, I've thought for some time that we should invent a memory context
> allocator that's meant for data that doesn't get realloc'd (much) after
> first allocation, with lower overhead than aset.c has. Such an allocator
> would be ideal for plancache.c, and perhaps other use-cases such as
> plpgsql function parsetrees. IMV this would have these properties:
>
> * Doesn't support retail pfree; to recover space you must destroy the
> whole context. We could just make pfree a no-op. With the details
> sketched below, repalloc would have to throw an error (because it would
> not know the size of the old chunk), but I think that's OK for the
> intended purpose.

Yeah, I wondered the same thing (in a discussion about shared plan
caches, which are pretty hypothetical at this stage but related to the
OP's complaint):

https://www.postgresql.org/message-id/CA%2BhUKGLOivzO5KuBFP27_jYWi%2B_8ki-Y%2BrdXXJZ4kmqF4kae%2BQ%40mail.gmail.com

> A totally different idea is to make a variant version of copyObject
> that is intended to produce a compact form of a node tree, and does
> not create a separate palloc allocation for each node but just packs
> them as tightly as it can in larger palloc chunks. This could outperform
> the no-pfree-context idea because it wouldn't need even context-pointer
> overhead for each node. (This relies on node trees being read-only to
> whatever code is looking at them, which should be OK for finished plan
> trees that are copied into the plancache; otherwise somebody might think
> they could apply repalloc, GetMemoryChunkContext, etc to the nodes,
> which'd crash.) The stumbling block here is that nobody is gonna
> tolerate maintaining two versions of copyfuncs.c, so you'd have to
> find a way for a single set of copy functions to support this output
> format as well as the traditional one. (Alternatively, maybe we
> could learn to autogenerate the copy functions from annotated struct
> definitions; people have muttered about that for years but not done
> anything.)

Well, if you had a rip cord type allocator that just advances a
pointer through a chain of chunks and can only free all at once, you'd
get about the same layout with the regular copy functions anyway,
without extra complexity. IIUC the only advantage of a
special compacting copyObject() would be that it could presumably get
precisely the right sized single piece of memory, but is that worth
the complexity of a two-phase design (IIUC you'd want recursive
tell-me-how-much-memory-you'll-need, then allocate all at once)? A
chain of smaller chunks should be nearly as good (just a small amount
of wasted space in the final chunk).
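For illustration, the chunk-chain scheme described above might look roughly like the following minimal sketch. This is not from any proposed patch; all names (BumpContext, bump_alloc, etc.) are invented here, and a real implementation would plug into PostgreSQL's MemoryContextMethods API rather than calling malloc directly. The key properties match the thread: allocation just advances a pointer, pfree is a no-op, and space is recovered only by destroying the whole context, with waste limited to the tail of each chunk.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* One chunk in the chain; allocations are packed into data[]. */
typedef struct BumpChunk
{
    struct BumpChunk *next;     /* older chunks, freed all at once */
    size_t      used;           /* bytes handed out from this chunk */
    size_t      size;           /* usable bytes in data[] */
    char        data[];         /* the arena itself (C99 flexible array) */
} BumpChunk;

typedef struct BumpContext
{
    BumpChunk  *head;           /* newest chunk; we allocate from here */
    size_t      chunk_size;     /* usable size of each standard chunk */
} BumpContext;

static BumpChunk *
bump_new_chunk(size_t size)
{
    BumpChunk  *c = malloc(sizeof(BumpChunk) + size);

    c->next = NULL;
    c->used = 0;
    c->size = size;
    return c;
}

static BumpContext *
bump_create(size_t chunk_size)
{
    BumpContext *ctx = malloc(sizeof(BumpContext));

    ctx->chunk_size = chunk_size;
    ctx->head = bump_new_chunk(chunk_size);
    return ctx;
}

static void *
bump_alloc(BumpContext *ctx, size_t size)
{
    void       *p;

    size = (size + 7) & ~(size_t) 7;    /* keep 8-byte alignment */
    if (ctx->head->used + size > ctx->head->size)
    {
        /*
         * No room: chain a new chunk (oversized if the request is big).
         * The old chunk keeps a little wasted tail space, which is the
         * cost of this scheme.  Note there is no per-allocation header.
         */
        BumpChunk  *c = bump_new_chunk(size > ctx->chunk_size ?
                                       size : ctx->chunk_size);

        c->next = ctx->head;
        ctx->head = c;
    }
    p = ctx->head->data + ctx->head->used;
    ctx->head->used += size;
    return p;
}

/* The pfree equivalent: deliberately a no-op, per the proposal. */
static void
bump_free(void *p)
{
    (void) p;
}

/* The only way to recover memory: destroy the whole context. */
static void
bump_destroy(BumpContext *ctx)
{
    BumpChunk  *c = ctx->head;

    while (c)
    {
        BumpChunk  *next = c->next;

        free(c);
        c = next;
    }
    free(ctx);
}
```

Since there is no per-chunk bookkeeping attached to each allocation, repalloc and GetMemoryChunkContext cannot work on these pointers, which is exactly the read-only-tree restriction discussed above.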

(Auto-generated read/write/copy/eq functions would be a good idea anyway.)

>> References to data like table and datatype definitions that are copied
>> into the plan but are also copied multiple times. Doesn't matter for
>> ...
> TBH that sounds like a dead end to me. Detecting duplicate subtrees would
> be astonishingly expensive, and I doubt that there will be enough of them
> to really repay the effort in typical queries.

Yeah, deduplicating common subplans may be too tricky and expensive.
I don't think it's that crazy to want to be able to share whole
plans between backends, though. Admittedly a lofty and far-off goal,
but I think we can get there eventually. Having the same set of N
explicitly prepared plans in M pooled backend sessions is fairly
common.

--
Thomas Munro
https://enterprisedb.com
