Re: Copy data to DSA area

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Cc: "Ideriha, Takeshi" <ideriha(dot)takeshi(at)jp(dot)fujitsu(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Copy data to DSA area
Date: 2018-11-12 20:45:28
Message-ID: CA+TgmoY=ybAB8i_b315cTXyD2FT0_oHWAGvLQNL289pph==d6A@mail.gmail.com
Lists: pgsql-hackers

On Thu, Nov 8, 2018 at 9:05 PM Thomas Munro
<thomas(dot)munro(at)enterprisedb(dot)com> wrote:
> * I had some ideas about some kind of "allocation rollback" interface:
> you begin an "allocation transaction", allocate a bunch of stuff
> (perhaps indirectly, by calling some API that makes query plans or
> whatever and is totally unaware of this stuff). Then if there is an
> error, whatever was allocated so far is freed in the usual cleanup
> paths by a rollback that happens via the resource manager machinery.
> If you commit, then the allocation becomes permanent. Then you only
> commit stuff that you promise not to leak (perhaps stuff that has been
> added to a very carefully managed cluster-wide plan cache). I am not
> sure of the details, and this might be crazy...

Hmm, my first thought was that you were just reinventing memory
contexts, but it's really not quite the same thing, because you want
the allocations to end up owned by a long-lived context when you
succeed but a transient context when you fail. Still, if it weren't
for the fact that the memory context interface is hostile to dynamic
shared memory's map-this-anywhere semantics, I suspect we'd try to
find a way to make memory contexts fit the bill, maybe by reparenting
contexts or even individual allocations. You could imagine using the
sorts of union algorithms that are described in
https://en.wikipedia.org/wiki/Disjoint-set_data_structure to get very
low asymptotic complexity here.
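
To make the union idea concrete, here is a minimal sketch of a
disjoint-set structure in which each allocation is a node and each set
carries an owner label at its root. Every name in it (OwnerSets,
sets_reparent, the OwnerKind values, the fixed MAX_NODES array) is
hypothetical and invented for illustration; none of it is existing
PostgreSQL code. Merging a transient set into a long-lived one takes
near-constant amortized time, after which every member reports the new
owner.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NODES 1024          /* arbitrary bound for this sketch */

/* Hypothetical owner labels; not PostgreSQL identifiers. */
typedef enum { OWNER_TRANSIENT, OWNER_LONG_LIVED } OwnerKind;

typedef struct
{
    int       parent[MAX_NODES];
    int       rank[MAX_NODES];
    OwnerKind owner[MAX_NODES]; /* meaningful only at set roots */
    int       count;
} OwnerSets;

static void
sets_init(OwnerSets *s)
{
    s->count = 0;
}

/* Create a new singleton set (one context or one allocation). */
static int
sets_make(OwnerSets *s, OwnerKind kind)
{
    int id;

    assert(s->count < MAX_NODES);
    id = s->count++;
    s->parent[id] = id;
    s->rank[id] = 0;
    s->owner[id] = kind;
    return id;
}

/* Find the root of x's set, halving the path as we go. */
static int
sets_find(OwnerSets *s, int x)
{
    while (s->parent[x] != x)
    {
        s->parent[x] = s->parent[s->parent[x]];
        x = s->parent[x];
    }
    return x;
}

/* Merge the set containing 'from' into the set containing 'into'.
 * The merged set keeps the owner label of 'into'. */
static void
sets_reparent(OwnerSets *s, int from, int into)
{
    int       a = sets_find(s, from);
    int       b = sets_find(s, into);
    OwnerKind kind;

    if (a == b)
        return;
    kind = s->owner[b];
    if (s->rank[a] > s->rank[b])
    {
        /* 'a' becomes the root, so move the surviving label onto it */
        s->parent[b] = a;
        s->owner[a] = kind;
    }
    else
    {
        s->parent[a] = b;
        if (s->rank[a] == s->rank[b])
            s->rank[b]++;
    }
}

static OwnerKind
sets_owner(OwnerSets *s, int x)
{
    return s->owner[sets_find(s, x)];
}
```

One limitation of this sketch: a plain disjoint-set structure supports
merging sets but not splitting them, so it covers "commit everything to
the long-lived owner" but not moving a single allocation back out.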

I wonder if it's possible to rethink our current memory context
machinery so that it is not so DSM-hostile. At one point, I had the
idea of replacing the pointer in the chunk header with an
array-offset, which might also avoid repeatedly creating and
destroying AllocSetContext objects at high speed. Or
you could come up with some intermediate idea: if the value there is
MAXALIGN'd, it's a pointer; if not, it's some other kind of identifier
that you have to go look up in a table someplace to find the real
context.
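
The intermediate idea could look roughly like the sketch below. It is
only an illustration under assumptions: SKETCH_ALIGNOF stands in for
whatever MAXALIGN works out to on the platform, and SketchContext,
context_table, encode_index and decode_owner are hypothetical names,
not existing PostgreSQL symbols. An aligned header word is treated as a
plain pointer; anything else is treated as an identifier that each
process resolves through its own table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_ALIGNOF 8        /* assumed maximum alignment */
#define MAX_TABLE 64            /* arbitrary table size for this sketch */

/* Stand-in for a memory context; forced to SKETCH_ALIGNOF alignment
 * so that a pointer to it always has its low bits clear. */
typedef struct SketchContext
{
    _Alignas(SKETCH_ALIGNOF) const char *name;
} SketchContext;

/* Per-process lookup table: identifier -> local address. */
static SketchContext *context_table[MAX_TABLE];

/* Store an ordinary, process-local pointer in the header word. */
static uintptr_t
encode_pointer(SketchContext *cxt)
{
    uintptr_t word = (uintptr_t) cxt;

    assert(word % SKETCH_ALIGNOF == 0);
    return word;
}

/* Store a table index instead.  Setting the low bit guarantees the
 * word is never aligned, so it cannot be mistaken for a pointer. */
static uintptr_t
encode_index(size_t idx)
{
    assert(idx < MAX_TABLE);
    return ((uintptr_t) idx << 1) | 1;
}

/* Recover the owning context from a header word of either kind. */
static SketchContext *
decode_owner(uintptr_t word)
{
    if (word % SKETCH_ALIGNOF == 0)
        return (SketchContext *) word;  /* plain pointer */
    return context_table[word >> 1];    /* identifier: look it up */
}
```

The point of the identifier form is that the same header word stays
valid in every process that maps the segment, even if each one maps it
at a different address; only the table contents differ per process.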

Part of the problem here is that, on the one hand, it's really useful
that all memory management in PostgreSQL currently uses a single
system: memory contexts. I'd be loath to change that. On the other
hand, there are several different allocation patterns which have
noticeably different optimal strategies:

1. allocate for a while and then free everything at once => allocate
from a single slab
2. allocate and free for a while and then free anything that's left =>
allocate from multiple slabs, dividing allocations by size class
3. perform a short series of allocations, freeing everything if we hit
an error midway through => keep an allocation log, roll back by retail
frees
4. like (3) but with a long series of allocations => allocate from a
slab, free the whole slab on error
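
As an illustration of pattern (3), here is a minimal sketch of an
allocation log with rollback by retail frees. It uses plain malloc()
and free() rather than DSA or memory contexts so that it stands alone,
and AllocLog, log_alloc, log_rollback and log_commit are hypothetical
names made up for this example. A fixed-size log is only reasonable
because the series of allocations is assumed to be short; for a long
series, pattern (4) would drop the log and free a whole slab instead.

```c
#include <assert.h>
#include <stdlib.h>

#define LOG_CAPACITY 32         /* "short series": small fixed log */

typedef struct
{
    void *ptrs[LOG_CAPACITY];   /* allocations made since log_begin */
    int   n;
} AllocLog;

static void
log_begin(AllocLog *log)
{
    log->n = 0;
}

/* Allocate and remember the pointer so it can be rolled back.
 * Returns NULL if the log is full or malloc fails. */
static void *
log_alloc(AllocLog *log, size_t size)
{
    void *p;

    assert(size > 0);
    if (log->n >= LOG_CAPACITY)
        return NULL;
    p = malloc(size);
    if (p != NULL)
        log->ptrs[log->n++] = p;
    return p;
}

/* Error path: free everything logged, newest first.
 * Returns the number of allocations freed. */
static int
log_rollback(AllocLog *log)
{
    int freed = 0;

    while (log->n > 0)
    {
        free(log->ptrs[--log->n]);
        freed++;
    }
    return freed;
}

/* Success path: forget the log; the caller now owns the memory. */
static void
log_commit(AllocLog *log)
{
    log->n = 0;
}
```

After log_commit() the allocations are permanent from the log's point
of view, so whoever commits takes on the promise not to leak them.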

Our current AllocSetContext is not optimal for any of these
situations. It uses size classes, but they are rather coarse-grained,
which wastes a lot of memory. Its chunks also have big headers, which
wastes more memory -- unnecessarily, in the case where we don't
need to free anything until the end. The main advantage of
AllocSetAlloc is that it's very fast for small contexts while still
being able to support freeing individual allocations when
necessary, though not efficiently.
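
To put rough numbers on the waste, here is a small sketch. It assumes
power-of-two size classes with an 8-byte minimum and takes the
per-chunk header size as a parameter; both are assumptions made for
illustration, and the real constants depend on the PostgreSQL version
and platform. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MIN_CHUNK 8             /* assumed smallest size class */

/* Round a request up to the next power-of-two size class. */
static size_t
round_up_pow2(size_t request)
{
    size_t size = MIN_CHUNK;

    assert(request <= ((size_t) 1 << 30));  /* keep the shift safe */
    while (size < request)
        size <<= 1;
    return size;
}

/* Bytes consumed beyond what the caller asked for: rounding slack
 * plus the per-chunk header. */
static size_t
wasted_bytes(size_t request, size_t header)
{
    return round_up_pow2(request) - request + header;
}
```

Under these assumptions a 33-byte request lands in the 64-byte class,
so with a hypothetical 16-byte header it costs 47 bytes of overhead,
more than the request itself.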

I'm rambling here, I guess...

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
