Re: PATCH : Generational memory allocator (was PATCH: two slab-like memory allocators)

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PATCH : Generational memory allocator (was PATCH: two slab-like memory allocators)
Date: 2017-09-15 01:07:25
Message-ID: 5ef0d99b-d154-1222-684d-6ca435901c85@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 09/14/2017 04:21 PM, Simon Riggs wrote:
> On 14 August 2017 at 01:35, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
>> Hi,
>>
>> Attached is a rebased version of the Generational context, originally
>> submitted with SlabContext (which was already committed into Pg 10).
>>
>> The main change is that I've abandoned the pattern of defining a Data
>> structure and then a pointer typedef, i.e.
>>
>> typedef struct GenerationContextData { ... } GenerationContextData;
>> typedef struct GenerationContextData *GenerationContext;
>>
>> Now it's just
>>
>> typedef struct GenerationContext { ... } GenerationContext;
>>
>> mostly because SlabContext was committed like that, and because Andres was
>> complaining about this code pattern ;-)
>>
>> Otherwise the design is the same as repeatedly discussed before.
>>
>> To show that this is still valuable change (even after SlabContext and
>> adding doubly-linked list to AllocSet), I've repeated the test done by
>> Andres in [1] using the test case described in [2], that is
>>
>> -- generate data
>> SELECT COUNT(*) FROM (SELECT test1()
>> FROM generate_series(1, 50000)) foo;
>>
>> -- benchmark (measure time and VmPeak)
>> SELECT COUNT(*) FROM (SELECT *
>> FROM pg_logical_slot_get_changes('test', NULL,
>> NULL, 'include-xids', '0')) foo;
>>
>> with different values passed to the first step (instead of the 50000). The
>> VmPeak numbers look like this:
>>
>> N master patched
>> --------------------------------------
>> 100000 1155220 kB 361604 kB
>> 200000 2020668 kB 434060 kB
>> 300000 2890236 kB 502452 kB
>> 400000 3751592 kB 570816 kB
>> 500000 4621124 kB 639168 kB
>>
>> and the timing (on assert-enabled build):
>>
>> N master patched
>> --------------------------------------
>> 100000 1103.182 ms 412.734 ms
>> 200000 2216.711 ms 820.438 ms
>> 300000 3320.095 ms 1223.576 ms
>> 400000 4584.919 ms 1621.261 ms
>> 500000 5590.444 ms 2113.820 ms
>>
>> So it seems it's still a significant improvement, both in terms of memory
>> usage and timing. Admittedly, this is a single test, so ideas of other
>> useful test cases are welcome.
>
> This all looks good.
>
> What I think this needs is changes to
> src/backend/utils/mmgr/README
> which decribe the various options that now exist (normal?, slab) and
> will exist (generational)
>
> Don't really care about the name, as long as its clear when to use it
> and when not to use it.
>
> This description of generational seems wrong: "When the allocated
> chunks have similar lifespan, this works very well and is extremely
> cheap."
> They don't need the same lifespan they just need to be freed in groups
> and in the order they were allocated.
>

Agreed, should be described differently. What matters is (mostly) FIFO
pattern of the palloc/pfree requests, which allows us to release the memory.

>
> For this patch specifically, we need additional comments in
> reorderbuffer.c to describe the memory allocation pattern in that
> module so that it is clear that the choice of allocator is useful
> and appropriate, possibly with details of how that testing was
> performed, so it can be re-tested later or tested on a variety of
> platforms.
>

Including details about the testing into reorderbuffer.c does not seem
very attractive to me. I don't recall any other place describing the
tests in detail. Why not to discuss the details here, and then include a
link to this thread in the commit message?

In any case, I doubt anyone will repeat the testing on a variety of
platforms (and I don't have any such comprehensive test suite anyway).
What will likely happen is someone bumping into a poorly performing
corner case, we will analyze it and fix it as usual.

>
> Particularly in reorderbuffer, surely we will almost immediately
> reuse chunks again, so is it worth issuing free() and then malloc()
> again soon after? Does that cause additional overhead we could also
> avoid? Could we possibly keep the last/few free'd chunks around
> rather than re-malloc?
>

I haven't seen anything like that in tests I've done. The malloc/free
overhead is negligible thanks as our allocators significantly reduce the
number of calls to those functions. If we happen to run into such case,
it shouldn't be difficult to keep a few empty blocks. But perhaps we can
leave that as a future optimization.

>
> Seems like we should commit this soon.
>

Thanks.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Stephen Frost 2017-09-15 01:21:25 Re: Clarification in pg10's pgupgrade.html step 10 (upgrading standby servers)
Previous Message Doug Doole 2017-09-15 01:00:27 Re: Add Roman numeral conversion to to_number