Re: Add bump memory context type and use it for tuplesorts

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: John Naylor <johncnaylorls(at)gmail(dot)com>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Add bump memory context type and use it for tuplesorts
Date: 2024-03-15 02:21:02
Message-ID: CAApHDvpWkPp4ep=Lm+5es_i1yyBmmMEDguvAE0fdGM8Pfpmigg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, 12 Mar 2024 at 23:57, Tomas Vondra
<tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
> Attached is an updated version of the mempool patch, modifying all the
> memory contexts (not just AllocSet), including the bump context. And
> then also PDF with results from the two machines, comparing results
> without and with the mempool. There's very little impact on small reset
> values (128kB, 1MB), but pretty massive improvements on the 8MB test
> (where it's a 2x improvement).

I think it would be good to have something like this. I've done some
experiments before with something like this [1]. However, mine was
much more trivial.

One thing my version did was get rid of the *context* freelist stuff
in aset. I wondered if we'd need that anymore as, if I understand
correctly, it's just there to stop malloc/free thrashing, which is
what the patch aims to do anyway. Aside from that, it's now a little
weird that aset.c has that but generation.c and slab.c do not.

One thing I found was that in btbeginscan(), we have "so =
(BTScanOpaque) palloc(sizeof(BTScanOpaqueData));", which on this
machine is 27344 bytes and results in a call to AllocSetAllocLarge()
and therefore a malloc(). Ideally, there'd be no malloc() calls in a
standard pgbench run, at least once the rel and cat caches have been
warmed up.

I think there are a few things in your patch that could be improved,
here's a quick review.

1. MemoryPoolEntryIndex() could follow the lead of
AllocSetFreeIndex(), which is quite well-tuned and has no looping. I
think you can get rid of MemoryPoolEntrySize() and just have
MemoryPoolEntryIndex() round up to the next power of 2.

2. The following could use "result = Min(MEMPOOL_MIN_BLOCK,
pg_nextpower2_size_t(size));"

+ * should be very low, though (less than MEMPOOL_SIZES, i.e. 14).
+ */
+ result = MEMPOOL_MIN_BLOCK;
+ while (size > result)
+ result *= 2;

3. "MemoryPoolFree": I wonder if this is a good name for such a
function. Really you want to return it to the pool. "Free" sounds
like you're going to free() it. I went for "Fetch" and "Release"
which I thought was less confusing.

4. MemoryPoolRealloc(), could this just do nothing if the old and new
indexes are the same?

5. It might be good to put a likely() around this:

+ /* only do this once every MEMPOOL_REBALANCE_DISTANCE allocations */
+ if (pool->num_requests < MEMPOOL_REBALANCE_DISTANCE)
+ return;

Otherwise, if that function is inlined then you'll bloat the functions
that inline it for not much good reason. Another approach would be to
have a static inline function which checks and calls a noinline
function that does the work so that the rebalance stuff is never
inlined.

Overall, I wonder if the rebalance stuff might make performance
testing quite tricky. I see:

+/*
+ * How often to rebalance the memory pool buckets (number of allocations).
+ * This is a tradeoff between the pool being adaptive and more overhead.
+ */
+#define MEMPOOL_REBALANCE_DISTANCE 25000

Will TPS take a sudden jump after 25k transactions doing the same
thing? I'm not saying this shouldn't happen, but... benchmarking is
pretty hard already. I wonder if there's something more fine-grained
that can be done which makes the pool adapt faster but not all at
once. (I've not studied your algorithm for the rebalance.)

David

[1] https://github.com/david-rowley/postgres/tree/malloccache

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kyotaro Horiguchi 2024-03-15 02:27:26 Re: Inconsistent printf placeholders
Previous Message Nathan Bossart 2024-03-15 01:52:55 Re: cleanup patches for incremental backup