Re: slab allocator performance issues

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)postgresql(dot)org, Tomas Vondra <tv(at)fuzzy(dot)cz>
Subject: Re: slab allocator performance issues
Date: 2022-10-12 09:37:17
Message-ID: CAApHDvoxVxFN0DXYyn6tDdg6s7wx2sVrVJ_JSCZxrfd-s86j8Q@mail.gmail.com
Lists: pgsql-hackers

On Sat, 11 Sept 2021 at 09:07, Tomas Vondra
<tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
> I've been investigating the regressions in some of the benchmark
> results, together with the generation context benchmarks [1].

I've not looked into the regression you found with this yet, but I did
rebase the patch. slab.c has seen quite a number of changes recently.

I didn't spend a lot of time checking over the patch. I mainly wanted
to see what the performance was like before reviewing in too much
detail.

To test the performance, I used [1] and ran:

select pg_allocate_memory_test(<nbytes>, 1024*1024,
10::bigint*1024*1024*1024, 'slab');

This allocates chunks of <nbytes>, keeping around 1MB of them
allocated at any one time, until a total of 10GB has been allocated.
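
As a rough illustration of that allocation pattern, here is a minimal sketch in plain C. It is not the code from [1]: it assumes malloc/free in place of a slab context, and demo_allocate_test and its parameters are hypothetical names. The oldest live chunk is freed each time a new one is allocated, so only about keep_bytes worth of chunks is live at once.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for the benchmark loop: allocate chunk_size-byte
 * chunks until total_bytes have been allocated, keeping at most
 * keep_bytes / chunk_size chunks live at any one time.
 * Returns the number of allocations performed.
 */
uint64_t
demo_allocate_test(size_t chunk_size, size_t keep_bytes, uint64_t total_bytes)
{
    size_t      nkeep;
    void      **ring;
    uint64_t    nallocs = 0;
    uint64_t    allocated = 0;
    size_t      pos = 0;
    size_t      i;

    if (chunk_size == 0 || keep_bytes < chunk_size)
        return 0;

    nkeep = keep_bytes / chunk_size;
    ring = calloc(nkeep, sizeof(void *));   /* all slots start out NULL */
    if (ring == NULL)
        return 0;

    while (allocated < total_bytes)
    {
        free(ring[pos]);                    /* free(NULL) is a no-op */
        ring[pos] = malloc(chunk_size);
        if (ring[pos] == NULL)
            break;
        pos = (pos + 1) % nkeep;            /* next slot holds the oldest chunk */
        allocated += chunk_size;
        nallocs++;
    }

    /* release whatever is still live */
    for (i = 0; i < nkeep; i++)
        free(ring[i]);
    free(ring);

    return nallocs;
}
```

With a fixed total, halving the chunk size doubles the number of allocations, which is in line with the timings below roughly doubling each time the chunk size halves.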

I saw:

Master:
16 byte chunk = 8754.678 ms
32 byte chunk = 4511.725 ms
64 byte chunk = 2244.885 ms
128 byte chunk = 1135.349 ms
256 byte chunk = 548.030 ms
512 byte chunk = 272.017 ms
1024 byte chunk = 144.618 ms

Master + attached patch:
16 byte chunk = 5255.974 ms
32 byte chunk = 2640.807 ms
64 byte chunk = 1328.949 ms
128 byte chunk = 668.078 ms
256 byte chunk = 330.564 ms
512 byte chunk = 166.844 ms
1024 byte chunk = 85.399 ms

So the patched build runs in about 60% of the time that master takes.
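
For anyone wanting to check that figure, a small sketch that divides the patched time by the master time for each chunk size. The timings are copied from the results above; the identifier names are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define DEMO_NSIZES 7

/* Timings in milliseconds, copied from the results above. */
const int    demo_chunk_bytes[DEMO_NSIZES] = {16, 32, 64, 128, 256, 512, 1024};
const double demo_master_ms[DEMO_NSIZES] = {
    8754.678, 4511.725, 2244.885, 1135.349, 548.030, 272.017, 144.618
};
const double demo_patched_ms[DEMO_NSIZES] = {
    5255.974, 2640.807, 1328.949, 668.078, 330.564, 166.844, 85.399
};

/* Patched time as a fraction of master time for the i-th chunk size. */
double
demo_time_ratio(size_t i)
{
    assert(i < DEMO_NSIZES);
    return demo_patched_ms[i] / demo_master_ms[i];
}

/* Arithmetic mean of the per-size ratios. */
double
demo_mean_ratio(void)
{
    double      sum = 0.0;
    size_t      i;

    for (i = 0; i < DEMO_NSIZES; i++)
        sum += demo_time_ratio(i);
    return sum / DEMO_NSIZES;
}

/* Print one line per chunk size, then the mean. */
void
demo_print_ratios(void)
{
    size_t      i;

    for (i = 0; i < DEMO_NSIZES; i++)
        printf("%4d byte chunk: patched/master = %.3f\n",
               demo_chunk_bytes[i], demo_time_ratio(i));
    printf("mean = %.3f\n", demo_mean_ratio());
}
```

The per-size ratios come out between roughly 0.59 and 0.61, so the gain is fairly uniform across the chunk sizes tested.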

I plan to look at the patch in a bit more detail and see if I can
recreate and figure out the regression that Tomas reported. For now, I
just want to share the rebased patch.

The only thing I really adjusted from Andres' version is the block's
freelist. Instead of linking free chunks together with pointers, each
link now stores the chunk's byte offset into the block. This means we
can use 4 bytes instead of 8 bytes for these links. The block size is
limited to 1GB now anyway, so 32 bits is large enough for these
offsets.
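
To make the offset idea concrete, here is a minimal standalone sketch. It is not the code in the attached patch or in slab.c: the DemoBlock layout, the function names, and the sentinel value are all assumptions made for illustration. Each free chunk stores, in its first 4 bytes, the offset from the start of the block to the next free chunk.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names and layout; not taken from slab.c. */
#define DEMO_FREELIST_END   UINT32_MAX              /* marks end of freelist */
#define DEMO_MAX_BLOCK_SIZE ((size_t) 1024 * 1024 * 1024)   /* 1GB limit */

typedef struct DemoBlock
{
    uint32_t    firstfree;      /* byte offset of first free chunk, or END */
    uint32_t    nfree;          /* number of free chunks in the block */
    uint32_t    chunksize;      /* bytes per chunk, at least 4 */
    uint32_t    nchunks;        /* total chunks in the block */
    /* chunk storage follows the header */
} DemoBlock;

DemoBlock *
demo_block_create(uint32_t chunksize, uint32_t nchunks)
{
    size_t      size = sizeof(DemoBlock) + (size_t) chunksize * nchunks;
    DemoBlock  *block;
    uint32_t    i;

    assert(chunksize >= sizeof(uint32_t));  /* room for the 4-byte link */
    assert(size <= DEMO_MAX_BLOCK_SIZE);    /* so offsets fit in 32 bits */

    block = malloc(size);
    if (block == NULL)
        return NULL;

    block->chunksize = chunksize;
    block->nchunks = nchunks;
    block->nfree = nchunks;
    block->firstfree = DEMO_FREELIST_END;

    /* Push chunks in reverse so that the first chunk is handed out first. */
    for (i = nchunks; i > 0; i--)
    {
        uint32_t    off = (uint32_t) (sizeof(DemoBlock) +
                                      (size_t) (i - 1) * chunksize);

        /* memcpy avoids alignment assumptions about the chunk address */
        memcpy((char *) block + off, &block->firstfree, sizeof(uint32_t));
        block->firstfree = off;
    }
    return block;
}

/* Pop the first free chunk, or return NULL if the block is full. */
void *
demo_chunk_alloc(DemoBlock *block)
{
    char       *chunk;

    if (block->firstfree == DEMO_FREELIST_END)
        return NULL;

    chunk = (char *) block + block->firstfree;
    memcpy(&block->firstfree, chunk, sizeof(uint32_t));    /* follow the link */
    block->nfree--;
    return chunk;
}

/* Push a chunk back onto the block's freelist. */
void
demo_chunk_free(DemoBlock *block, void *ptr)
{
    uint32_t    off = (uint32_t) ((char *) ptr - (char *) block);

    memcpy(ptr, &block->firstfree, sizeof(uint32_t));
    block->firstfree = off;
    block->nfree++;
}
```

Because the link lives inside the free chunk itself, the saving mostly matters for the minimum chunk size: a chunk only has to be large enough to hold a 4-byte offset rather than an 8-byte pointer. The sketch ignores chunk alignment, which a real allocator would have to handle.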

David

[1] https://www.postgresql.org/message-id/attachment/137056/allocate_performance_functions.patch.txt

Attachment Content-Type Size
v3-0001-WIP-slab-performance.patch text/plain 23.6 KB
