From: | David Rowley <dgrowleyml(at)gmail(dot)com> |
---|---|
To: | Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com> |
Cc: | Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)postgresql(dot)org, Tomas Vondra <tv(at)fuzzy(dot)cz> |
Subject: | Re: slab allocator performance issues |
Date: | 2022-10-12 09:37:17 |
Message-ID: | CAApHDvoxVxFN0DXYyn6tDdg6s7wx2sVrVJ_JSCZxrfd-s86j8Q@mail.gmail.com |
Lists: | pgsql-hackers |
On Sat, 11 Sept 2021 at 09:07, Tomas Vondra
<tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
> I've been investigating the regressions in some of the benchmark
> results, together with the generation context benchmarks [1].
I've not looked into the regression you found with this yet, but I did
rebase the patch. slab.c has seen quite a number of changes recently.
I didn't spend a lot of time checking over the patch. I mainly wanted
to see what the performance was like before reviewing in too much
detail.
To test the performance, I used [1] and ran:
select pg_allocate_memory_test(<nbytes>, 1024*1024,
10::bigint*1024*1024*1024, 'slab');
That allocates chunks of <nbytes>, keeps around 1MB of them allocated
at any one time, and allocates 10GB of them in total.
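For anyone who hasn't looked at [1], the allocation pattern is roughly the following. This is only a sketch of my description above, not the code from [1]: allocate_pattern() is a made-up name, and malloc/free stand in for allocating from and freeing to a slab context.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for the benchmark's allocation pattern: allocate
 * total_bytes worth of chunk_size chunks while keeping roughly keep_bytes
 * of them live at any one time.  malloc/free are used here in place of a
 * slab context.  Returns the number of chunks allocated, or 0 on failure.
 */
static uint64_t
allocate_pattern(size_t chunk_size, size_t keep_bytes, uint64_t total_bytes)
{
    size_t      nlive;
    uint64_t    nallocs;
    void      **live;
    uint64_t    i;
    size_t      s;

    assert(chunk_size > 0);

    nlive = keep_bytes / chunk_size;
    if (nlive == 0)
        nlive = 1;
    nallocs = total_bytes / chunk_size;

    /* ring buffer of live chunks, all slots start out NULL */
    live = calloc(nlive, sizeof(void *));
    if (live == NULL)
        return 0;

    for (i = 0; i < nallocs; i++)
    {
        size_t      slot = (size_t) (i % nlive);

        /* release the oldest chunk; free(NULL) is a no-op for unused slots */
        free(live[slot]);
        live[slot] = malloc(chunk_size);
        if (live[slot] == NULL)
        {
            nallocs = 0;        /* report failure after cleaning up */
            break;
        }
    }

    for (s = 0; s < nlive; s++)
        free(live[s]);
    free(live);

    return nallocs;
}
```

With a fixed 10GB total, the number of allocations halves each time the chunk size doubles (about 671 million at 16 bytes, about 10.5 million at 1024 bytes), which is why the timings roughly halve from one row to the next.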
I saw:
Master:
16 byte chunk = 8754.678 ms
32 byte chunk = 4511.725 ms
64 byte chunk = 2244.885 ms
128 byte chunk = 1135.349 ms
256 byte chunk = 548.030 ms
512 byte chunk = 272.017 ms
1024 byte chunk = 144.618 ms
Master + attached patch:
16 byte chunk = 5255.974 ms
32 byte chunk = 2640.807 ms
64 byte chunk = 1328.949 ms
128 byte chunk = 668.078 ms
256 byte chunk = 330.564 ms
512 byte chunk = 166.844 ms
1024 byte chunk = 85.399 ms
So the patched version runs in about 60% of the time that master takes.
I plan to look at the patch in a bit more detail and see if I can
recreate and figure out the regression that Tomas reported. For now, I
just want to share the rebased patch.
The only thing I really adjusted from Andres' version is the per-block
freelist. Instead of linking the free chunks together with pointers, I
made each entry store the chunk's offset, in bytes, from the start of
the block. This means we can use 4 bytes instead of 8 bytes for these
links. The block size is limited to 1GB now anyway, so 32 bits is
large enough for these offsets.
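To illustrate the idea, here is a minimal standalone sketch of an offset-based freelist. It is not the code from the attached patch: DemoBlock, the demo_* functions and the fixed chunk size and count are all made up for the example. Each free chunk stores the 32-bit byte offset of the next free chunk in its first 4 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_CHUNK_SIZE     16u         /* must be >= sizeof(uint32_t) */
#define DEMO_NCHUNKS        8u
#define DEMO_INVALID_OFFSET UINT32_MAX  /* marks the end of the freelist */

/* Hypothetical block layout, for illustration only */
typedef struct DemoBlock
{
    uint32_t    free_head;      /* byte offset of first free chunk */
    uint32_t    nfree;          /* number of chunks on the freelist */
    unsigned char data[DEMO_CHUNK_SIZE * DEMO_NCHUNKS];
} DemoBlock;

/* Push the chunk at byte offset 'off' onto the block's freelist */
static void
demo_push(DemoBlock *block, uint32_t off)
{
    /* the free chunk's first 4 bytes hold the offset of the next one */
    memcpy(block->data + off, &block->free_head, sizeof(uint32_t));
    block->free_head = off;
    block->nfree++;
}

static void
demo_block_init(DemoBlock *block)
{
    uint32_t    i;

    block->free_head = DEMO_INVALID_OFFSET;
    block->nfree = 0;

    /* push in reverse so chunks are handed out in address order */
    for (i = DEMO_NCHUNKS; i-- > 0;)
        demo_push(block, i * DEMO_CHUNK_SIZE);
}

/* Pop a chunk off the freelist, or return NULL if the block is full */
static void *
demo_alloc(DemoBlock *block)
{
    uint32_t    off = block->free_head;

    if (off == DEMO_INVALID_OFFSET)
        return NULL;

    memcpy(&block->free_head, block->data + off, sizeof(uint32_t));
    block->nfree--;
    return block->data + off;
}

/* Return a chunk to the freelist by converting its pointer to an offset */
static void
demo_free(DemoBlock *block, void *chunk)
{
    ptrdiff_t   off = (unsigned char *) chunk - block->data;

    assert(off >= 0 && (size_t) off < sizeof(block->data));
    assert((size_t) off % DEMO_CHUNK_SIZE == 0);
    demo_push(block, (uint32_t) off);
}
```

The memcpy calls are just there to keep the sketch free of alignment and aliasing concerns. The trade-off is one extra addition to turn an offset back into a pointer, in exchange for halving the link size on 64-bit builds.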
David
[1] https://www.postgresql.org/message-id/attachment/137056/allocate_performance_functions.patch.txt
Attachment | Content-Type | Size |
---|---|---|
v3-0001-WIP-slab-performance.patch | text/plain | 23.6 KB |