Re: Reducing the chunk header sizes on all memory context types

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Reducing the chunk header sizes on all memory context types
Date: 2022-08-09 12:53:04
Message-ID: CAApHDvrrYfcCXfuc_bZ0xsqBP8U62Y0i27agr9Qt-2geE_rv0Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, 13 Jul 2022 at 17:20, David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
> I did consider that in all cases where the allocations are above
> allocChunkLimit that the chunk is put on a dedicated block and in
> fact, the blockoffset is always the same for those. I wondered if we
> could use the full 60 bits for the chunksize for those cases. The
> reason I didn't pursue that is because:

As it turns out, we don't really need to explicitly store the chunk
size for chunks which are on dedicated blocks. We can just calculate
this by subtracting the pointer to the memory from the block's endptr.
The block offset is always fixed too, like I mentioned above.

I've now revised the patch to completely get rid of the concept of
"large" chunks and instead memory chunks are always 8 bytes in size.
I've created a struct to this effect and named it MemoryChunk. All
memory contexts make use of this new struct. The header bit which I
was previously using to denote a "large" chunk now marks if the chunk
is "external", meaning that the MemoryChunk does not have knowledge of
the chunk size and/or block offset. The MemoryContext itself is
expected to manage this when the chunk is external. I've coded up
aset.c and generation.c to always use these external chunks when size
> set->allocChunkLimit. There is now a coding pattern like the
following (taken from AllocSetRealloc:

if (MemoryChunkIsExternal(chunk))
{
block = ExternalChunkGetBlock(chunk);
oldsize = block->endptr - (char *) pointer;
}
else
{
block = MemoryChunkGetBlock(chunk);
oldsize = MemoryChunkGetSize(chunk);
}

Here the ExternalChunkGetBlock() macro is just subtracting the
ALLOC_BLOCKHDRSZ from the chunk pointer to obtain the block pointer,
whereas MemoryChunkGetBlock() is subtracting the blockoffset as is
stored in the 30-bits of the chunk header.

Andres and I had a conversation off-list about the storage of freelist
pointers. Currently I have these stored in the actual allocated
memory. The minimum allocation size is 8-bytes, which is big enough
for storing sizeof(void *). Andres suggested that it might be safer to
store this link at the end of the chunk rather than at the start.
I've not quite done that in the attached, but doing this should just
be a matter of adjusting the GetFreeListLink() macro to add the
chunksize - sizeof(AllocFreelistLink).

I did a little bit of benchmarking of the attached with a scale 1 pgbench.

master = 0b039e3a8

master
drowley(at)amd3990x:~$ pgbench -c 156 -j 156 -T 60 -S postgres
tps = 1638436.367793 (without initial connection time)
tps = 1652688.009579 (without initial connection time)
tps = 1671281.281856 (without initial connection time)

master + v2 patch
drowley(at)amd3990x:~$ pgbench -c 156 -j 156 -T 60 -S postgres
tps = 1825346.734486 (without initial connection time)
tps = 1824539.294142 (without initial connection time)
tps = 1807359.758822 (without initial connection time)

~10% faster.

There are a couple of things to note that might require discussion:

1. I've added a new restriction that block sizes cannot be above 1GB.
This is because the 30-bits in the MemoryChunk used for storing the
offset between the chunk and the block wouldn't be able to store the
offset if the chunk was offset more than 1GB from the block. I used
debian code search to see if I could find any user code that used
blocks this big. I found nothing.
2. slab.c has a new restriction that the chunk size cannot be >= 1GB.
I'm not at all concerned about this. I think if someone needed chunks
that big there'd be no benefits from slab context over an aset
context.
3. As mentioned above, aset.c now stores freelist pointers in the
allocated chunk's memory. This allows us to get the header down to 8
bytes instead of today's 16 bytes. There's an increased danger that
buggy code that writes to memory after a pfree could stomp on this.

I think the patch is now starting to take shape. I've added it to the
September commitfest [1].

David

[1] https://commitfest.postgresql.org/39/3810/

Attachment Content-Type Size
v2-0001-Improve-performance-of-and-reduce-overheads-of-me.patch text/plain 93.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Önder Kalacı 2022-08-09 12:55:06 Materialized view rewrite is broken when there is an event trigger
Previous Message Andrew Dunstan 2022-08-09 12:37:16 Re: [RFC] building postgres with meson