Re: Reducing the chunk header sizes on all memory context types

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Reducing the chunk header sizes on all memory context types
Date: 2022-08-29 15:27:02
Message-ID: fc3a36eb-0767-d915-cdbd-e531983dc1de@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 8/29/22 17:20, Tom Lane wrote:
> Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com> writes:
>> I can reproduce it on my system (rpi4 running 32-bit raspbian).
>
> Yeah, more or less same as what I'm testing on.
>
> Seeing that the complaint is about pfree'ing a non-maxaligned
> ReorderBufferChange pointer, I tried adding
>
> diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
> index 89cf9f9389..dfa9b6c9ee 100644
> --- a/src/backend/replication/logical/reorderbuffer.c
> +++ b/src/backend/replication/logical/reorderbuffer.c
> @@ -472,6 +472,8 @@ ReorderBufferGetChange(ReorderBuffer *rb)
> change = (ReorderBufferChange *)
> MemoryContextAlloc(rb->change_context, sizeof(ReorderBufferChange));
>
> + Assert(change == (void *) MAXALIGN(change));
> +
> memset(change, 0, sizeof(ReorderBufferChange));
> return change;
> }
>
> and this failed!
>
> (gdb) f 3
> #3 0x003cb888 in ReorderBufferGetChange (rb=0x24ed820) at reorderbuffer.c:475
> 475 Assert(change == (void *) MAXALIGN(change));
> (gdb) p change
> $1 = (ReorderBufferChange *) 0x24aaa14
>
> So the bug is in fact in David's changes, and it consists in palloc
> sometimes handing back non-maxaligned pointers. I find it mildly
> astonishing that we managed to get through core regression tests
> without such a fault surfacing, but there you have it.
>
> This machine has MAXALIGN 8 but 4-byte pointers, so there's something
> wrong with that situation.
>

I suspect it's a pre-existing bug in Slab allocator, because it does this:

#define SlabBlockGetChunk(slab, block, idx) \
((MemoryChunk *) ((char *) (block) + sizeof(SlabBlock) \
+ (idx * slab->fullChunkSize)))

and SlabBlock is only 20B, i.e. not a multiple of 8B. Which would mean
that even if we allocate block and size the chunks carefully (with all
the MAXALIGN things), we ultimately slice the block incorrectly.

This would explain the 4B difference I reported before, I think. But I'm
just as astonished we got this far in the tests - regular regression
tests don't do much logical decoding, and we only use slab for changes,
but I see the failure in 006 test in src/test/recovery, so the first
five completed fine.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2022-08-29 15:31:40 Re: Reducing the chunk header sizes on all memory context types
Previous Message Tom Lane 2022-08-29 15:25:57 Re: Reducing the chunk header sizes on all memory context types