Re: Add bump memory context type and use it for tuplesorts

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Add bump memory context type and use it for tuplesorts
Date: 2024-04-06 23:59:33
Message-ID: CAApHDvo7eoBeKRAqm4y6H5J6jZnZWmjcHEkDRMC2phYM3W=aHw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, 7 Apr 2024 at 05:45, Matthias van de Meent
<boekewurm+postgres(at)gmail(dot)com> wrote:
> Malloc's docs specify the minimum chunk size at 4*sizeof(void*) and itself uses , so using powers of 2 for chunks would indeed fail to detect 1s in the 4th bit. I suspect you'll get different results when you check the allocation patterns of multiples of 8 bytes, starting from 40, especially on 32-bit arm (where MALLOC_ALIGNMENT is 8 bytes, rather than the 16 bytes on i386 and 64-bit architectures, assuming [0] is accurate)

I'm prepared to be overruled, but I just don't have strong feelings
that 32-bit is worth making these reservations for. Especially so
given the rate we're filling these slots. The only system that I see
the 4th bit change is Cygwin. It doesn't look like a very easy system
to protect against pfreeing of malloc'd chunks as the prior 8-bytes
seem to vary depending on the malloc'd size and I see all bit patterns
there, including the ones we use for our memory contexts.

Since we can't protect against every possible bit pattern there, we
need to draw the line somewhere. I don't think 32-bit systems are
worth reserving these precious slots for. I'd hazard a guess that
there are more instances of Postgres running on Windows today than on
32-bit CPUs and we don't seem too worried about the bit-patterns used
for Windows.

> In your updated 0001, you don't seem to fill the RESERVED_GLIBC memctx array entries with BOGUS_MCTX().

Oops. Thanks

>> Another reason not to make it 5 bits is that I believe that would make
>> the mcxt_methods[] array 2304 bytes rather than 576 bytes. 4 bits
>> makes it 1152 bytes, if I'm counting correctly.
>
>
> I don't think I understand why this would be relevant when only 5 of the contexts are actually in use (thus in caches). Is that size concern about TLB entries then?

It's a static const array. I don't want to bloat the binary with
something we'll likely never need. If we one day need it, we can
reserve another bit using the same overlapping method.

>> I revised the patch to simplify hdrmask logic. This started with me
>> having trouble finding the best set of words to document that the
>> offset is "half the bytes between the chunk and block". So, instead
>> of doing that, I've just made it so these two fields effectively
>> overlap. The lowest bit of the block offset is the same bit as the
>> high bit of what MemoryChunkGetValue returns.
>
>
> Works for me, I suppose.

hmm. I don't detect much enthusiasm for it.

Personally, I quite like the method as it adds no extra instructions
when encoding the MemoryChunk and only a simple bitwise-AND when
decoding it. Your method added extra instructions in the encode and
decode. I went to great lengths to make this code as fast as
possible, so I know which method that I prefer. We often palloc and
never do anything that requires the chunk header to be decoded, so not
adding extra instructions on the encoding stage is a big win.

The only method I see to avoid adding instructions in encoding and
decoding is to reduce the bit-space for the MemoryChunkGetValue field
to 29 bits. Effectively, that means non-external chunks can only be
512MB rather than 1GB. As far as I know, that just limits slab.c to
only being able to do 512MB pallocs as generation.c and aset.c use
external chunks well below that threshold. Restricting slab to 512MB
is probably far from the end of the world. Anything close to that
would be a terrible use case for slab. I was just less keen on using
a bit from there as that's a field we allow the context implementation
to do what they like with. Having bitspace for 2^30 possible values in
there just seems nice given that it can store any possible value from
zero up to MaxAllocSize.

David

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2024-04-07 00:27:43 Re: BitmapHeapScan streaming read user and prelim refactoring
Previous Message Thomas Munro 2024-04-06 23:07:58 Re: Extension Enhancement: Buffer Invalidation in pg_buffercache