Re: copy.c allocation constant

From: Andres Freund <andres(at)anarazel(dot)de>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: copy.c allocation constant
Date: 2018-01-24 19:55:49
Message-ID: 20180124195549.n5j4qxzqzf5p2g74@alap3.anarazel.de
Lists: pgsql-hackers

On 2018-01-24 14:25:37 -0500, Robert Haas wrote:
> On Wed, Jan 24, 2018 at 1:43 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> > Indeed. Don't think RAW_BUF_SIZE is quite big enough for that on most
> > platforms though. From man mallopt:
> > Balancing these factors leads to a default setting of 128*1024 for the M_MMAP_THRESHOLD parameter.
> > Additionally, even when malloc() chooses to use mmap() to back an
> > allocation, it'll still need a header to know the size of the
> > allocation and such. So using a size that is exactly a multiple of
> > 4KB will still leave you with wasted space. Due to the latter I can't
> > see it mattering whether or not we add +1 to a power-of-two size.
>
> Well, it depends on how it works. dsa_allocate, for example, never
> adds a header to the size of the allocation.

glibc's malloc does add a header. My half-informed suspicion is that
most newer malloc backing allocators will have a header, because a
shared lookup-by-address table is pretty expensive to maintain. A bit
of metadata indicating the size and/or source of the allocation makes
using thread-local information a lot easier.
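
To make the header idea concrete, here is a minimal sketch of an allocator wrapper that stores metadata directly in front of the payload, so the size and the source of a chunk can be recovered from the pointer alone, without a shared lookup-by-address table. The ChunkHeader struct and the hdr_* functions are hypothetical names invented for illustration; this is not glibc's actual chunk layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-allocation header; NOT glibc's real chunk layout. */
typedef struct ChunkHeader
{
    size_t size;    /* payload size requested by the caller */
    void  *owner;   /* e.g. the thread-local arena the chunk came from */
} ChunkHeader;

/*
 * Allocate 'size' payload bytes preceded by a header.  The payload starts
 * sizeof(ChunkHeader) bytes past the address malloc() returned, which is
 * good enough alignment for a sketch.
 */
void *
hdr_alloc(size_t size, void *owner)
{
    ChunkHeader *h;

    if (size > SIZE_MAX - sizeof(ChunkHeader))
        return NULL;            /* request would overflow */

    h = malloc(sizeof(ChunkHeader) + size);
    if (h == NULL)
        return NULL;

    h->size = size;
    h->owner = owner;
    return h + 1;               /* payload begins right after the header */
}

/* Recover the payload size by stepping back to the header. */
size_t
hdr_size(const void *p)
{
    assert(p != NULL);
    return ((const ChunkHeader *) p - 1)->size;
}

/* Recover the owner (source) of the allocation the same way. */
void *
hdr_owner(const void *p)
{
    assert(p != NULL);
    return ((const ChunkHeader *) p - 1)->owner;
}

/* Free a pointer previously returned by hdr_alloc(). */
void
hdr_free(void *p)
{
    if (p != NULL)
        free((ChunkHeader *) p - 1);
}
```

The cost of this design is that every allocation grows by sizeof(ChunkHeader) bytes, which is why a request sized to an exact multiple of the page size no longer fits in that many pages.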

> Allocations < 8kB are
> bucketed by size class and stored in superblocks carved up into
> equal-sized chunks. Allocations > 8kB are rounded to a multiple of
> the 4kB page size and we grab that many consecutive free pages. I
> didn't make those behaviors up; I copied them from elsewhere. Some
> other allocator I read about did small-medium-large allocations: large
> with mmap(), medium with multiples of the page size, small with
> closely-spaced size classes.

Sure - all I'm trying to say is that it likely won't matter whether we
use power-of-two or power-of-two + 1, because due to overhead
considerations we probably won't quite fit into a size class anyway.
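
The arithmetic behind that can be sketched as follows, assuming an allocator that prepends a header and then rounds up to whole pages. The 16-byte header and the 4kB page size used in the comments are assumptions for illustration, and pages_needed is a hypothetical helper, not a function from glibc or PostgreSQL.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Number of pages consumed by a request of 'request' bytes when the
 * allocator adds 'header' bytes of metadata and rounds up to a multiple
 * of 'page_size'.  Overflow of the sum is ignored in this sketch.
 */
size_t
pages_needed(size_t request, size_t header, size_t page_size)
{
    assert(page_size > 0);
    return (request + header + page_size - 1) / page_size;
}

/*
 * With an assumed 16-byte header and 4096-byte pages:
 *
 *   pages_needed(65536, 0, 4096)   is 16  (no header: exact fit)
 *   pages_needed(65536, 16, 4096)  is 17  (header spills into a new page)
 *   pages_needed(65537, 16, 4096)  is 17  (the extra byte changes nothing)
 */
```

Under these assumptions the header already pushes a 64kB request into a 17th page, so the extra byte in 64kB + 1 costs nothing further.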

> It doesn't seem like a particularly good idea to take a 64kB+1 byte
> allocation, stick a header on it, and pack it tightly up against other
> allocations on both sides. Seems like that could lead to
> fragmentation problems. Is that really what it does?

No, I'm fairly sure it's not.

Greetings,

Andres Freund
