Re: [HACKERS] HASH_CHUNK_SIZE vs malloc rounding

From: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] HASH_CHUNK_SIZE vs malloc rounding
Date: 2017-11-24 01:47:48
Message-ID: CAEepm=2Q1LxZiV1NkPgZ0Cx1xuX1U5hTk46=yJt-VMwydiVbqQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Nov 29, 2016 at 6:27 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com> writes:
>> I bet other allocators also do badly with "32KB plus a smidgen". To
>> minimise overhead we'd probably need to try to arrange for exactly
>> 32KB (or some other power of 2 or at least factor of common page/chunk
>> size?) to arrive into malloc, which means accounting for both
>> nodeHash.c's header and aset.c's headers in nodeHash.c, which seems a
>> bit horrible. It may not be worth doing anything about.
>
> Yeah, the other problem is that without a lot more knowledge of the
> specific allocator, we shouldn't really assume that it's good or bad with
> an exact-power-of-2 request --- it might well have its own overhead.
> It is an issue though, and not only in nodeHash.c. I'm pretty sure that
> StringInfo also makes exact-power-of-2 requests for no essential reason,
> and there are probably many other places.
>
> We could imagine providing an mmgr API function along the lines of "adjust
> this request size to the nearest thing that can be allocated efficiently".
> That would avoid the need for callers to know about aset.c overhead
> explicitly. I'm not sure how it could deal with platform-specific malloc
> vagaries though :-(

Someone pointed out to me off-list that jemalloc's next size class
after 32KB is in fact 40KB by default[1]. So PostgreSQL uses 25% more
memory for hash joins than it thinks it does on some platforms. Ouch.

It doesn't seem that crazy to expose aset.c's overhead size to client
code does it? Most client code wouldn't care, but things that are
doing something closer to memory allocator work themselves like
dense_alloc could care. It could deal with its own overhead itself,
and subtract aset.c's overhead using a macro.

[1] https://www.freebsd.org/cgi/man.cgi?jemalloc(3)

--
Thomas Munro
http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Craig Ringer 2017-11-24 01:57:17 Re: Failed to delete old ReorderBuffer spilled files
Previous Message Andres Freund 2017-11-24 01:20:32 Re: Combine function returning NULL unhandled?