Re: Vacuum: allow usage of more than 1GB of work mem

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Claudio Freire <klaussfreire(at)gmail(dot)com>, Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Vacuum: allow usage of more than 1GB of work mem
Date: 2018-07-16 15:47:45
Message-ID: 8e5cbf08-5dd8-466d-9271-562fc65f133f@iki.fi
Lists: pgsql-hackers

On 16/07/18 18:35, Claudio Freire wrote:
> On Mon, Jul 16, 2018 at 11:34 AM Claudio Freire <klaussfreire(at)gmail(dot)com> wrote:
>> On Fri, Jul 13, 2018 at 5:43 PM Andrew Dunstan
>> <andrew(dot)dunstan(at)2ndquadrant(dot)com> wrote:
>>> On 07/13/2018 09:44 AM, Heikki Linnakangas wrote:
>>>> Claudio raised a good point, that doing small pallocs leads to
>>>> fragmentation, and in particular, it might mean that we can't give
>>>> back the memory to the OS. The default glibc malloc() implementation
>>>> has a threshold of 4 or 32 MB or something like that - allocations
>>>> larger than the threshold are mmap()'d, and can always be returned to
>>>> the OS. I think a simple solution to that is to allocate larger
>>>> chunks, something like 32-64 MB at a time, and carve out the
>>>> allocations for the nodes from those chunks. That's pretty
>>>> straightforward, because we don't need to worry about freeing the
>>>> nodes in retail. Keep track of the current half-filled chunk, and
>>>> allocate a new one when it fills up.
>>>
>>> Google seems to suggest the default threshold is much lower, like 128K.
>>> Still, making larger allocations seems sensible. Are you going to work
>>> on that?
>>
>> Below a few MB the threshold is dynamic, and if a block bigger than
>> 128K but smaller than the higher threshold (32-64MB IIRC) is freed,
>> the dynamic threshold is set to the size of the freed block.
>>
>> See M_MMAP_MAX and M_MMAP_THRESHOLD in the man page for mallopt[1]
>>
>> So I'd suggest allocating blocks bigger than M_MMAP_MAX.
>>
>> [1] http://man7.org/linux/man-pages/man3/mallopt.3.html
>
> Sorry, substitute M_MMAP_MAX with DEFAULT_MMAP_THRESHOLD_MAX, the
> former is something else.

Yeah, we basically want to be well above whatever the threshold is. I
don't think we should try to check for any specific constant, just make
it large enough. Different libc implementations might have different
policies, too. There's little harm in overshooting, e.g. making 64 MB
allocations when 1 MB would've been enough to trigger the mmap()
behavior. It's still going to be more granular than the current
situation, where we do a single massive allocation.

(A code comment to briefly mention the thresholds on common platforms
would be good, though).

- Heikki
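
For illustration, the chunked allocation discussed in this thread could look roughly like the sketch below. It is not taken from any patch: the names (Arena, ArenaChunk, arena_alloc, ARENA_CHUNK_SIZE) are hypothetical, it calls plain malloc() instead of PostgreSQL's memory-context API, and the 64 MB chunk size is only the example figure mentioned above. The threshold values in the comment are the glibc defaults as documented in mallopt(3).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical chunk size, picked to be well above malloc's mmap threshold.
 * Per mallopt(3), glibc starts at 128 kB and may raise the threshold
 * dynamically up to DEFAULT_MMAP_THRESHOLD_MAX (32 MB on 64-bit systems).
 * Other libc implementations may use different policies.
 */
#define ARENA_CHUNK_SIZE ((size_t) 64 * 1024 * 1024)

/* Round sizes up to 16 bytes so carved-out nodes stay suitably aligned. */
#define ARENA_ALIGN(sz) (((sz) + (size_t) 15) & ~(size_t) 15)

typedef struct ArenaChunk
{
    struct ArenaChunk *next;    /* previously filled chunk, if any */
    size_t      used;           /* bytes consumed, including this header */
} ArenaChunk;

#define ARENA_HEADER ARENA_ALIGN(sizeof(ArenaChunk))

typedef struct Arena
{
    ArenaChunk *current;        /* the half-filled chunk we carve from */
    size_t      nchunks;        /* number of chunks allocated so far */
} Arena;

static void
arena_init(Arena *a)
{
    a->current = NULL;
    a->nchunks = 0;
}

/* Carve 'size' bytes out of the current chunk, starting a new one if full. */
static void *
arena_alloc(Arena *a, size_t size)
{
    ArenaChunk *c;
    void       *p;

    assert(a != NULL);

    /* Reject zero-size requests and ones that cannot fit in one chunk. */
    if (size == 0 || size > ARENA_CHUNK_SIZE - ARENA_HEADER)
        return NULL;
    size = ARENA_ALIGN(size);

    c = a->current;
    if (c == NULL || c->used + size > ARENA_CHUNK_SIZE)
    {
        c = malloc(ARENA_CHUNK_SIZE);
        if (c == NULL)
            return NULL;
        c->next = a->current;
        c->used = ARENA_HEADER;
        a->current = c;
        a->nchunks++;
    }

    p = (char *) c + c->used;
    c->used += size;
    return p;
}

/* Nodes are never freed individually; release whole chunks at the end. */
static void
arena_destroy(Arena *a)
{
    while (a->current != NULL)
    {
        ArenaChunk *next = a->current->next;

        free(a->current);
        a->current = next;
    }
    a->nchunks = 0;
}
```

Since nothing is freed in retail, the only bookkeeping is the offset into the current chunk, and each free() in arena_destroy() hands back a block large enough that the allocator can return it to the OS.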
