Re: Vacuum: allow usage of more than 1GB of work mem

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Claudio Freire <klaussfreire(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>, Anastasia Lubennikova <lubennikovaav(at)gmail(dot)com>, PostgreSQL-Dev <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Vacuum: allow usage of more than 1GB of work mem
Date: 2017-04-11 18:53:14
Message-ID: CA+TgmoaK7aWO=-ewE15Ny7X=eRy0YJC9dkt3Yh08p_2=8rpXKA@mail.gmail.com
Lists: pgsql-hackers

On Fri, Apr 7, 2017 at 9:12 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:
>> Why do you say exponential growth fragments memory? AFAIK, all those
>> allocations are well beyond the point where malloc starts mmaping
>> memory, so each of those segments should be a mmap segment,
>> independently freeable.
>
> Not all platforms have that, and even on platforms with it, frequent,
> unevenly sized, very large allocations can lead to enough fragmentation
> that further allocations are harder and fragment / enlarge the
> pagetable.

Such a thing is completely outside my personal experience. I've never
heard of a case where a 64-bit platform fails to allocate memory
because something (what?) is fragmented. Page table memory usage is a
concern at some level, but probably less so for autovacuum workers
than for most backends, because autovacuum workers (where most
vacuuming is done) exit after one pass through pg_class. Although I
think our memory footprint is a topic that could use more energy, I
don't really see any reason to think that pagetable bloat caused by
unevenly sized allocations in short-lived processes is the place to
start worrying.

That having been said, IIRC, I did propose quite a ways upthread that
we use a fixed chunk size, just because it would use less actual
memory, never mind the size of the page table. I mean, if you
allocate in chunks of 64MB, which I think is what I proposed, you'll
never waste more than 64MB. If you allocate in
exponentially-increasing chunk sizes starting at 128MB, you could
easily waste much more. Let's imagine a 1TB table where 20% of the
tuples are dead due to some large bulk operation (a bulk load failed,
or a bulk delete succeeded, or a bulk update happened). Back of the
envelope calculation:

1TB / 8kB per page * 60 tuples/page * 20% * 6 bytes/tuple = 9216MB of
maintenance_work_mem
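
Or, spelled out as throwaway C rather than envelope arithmetic (the 60
tuples/page and 20% dead figures are of course just the assumptions of
this example):

#include <stdio.h>

int
main(void)
{
    /* 1TB table, 8kB pages, ~60 tuples/page, 20% dead, 6 bytes/TID */
    double  table_bytes = 1024.0 * 1024 * 1024 * 1024;
    double  pages = table_bytes / 8192;
    double  dead_tuples = pages * 60 * 0.20;
    double  mem_mb = dead_tuples * 6 / (1024 * 1024);

    /* prints: dead tuples ~1.6e9, memory needed 9216 MB */
    printf("dead tuples: %.0f, memory needed: %.0f MB\n",
           dead_tuples, mem_mb);
    return 0;
}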

So we'll allocate 128MB+256MB+512MB+1GB+2GB+4GB which won't be quite
enough so we'll allocate another 8GB, for a total of 16256MB, but more
than three-quarters of that last allocation ends up being wasted.
I've been told on this list before that doubling is the one true way
of increasing the size of an allocated chunk of memory, but I'm still
a bit unconvinced.
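
Spelled out as throwaway C (not anything from the actual patch, just to
show where the 16256MB figure comes from), the doubling rule is:

#include <stdio.h>

int
main(void)
{
    long    need_mb = 9216;     /* from the estimate above */
    long    chunk_mb = 128;     /* first allocation */
    long    total_mb = 0;

    /* keep doubling until we've allocated enough for all the TIDs */
    while (total_mb < need_mb)
    {
        total_mb += chunk_mb;
        printf("allocate %5ld MB (running total %5ld MB)\n",
               chunk_mb, total_mb);
        chunk_mb *= 2;
    }

    /* prints: total 16256 MB, wasted 7040 MB */
    printf("total %ld MB, wasted %ld MB\n",
           total_mb, total_mb - need_mb);
    return 0;
}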

On the other hand, if we did allocate fixed chunks of, say, 64MB, we
could end up with an awful lot of them. For example, in the example
above, 9216MB/64MB = 144 chunks. Is that number of mappings going to
make the VM system unhappy on any of the platforms we care about? Is
that a bigger or smaller problem than what you (Andres) are worrying
about? I don't know.
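
For comparison, the fixed-chunk arithmetic, again just a sketch of the
64MB idea rather than real code:

#include <stdio.h>

int
main(void)
{
    long    need_mb = 9216;     /* from the estimate above */
    long    chunk_mb = 64;      /* fixed chunk size */
    long    nchunks = (need_mb + chunk_mb - 1) / chunk_mb;

    /* prints: 144 chunks, 9216 MB allocated, 0 MB wasted here;
     * the waste can never exceed one chunk, i.e. < 64 MB */
    printf("%ld chunks, %ld MB allocated, %ld MB wasted\n",
           nchunks, nchunks * chunk_mb, nchunks * chunk_mb - need_mb);
    return 0;
}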

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
