
Re: huge tlb support

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: huge tlb support
Date: 2012-07-03 11:30:35
Lists: pgsql-hackers
On Tuesday, July 03, 2012 05:18:04 AM Tom Lane wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> > On Fri, Jun 29, 2012 at 3:52 PM, Andres Freund <andres(at)2ndquadrant(dot)com> 
> >> In a *very* quick patch I tested using huge pages/MAP_HUGETLB for the
> >> mmap'ed memory.
> > 
> > So, considering that there is required setup, it seems that the
> > obvious thing to do here is add a GUC: huge_tlb_pages (boolean).
We also need some logic to figure out how big the huge page size is.
/sys/kernel/mm/hugepages/* contains a directory for each possible size,
somewhat unfortunately named, e.g. "hugepages-2048kB". We need to parse that.
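
A minimal sketch of that parsing, written in Python for brevity (backend code would of course be C). The function names and the `sysfs_root` parameter are made up for illustration; only the "hugepages-<N>kB" naming scheme comes from the kernel.

```python
import os
import re

# Directory names look like "hugepages-2048kB"; the number is the size in kB.
_HUGEPAGE_DIR_RE = re.compile(r"^hugepages-(\d+)kB$")


def parse_hugepage_dir_kb(name):
    """Return the page size in kB encoded in a sysfs directory name, or None."""
    match = _HUGEPAGE_DIR_RE.match(name)
    return int(match.group(1)) if match else None


def available_hugepage_sizes_kb(sysfs_root="/sys/kernel/mm/hugepages"):
    """List the huge page sizes (in kB) the running kernel exposes, ascending.

    Returns an empty list if the directory is missing (no hugetlb support,
    or not Linux).
    """
    try:
        names = os.listdir(sysfs_root)
    except OSError:
        return []
    sizes = (parse_hugepage_dir_kb(name) for name in names)
    return sorted(size for size in sizes if size is not None)
```

On a typical x86-64 box this would be expected to report 2048 and possibly 1048576, depending on CPU and kernel configuration.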

> > The other alternative is to try with MAP_HUGETLB and, if it fails, try
> > again without MAP_HUGETLB.
> +1 for not making people configure this manually.
I don't think that's going to fly that well. You need to specifically allocate
hugepages at boot or shortly thereafter. If postgres just grabs some of the
available space without asking, it might very well prevent other applications
from starting. We're not allocating half of the system memory without asking
either...
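
For reference, the "try with MAP_HUGETLB, then retry without" alternative quoted above could look roughly like the following. This is only a Python sketch of the idea, not the patch. The MAP_HUGETLB value is hardcoded from the Linux x86-64 headers (an assumption, other architectures may differ), the 2MB default page size is likewise an assumption, and `map_with_fallback` is a hypothetical name.

```python
import mmap
import sys

# Assumption: value of MAP_HUGETLB in the Linux headers on x86-64.
# Python's mmap module is not assumed to export it, hence the literal.
MAP_HUGETLB = 0x40000


def round_up(nbytes, multiple):
    """Round nbytes up to the next multiple of `multiple`."""
    return -(-nbytes // multiple) * multiple


def map_with_fallback(nbytes, huge_page_bytes=2 * 1024 * 1024):
    """Try an anonymous huge page mapping; fall back to a normal one.

    Returns (mapping, used_huge_pages).
    """
    if sys.platform.startswith("linux"):
        # Huge page mappings must be a multiple of the huge page size.
        size = round_up(nbytes, huge_page_bytes)
        try:
            flags = mmap.MAP_SHARED | mmap.MAP_ANONYMOUS | MAP_HUGETLB
            return mmap.mmap(-1, size, flags=flags), True
        except OSError:
            # Typically ENOMEM: no (or too few) huge pages were reserved.
            pass
    # Plain anonymous mapping with normal pages.
    return mmap.mmap(-1, nbytes), False
```

Note that when the first attempt succeeds it takes pages out of the preallocated pool without the administrator having asked for it, which is exactly the concern raised above.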

> Also, I was under the impression that recent Linux kernels use hugepages
> automatically if they can, so I wonder exactly what Andres was testing
> on ...
At the time I was running the test I was running a moderately new kernel:

andres(at)awork2:~$ uname -a
Linux awork2 3.4.3-andres #138 SMP Mon Jun 19 12:46:32 CEST 2012 x86_64 
andres(at)awork2:~$ zcat /proc/config.gz |grep HUGE

So, transparent hugepages are enabled by default.
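
Independent of the build-time config, the mode in effect at runtime can be read from /sys/kernel/mm/transparent_hugepage/enabled, where the active choice is shown in brackets. A small Python sketch (the function names are illustrative only):

```python
import re


def parse_thp_mode(text):
    """Return the bracketed (active) choice, e.g. 'always', or None."""
    match = re.search(r"\[(\w+)\]", text)
    return match.group(1) if match else None


def current_thp_mode(path="/sys/kernel/mm/transparent_hugepage/enabled"):
    """Read the active transparent hugepage mode, or None if unavailable."""
    try:
        with open(path) as f:
            return parse_thp_mode(f.read())
    except OSError:
        # File missing: kernel built without THP, or not Linux.
        return None
```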

The problem is that the kernel needs 2MB of contiguous physical memory to map
to 2MB of contiguous virtual memory. In on-demand, COW virtual memory systems
that just doesn't happen all the time unless you're doing file mmap while
triggering massive readaheads. It is especially unlikely once the system has
been running for some time, because memory gets too fragmented to have lots of
contiguous physical memory around.
There was/is talk about moving physical memory around to make room for more
huge pages, but that's not there yet and the patches I have seen incurred quite
some overhead.
Btw, when transparent hugepages were introduced, it was argued that manual
hugepage setups still have benefits.

Btw, should anybody want to test this:
You can allocate huge pages during runtime with:
echo 3000 > /proc/sys/vm/nr_hugepages
or at boot you can add a parameter:
(allocates 6GB of huge pages on x86-64)

The runtime variant might take quite some time until it has found enough pages,
or it may even fall short.
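
The "6GB" figure is just page count times page size. A tiny Python sketch of the arithmetic, assuming the 2048 kB page size shown further down; both helper names are made up:

```python
def hugepages_needed(nbytes, hugepage_kb=2048):
    """Number of huge pages required to back nbytes (rounded up)."""
    page_bytes = hugepage_kb * 1024
    return -(-nbytes // page_bytes)


def reserved_kb(nr_hugepages, hugepage_kb=2048):
    """Memory set aside by a given nr_hugepages value, in kB."""
    return nr_hugepages * hugepage_kb
```

reserved_kb(3000) gives 6144000 kB, i.e. 6000MB or a bit under 6GB. Going the other way, hugepages_needed() gives the nr_hugepages value required to back a shared memory segment of a given size.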

You can see the huge page status with:
andres(at)awork2:~$ cat /proc/meminfo |grep Huge
AnonHugePages:    591872 kB
HugePages_Total:    3000
HugePages_Free:     3000
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
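
If you want those numbers programmatically (e.g. to check whether the runtime allocation fell short), the same fields can be pulled out of /proc/meminfo. A rough Python sketch with hypothetical function names:

```python
import re

# Matches lines such as "HugePages_Total:    3000" or "Hugepagesize: 2048 kB".
_MEMINFO_LINE_RE = re.compile(r"^(\w+):\s+(\d+)(?:\s+kB)?\s*$")


def parse_meminfo_huge(text):
    """Extract the huge page related fields from /proc/meminfo content.

    Values are returned as plain integers (page counts, or kB for the
    fields that meminfo reports in kB).
    """
    result = {}
    for line in text.splitlines():
        match = _MEMINFO_LINE_RE.match(line)
        if match and "huge" in match.group(1).lower():
            result[match.group(1)] = int(match.group(2))
    return result


def read_meminfo_huge(path="/proc/meminfo"):
    """Read and parse the huge page fields from the given meminfo file."""
    with open(path) as f:
        return parse_meminfo_huge(f.read())
```

Comparing HugePages_Total against the value written to nr_hugepages shows whether the kernel found enough contiguous memory.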



 Andres Freund	         
 PostgreSQL Development, 24x7 Support, Training & Services
