Re: patch: add MAP_HUGETLB to mmap() where supported (WIP)

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Richard Poole <richard(at)2ndQuadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: patch: add MAP_HUGETLB to mmap() where supported (WIP)
Date: 2013-09-16 10:15:38
Message-ID: 20130916101538.GK1330627@alap2.anarazel.de
Lists: pgsql-hackers

On 2013-09-16 11:15:28 +0300, Heikki Linnakangas wrote:
> On 14.09.2013 02:41, Richard Poole wrote:
> >The attached patch adds the MAP_HUGETLB flag to mmap() for shared memory
> >on systems that support it. It's based on Christian Kruse's patch from
> >last year, incorporating suggestions from Andres Freund.
>
> I don't understand the logic in figuring out the pagesize, and the smallest
> supported hugepage size. First of all, even without the patch, why do we
> round up the size passed to mmap() to the _SC_PAGE_SIZE? Surely the kernel
> will round up the request all by itself. The mmap() man page doesn't say
> anything about length having to be a multiple of the page size.

I think it does:

    EINVAL We don't like addr, length, or offset (e.g., they are too
           large, or not aligned on a page boundary).

and

    A file is mapped in multiples of the page size. For a file that is
    not a multiple of the page size, the remaining memory is zeroed when
    mapped, and writes to that region are not written out to the file.
    The effect of changing the size of the underlying file of a mapping
    on the pages that correspond to added or removed regions of the file
    is unspecified.

And no, according to my past experience, the kernel does *not* do any
such rounding up. It will just fail.
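
To illustrate the rounding being discussed, here is a minimal sketch and not the code from the patch. The helper names round_up() and map_shared_rounded() are made up for this example. It rounds the requested length up to a multiple of _SC_PAGE_SIZE before calling mmap(), which is harmless whether or not a given kernel would round the length itself.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE         /* for MAP_ANONYMOUS on glibc */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Round size up to the next multiple of align (hypothetical helper). */
static size_t
round_up(size_t size, size_t align)
{
    size_t      rem;

    assert(align > 0);
    rem = size % align;
    return (rem == 0) ? size : size + (align - rem);
}

/*
 * Map an anonymous shared region. The length actually passed to mmap()
 * is returned in *actual so the caller can munmap() the same amount.
 * Returns MAP_FAILED on error.
 */
static void *
map_shared_rounded(size_t request, size_t *actual)
{
    long        pagesize = sysconf(_SC_PAGE_SIZE);

    if (pagesize <= 0)
        return MAP_FAILED;

    *actual = round_up(request, (size_t) pagesize);
    return mmap(NULL, *actual, PROT_READ | PROT_WRITE,
                MAP_SHARED | MAP_ANONYMOUS, -1, 0);
}
```

A caller would pass the requested size and later hand the returned *actual value to munmap().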

> And with the patch, why do you bother detecting the minimum supported
> hugepage size? Surely the kernel will choose the appropriate hugepage size
> just fine on its own, no?

The mmap() will fail if the length is not a multiple of the hugepage size.

> >It is still WIP as there are a couple of points that Andres has pointed
> >out to me that haven't been addressed yet;
>
> Which points are those?

I don't know which points Richard has already fixed, so I'll let him
comment on that.

> I wonder if it would be better to allow setting huge_tlb_pages=try even on
> platforms that don't have hugepages. It would simply mean the same as 'off'
> on such platforms.

I wouldn't argue against that.
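
For illustration, a rough sketch of what 'try' semantics could look like, again not the patch itself. The function name map_shared_try_huge() and its parameters are invented here, and the hugepage size is passed in by the caller as an assumption (how the patch detects it is not shown in this mail). The length is rounded to a multiple of the hugepage size before the MAP_HUGETLB attempt, and on failure, or on platforms where MAP_HUGETLB is not defined, it falls back to ordinary pages, which is the same as 'off'.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE         /* for MAP_ANONYMOUS / MAP_HUGETLB on glibc */
#endif
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Round size up to the next multiple of align; align must be > 0. */
static size_t
align_up(size_t size, size_t align)
{
    size_t      rem = size % align;

    return (rem == 0) ? size : size + (align - rem);
}

/*
 * Try to map with huge pages first, then fall back to normal pages.
 * hugepagesize is supplied by the caller (assumption of this sketch).
 * *actual receives the mapped length, *used_huge is 1 if the huge page
 * mapping succeeded and 0 otherwise. Returns MAP_FAILED on error.
 */
static void *
map_shared_try_huge(size_t request, size_t hugepagesize,
                    size_t *actual, int *used_huge)
{
    void       *ptr;
    long        pagesize;

    *used_huge = 0;

#ifdef MAP_HUGETLB
    if (hugepagesize > 0)
    {
        /* The length must be a multiple of the hugepage size. */
        *actual = align_up(request, hugepagesize);
        ptr = mmap(NULL, *actual, PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
        if (ptr != MAP_FAILED)
        {
            *used_huge = 1;
            return ptr;
        }
    }
#else
    (void) hugepagesize;        /* no huge page support: behave like 'off' */
#endif

    /* Fallback: ordinary pages, length rounded to the normal page size. */
    pagesize = sysconf(_SC_PAGE_SIZE);
    if (pagesize <= 0)
        return MAP_FAILED;

    *actual = align_up(request, (size_t) pagesize);
    return mmap(NULL, *actual, PROT_READ | PROT_WRITE,
                MAP_SHARED | MAP_ANONYMOUS, -1, 0);
}
```

With this shape, a build without MAP_HUGETLB compiles only the fallback path, so accepting 'try' there costs nothing.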

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
