Re: Shared Memory: How to use SYSV rather than MMAP ?

From: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: tony(dot)reix(at)atos(dot)net, Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, sylvie(dot)empereur-mot(at)atos(dot)net
Subject: Re: Shared Memory: How to use SYSV rather than MMAP ?
Date: 2018-11-20 21:54:26
Message-ID: CAEepm=3j7b8H3Rd-e=Wcf-Qsm6hF8ZSEX_Z4dHiAhZhKJGOEQA@mail.gmail.com
Lists: pgsql-hackers

On Wed, Nov 21, 2018 at 9:07 AM Andres Freund <andres(at)anarazel(dot)de> wrote:
> On 2018-11-21 09:00:58 +1300, Thomas Munro wrote:
> > On Wed, Nov 21, 2018 at 4:37 AM REIX, Tony <tony(dot)reix(at)atos(dot)net> wrote:
> > > YES ! Reading this file, your suggestion should work ! Thx !
> > >
> > > I've rebuilt and run the basic tests. We'll relaunch our tests asap.
> >
> > I would be surprised if that makes a difference:
> > anonymous-mmap-then-fork and SysV shm are just two different ways to
> > exchange mappings between processes, but I'd expect the virtual memory
> > object itself to be basically the same, in terms of constraints that
> > might affect page size at least.
>
> I don't think that's true on many systems, FWIW. On linux there's
> certainly different behaviour, and e.g. the way to get hugepages for
> anon-mmap and SysV shmem aren't the same.

Right, when asking for them explicitly the API is different
(SHM_HUGETLB flag to shmget(), MAP_HUGETLB flag to mmap()). Actually I
was expecting AIX to be more like FreeBSD and Solaris, where you don't
do that and the OS just decides what page size to give you, but after
some quality time with Google I now see that it's more like Linux in
the SysV case... there is an explicit flag:

https://www.ibm.com/support/knowledgecenter/en/ssw_aix_71/com.ibm.aix.performance/large_pages_shared_mem_segs.htm

You also need some special privileges:

https://www.ibm.com/support/knowledgecenter/en/ssw_aix_71/com.ibm.aix.performance/large_page_ovw.htm

As for the main shared buffers area using anon-mmap, I wonder if it
would automagically use large pages if you have the privileges and set
the LDR_CNTRL environment variable (or the equivalent XCOFF header for
the binary):

https://www.ibm.com/support/knowledgecenter/en/ssw_aix_71/com.ibm.aix.performance/set_env_variable_lpages.htm
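
To make the Linux side of the API difference above concrete, here is a
minimal sketch in C. It is not PostgreSQL's code: the helper names
anon_shared_alloc() and sysv_shared_alloc() are made up for illustration,
and the fallback to ordinary pages is an assumption about what a caller
would want when no huge pages are reserved or the process lacks the
privilege to use them. The AIX flag is deliberately not shown.

```c
#define _GNU_SOURCE             /* exposes MAP_HUGETLB / SHM_HUGETLB on glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/ipc.h>
#include <sys/mman.h>
#include <sys/shm.h>

/*
 * Hypothetical helper: anonymous shared mapping, to be inherited by
 * children across fork().  Huge pages are requested with MAP_HUGETLB
 * first; if that fails we retry with ordinary pages.
 */
static void *
anon_shared_alloc(size_t size, int *got_huge)
{
    void *p = MAP_FAILED;

    assert(size > 0);
    *got_huge = 0;
#ifdef MAP_HUGETLB
    p = mmap(NULL, size, PROT_READ | PROT_WRITE,
             MAP_SHARED | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
    if (p != MAP_FAILED)
        *got_huge = 1;
#endif
    if (p == MAP_FAILED)
        p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                 MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    return (p == MAP_FAILED) ? NULL : p;
}

/*
 * Hypothetical helper: SysV segment.  Here the huge page request is a
 * flag to shmget(), SHM_HUGETLB, again with a fallback to ordinary pages.
 */
static void *
sysv_shared_alloc(size_t size, int *got_huge)
{
    int   shmid = -1;
    void *p;

    assert(size > 0);
    *got_huge = 0;
#ifdef SHM_HUGETLB
    shmid = shmget(IPC_PRIVATE, size, IPC_CREAT | 0600 | SHM_HUGETLB);
    if (shmid >= 0)
        *got_huge = 1;
#endif
    if (shmid < 0)
        shmid = shmget(IPC_PRIVATE, size, IPC_CREAT | 0600);
    if (shmid < 0)
        return NULL;

    p = shmat(shmid, NULL, 0);
    /* Mark for removal now; the segment goes away after the last detach. */
    shmctl(shmid, IPC_RMID, NULL);
    return (p == (void *) -1) ? NULL : p;
}
```

Either helper returns NULL on failure and reports through got_huge
whether the huge page request was honoured, e.g.
"buf = anon_shared_alloc(size, &huge);". In both cases the kernel
rounds the length up to the huge page size when the huge page flag is
accepted.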

> [1] strongly suggests that
> that's not the case on FreeBSD either (with sysv shmem being
> better). I'd attached a patch to implement a GUC to allow users to
> choose the shmem implementation back then [2].

Surprising. I'd like to know if that's still true. SysV shm is not
nice, and if there is anything accidentally better about its
performance, I'd love to know what. That report (slightly) predates
this work (the two may be causally linked), which fixed various VM
scalability problems hit by PostgreSQL:
http://www.kib.kiev.ua/kib/pgsql_perf_v2.0.pdf

--
Thomas Munro
http://www.enterprisedb.com
