RE: Shared Memory: How to use SYSV rather than MMAP ?

From: "REIX, Tony" <tony(dot)reix(at)atos(dot)net>
To: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, "EMPEREUR-MOT, SYLVIE" <sylvie(dot)empereur-mot(at)atos(dot)net>, "BERGAMINI, DAMIEN" <damien(dot)bergamini(at)atos(dot)net>
Subject: RE: Shared Memory: How to use SYSV rather than MMAP ?
Date: 2018-11-26 11:05:13
Message-ID: HE1PR0202MB28123BADEDBA7E9005B7070E86D70@HE1PR0202MB2812.eurprd02.prod.outlook.com
Lists: pgsql-hackers

Hi Thomas,

On reliability: I have compiled and tested with both GCC and XLC on two machines to check that my patches are OK (no impact on the PostgreSQL tests, with either GCC or XLC).

We do not yet have a performance comparison between GCC and XLC: although we experimented with both, we moved from v11beta1 to beta4 to 11.0 and now to 11.1 along the way. We'll do it ASAP.

On performance: we have compared MMAP (4KB pages) against SysV (64KB pages) shared memory in depth, for both the dynamic and the main shared memory segments, in exactly the SAME HW + SW environment, using XLC -O2 + tune=pwr9.

We have not yet experimented with Large Pages (16MB). However, the flags added to the third parameter of shmget() are said to have no impact on performance unless Large Pages are actually in use.
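For reference, here is a minimal sketch of what "flags added to the third parameter of shmget()" could look like. It is not the patch itself: the function names shm_flags() and create_segment() are made up for illustration, and the SHM_LGPAGE and SHM_PIN bits (the ones named in the quoted message below) are guarded so that the code also builds on systems that lack them.

```c
#include <stddef.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/stat.h>

/*
 * Build the third argument of shmget(). Hypothetical helper: the
 * large-page bits are only added where the platform defines them.
 */
int
shm_flags(int want_large_pages)
{
    int flags = IPC_CREAT | IPC_EXCL | S_IRUSR | S_IWUSR;

#if defined(SHM_LGPAGE) && defined(SHM_PIN)
    if (want_large_pages)
        flags |= SHM_LGPAGE | SHM_PIN;
#else
    (void) want_large_pages;    /* no large-page flags on this platform */
#endif
    return flags;
}

/* Create a private SysV segment; returns the shmid, or -1 on failure. */
int
create_segment(size_t size, int want_large_pages)
{
    return shmget(IPC_PRIVATE, size, shm_flags(want_large_pages));
}
```

With want_large_pages set to 0 the call is an ordinary shmget(); the extra bits only matter where the platform defines them.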

Same with Huge Pages (16GB). We'll study this later.

So the +37% improvement (the maximum seen; +29% on average) is the result of a single change: MMAP 4K to SysV 64K.

(This improvement is due to two things: mmap on AIX has performance drawbacks compared with SysV shared memory, and 64K versus 4K pages.)
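To make the two variants being compared concrete, here is a rough sketch of each allocation path in plain POSIX/SysV calls. This is only an illustration under my own assumptions, not PostgreSQL's code: the helper names alloc_mmap(), alloc_sysv() and release_sysv() are hypothetical, and nothing here selects a page size, which on AIX is decided outside these calls.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE         /* exposes MAP_ANONYMOUS on glibc in strict modes */
#endif
#include <stddef.h>
#include <sys/ipc.h>
#include <sys/mman.h>
#include <sys/shm.h>
#include <sys/stat.h>

/* "MMAP" variant: anonymous shared mapping. Returns NULL on failure. */
void *
alloc_mmap(size_t size)
{
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_ANONYMOUS, -1, 0);

    return (p == MAP_FAILED) ? NULL : p;
}

/* "SysV" variant: shmget() + shmat(). Returns NULL on failure. */
void *
alloc_sysv(size_t size, int *shmid_out)
{
    int   shmid = shmget(IPC_PRIVATE, size,
                         IPC_CREAT | IPC_EXCL | S_IRUSR | S_IWUSR);
    void *p;

    if (shmid < 0)
        return NULL;
    p = shmat(shmid, NULL, 0);
    if (p == (void *) -1)
    {
        shmctl(shmid, IPC_RMID, NULL);  /* do not leak the segment */
        return NULL;
    }
    *shmid_out = shmid;
    return p;
}

/* Detach and remove a segment obtained from alloc_sysv(). */
void
release_sysv(void *p, int shmid)
{
    shmdt(p);
    shmctl(shmid, IPC_RMID, NULL);
}
```

A caller would pair alloc_mmap() with munmap(), and alloc_sysv() with release_sysv().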

That is for 64-bit only, on AIX 7.2 only. We have not made any measurements for 32-bit.

We'll have to discuss your last paragraph in more depth: how to handle this in the PostgreSQL code, and not only for AIX.

Regards,

Tony Reix

tony(dot)reix(at)atos(dot)net

ATOS / Bull SAS
ATOS Expert
IBM Coop Architect & Technical Leader
Office : +33 (0) 4 76 29 72 67
1 rue de Provence - 38432 Échirolles - France
www.atos.net
________________________________
From: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Sent: Friday, November 23, 2018 22:07:23
To: REIX, Tony
Cc: Andres Freund; Robert Haas; Pg Hackers; EMPEREUR-MOT, SYLVIE; BERGAMINI, DAMIEN
Subject: Re: Shared Memory: How to use SYSV rather than MMAP ?

On Sat, Nov 24, 2018 at 4:54 AM REIX, Tony <tony(dot)reix(at)atos(dot)net> wrote:
> Here is a patch for enabling SystemV Shared Memory on AIX, for 64K or bigger page size, rather than using MMAP shared memory, which is slower on AIX.

> We have tested this code with 64K pages and pgbench, on AIX 7.2 TL2 Power 9, and it provided a maximum of +37% improvement.

You also mentioned changing from XLC to GCC. Did you test the various
changes in isolation? XLC->GCC, mmap->shmget, with/without
SHM_LGPAGE. 37% is a bigger performance change than I expected from
large pages, since reports from other architectures are single-digit
percentage increases with pgbench -S.

If just changing to GCC gives you a big speed-up, it could of course
just be different/better code generation (though that'd be a bit sad
for XLC), but I also wonder if the different atomics support in our
tree could be implicated.

> We'll test this code with Large Pages (SHM_LGPAGE | SHM_PIN | S_IRUSR | S_IWUSR flags of shmget() ) ASAP.
>
>
> However, I wanted first to get your comments about this change in order to improve it for acceptance.

I think we should respect the huge_pages GUC, as we do on Linux and
Windows (since there are downsides to using large pages, maybe not
everyone would want that). It could even be useful to allow different
page sizes to be requested by GUC (I see that DB2 has an option to use
16GB pages -- yikes). It also seems like a good idea to have a
shared_memory_type GUC as Andres proposed (see his link), instead of
using a compile time option. I guess it was made a compile time
option because nobody could imagine wanting to go back to SysV shm!
(I'm still kinda surprised that MAP_ANONYMOUS memory can't be coaxed
into large pages by environment variables or loader controls, since
apparently other things like data segments etc. can, though
I can't find any text that says that's the case and I have no AIX
system).
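
As a rough illustration of the run-time settings suggested above (instead of a compile-time option), the sketch below parses two string settings and decides whether the AIX large-page flags would be requested. Everything in it is hypothetical: the enum and function names are invented here, and the accepted shared_memory_type values ("mmap", "sysv") are an assumption. Only the huge_pages values off/on/try follow the existing setting.

```c
#include <string.h>

/* Hypothetical types; not PostgreSQL's actual definitions. */
typedef enum
{
    SHMEM_TYPE_MMAP,
    SHMEM_TYPE_SYSV,
    SHMEM_TYPE_INVALID
} ShmemType;

typedef enum
{
    HUGE_PAGES_OFF,
    HUGE_PAGES_ON,
    HUGE_PAGES_TRY,
    HUGE_PAGES_INVALID
} HugePagesSetting;

/* Map a shared_memory_type string to an enum value (assumed values). */
ShmemType
parse_shared_memory_type(const char *value)
{
    if (strcmp(value, "mmap") == 0)
        return SHMEM_TYPE_MMAP;
    if (strcmp(value, "sysv") == 0)
        return SHMEM_TYPE_SYSV;
    return SHMEM_TYPE_INVALID;
}

/* Map a huge_pages string (off, on, try) to an enum value. */
HugePagesSetting
parse_huge_pages(const char *value)
{
    if (strcmp(value, "off") == 0)
        return HUGE_PAGES_OFF;
    if (strcmp(value, "on") == 0)
        return HUGE_PAGES_ON;
    if (strcmp(value, "try") == 0)
        return HUGE_PAGES_TRY;
    return HUGE_PAGES_INVALID;
}

/*
 * AIX case only: large pages would be requested through shmget() flags,
 * so they need SysV shared memory and a huge_pages setting of on or try.
 */
int
aix_wants_lgpage_flags(ShmemType type, HugePagesSetting hp)
{
    return type == SHMEM_TYPE_SYSV &&
           (hp == HUGE_PAGES_ON || hp == HUGE_PAGES_TRY);
}
```

With such a split, huge_pages=off would keep the plain shmget() flags even when SysV shared memory is selected.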

--
Thomas Munro
http://www.enterprisedb.com
