Skylake-S warning

From: Daniel Wood <hexexpert(at)comcast(dot)net>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Skylake-S warning
Date: 2018-10-03 21:29:39
Message-ID: 802677091.158786.1538602180341@connect.xfinity.com
Lists: pgsql-hackers

If you are running benchmarks, or you are a customer currently impacted by GetSnapshotData() on high-end multi-socket systems, be wary of Skylake-S.

Performance differences of nearly 2X can be seen on select-only pgbench, caused by nothing more than an unlucky choice of max_connections. Setup: scale 1000, 192 local clients, on a 2-socket, 48-core Skylake-S (Xeon Platinum 8175M @ 2.50 GHz) system, running pgbench -S.

Results from 5 runs varying max_connections from 400 to 405:

max_conn      TPS
     400   677639
     401  1146776
     402  1122140
     403   765664
     404   671455
     405  1190277

...

perf top shows about 21% of time in GetSnapshotData() with the good numbers and 48% with the bad numbers.

This problem is not seen on a 2-socket, 32-core Haswell system. Being a one-man show, I lack some of the diagnostic tools to drill down further. My suspicion is that Intel's lowering of the L2 associativity from 8 (Haswell) to 4 (Skylake-S) may be the cause. The other possibility is that at higher core counts the shared, inclusive, 16-way associative L3 cache becomes insufficient. Perhaps that is why Intel has moved to an exclusive L3 cache on Skylake-SP.

If this is indeed just disadvantageous placement of structures/arrays in memory, then you might also find that, after an upgrade, a previously good choice for max_connections becomes a bad one if things move around.
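To illustrate why placement and associativity could interact, here is a minimal sketch of how an address maps to a set in a set-associative cache. It assumes plain modulo indexing on 64-byte lines; real hardware may hash address bits and indexes on physical rather than virtual addresses. The function names are hypothetical, and the 256 KB cache size mentioned below is an assumption for illustration, not a figure from this message.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed cache line size in bytes (typical for x86-64). */
#define CACHE_LINE_BYTES 64u

/* Number of sets in a set-associative cache of the given size and ways. */
static uint64_t cache_num_sets(uint64_t cache_bytes, uint64_t ways)
{
    assert(ways > 0 && cache_bytes >= CACHE_LINE_BYTES * ways);
    return cache_bytes / (CACHE_LINE_BYTES * ways);
}

/*
 * Set index an address falls into, assuming simple modulo indexing.
 * Lines that share a set index compete for the same `ways` slots.
 */
static uint64_t cache_set_index(uint64_t addr, uint64_t cache_bytes,
                                uint64_t ways)
{
    return (addr / CACHE_LINE_BYTES) % cache_num_sets(cache_bytes, ways);
}
```

For a hypothetical 256 KB cache, cache_num_sets() gives 512 sets at 8 ways and 1024 sets at 4 ways. With 4 ways, a fifth hot line landing in the same set evicts one of the others, so a small shift in where an array starts can change which lines collide.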

NOTE: The big increase in time in GetSnapshotData() occurs at this line:

    int pgprocno = pgprocnos[index];

The pgprocnos array is largely read-only once all connections are established, easily fits in the L1, and is not next to anything else causing invalidations.
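For readers unfamiliar with the access pattern, the following is a simplified model of a loop that reads a dense index array and then follows each entry into a second array. It is not the PostgreSQL source: XactSlot, collect_active_xids, and the "xid of 0 means idle" convention are hypothetical names and assumptions made for this sketch; only pgprocnos, pgprocno, and index come from the line quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-backend slot; 0 is assumed to mean "no transaction". */
typedef struct { uint32_t xid; } XactSlot;

/*
 * Walk the dense pgprocnos array and follow each entry into all_xacts.
 * Copies every non-zero xid into out_xids and returns how many were copied.
 */
static int collect_active_xids(const int *pgprocnos, int num_procs,
                               const XactSlot *all_xacts, uint32_t *out_xids)
{
    int count = 0;

    assert(pgprocnos != NULL && all_xacts != NULL && out_xids != NULL);
    for (int index = 0; index < num_procs; index++) {
        int pgprocno = pgprocnos[index];        /* the load discussed above */
        uint32_t xid = all_xacts[pgprocno].xid; /* dependent, indirect load */

        if (xid != 0)
            out_xids[count++] = xid;
    }
    return count;
}
```

The second load cannot start until the first one completes, so any extra latency on the pgprocnos read delays every iteration.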

NOTE2: It is unclear why PG needs to support over 64K sessions. At about 10MB per backend (at the low end), the empty backends alone would consume 640GB of memory! Changing pgprocnos from int to short gives me the following results.

max_conn      TPS
     400   780119
     401  1129286
     402  1263093
     403   887021
     404   679891
     405  1218118

While this change is significant on large Skylake systems, it is likely just a trivial improvement on other systems or workloads.
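The arithmetic behind NOTE2 can be checked with a small sketch. It assumes 64-byte cache lines, an array that starts on a line boundary, and 4-byte and 2-byte element widths for int and short; the helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed cache line size in bytes (typical for x86-64). */
#define CACHE_LINE_BYTES 64u

/*
 * Cache lines spanned by an array of n elements of elem_size bytes,
 * assuming the array starts on a cache-line boundary (rounds up).
 */
static size_t cache_lines_spanned(size_t n, size_t elem_size)
{
    assert(elem_size > 0);
    return (n * elem_size + CACHE_LINE_BYTES - 1) / CACHE_LINE_BYTES;
}

/* Memory in GB used by `backends` idle backends at mb_per_backend MB each. */
static uint64_t idle_backend_memory_gb(uint64_t backends,
                                       uint64_t mb_per_backend)
{
    return backends * mb_per_backend / 1024;
}
```

Under these assumptions, 400 entries occupy 25 cache lines as 4-byte values and 13 as 2-byte values, and 65536 backends at 10 MB each come to 640 GB, matching the figure in NOTE2.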
