s_lock() seems too aggressive for machines with many sockets

From: Jan Wieck <jan(at)wi3ck(dot)info>
To: PostgreSQL Development <pgsql-hackers(at)postgresql(dot)org>
Subject: s_lock() seems too aggressive for machines with many sockets
Date: 2015-06-10 13:18:56
Message-ID: 55783940.8080302@wi3ck.info
Lists: pgsql-hackers

Hi,

I think I may have found one of the problems PostgreSQL has on machines
with many NUMA nodes. I am not yet sure what exactly happens on the NUMA
bus, but there seems to be a tipping point at which the spinlock
concurrency wreaks havoc and the performance of the database collapses.

On a machine with 8 sockets and 64 cores (128 threads total with
Hyperthreading), a pgbench -S run peaks at around 85,000 TPS with 50-60
clients. The throughput then takes a very sharp dive, reaching around
20,000 TPS at 120 clients, and never recovers from there.

The attached patch demonstrates that spinning less aggressively and
delaying (much) more often improves the performance "on this type of
machine". With the patch, the 8 socket machine in question scales to
over 350,000 TPS.
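
For readers without the attachment at hand, the general idea of "spin less,
delay more" can be sketched as below. This is not the attached patch and not
the code in s_lock.c; it is a minimal illustration using C11 atomics, and
every name in it (demo_spinlock, demo_lock, DEMO_MIN_DELAY_USEC, the
max_delays cap, the simple doubling of the sleep time) is a hypothetical
choice made for the example. A smaller spins_per_delay makes a waiter stop
hammering the contended cache line sooner and go to sleep instead.

```c
#define _POSIX_C_SOURCE 200809L   /* for nanosleep() */
#include <assert.h>
#include <stdatomic.h>
#include <time.h>

/* Hypothetical names throughout; this does not mirror PostgreSQL's s_lock.c. */
typedef struct
{
    atomic_flag locked;
} demo_spinlock;

#define DEMO_MIN_DELAY_USEC 1000L     /* first sleep: 1 ms (assumed value) */
#define DEMO_MAX_DELAY_USEC 100000L   /* cap each sleep at 100 ms (assumed value) */

static void
demo_sleep_usec(long usec)
{
    struct timespec ts;

    ts.tv_sec = usec / 1000000L;
    ts.tv_nsec = (usec % 1000000L) * 1000L;
    nanosleep(&ts, NULL);
}

/*
 * Try to take the lock. After every spins_per_delay failed test-and-set
 * attempts, sleep instead of continuing to spin, doubling the sleep time
 * up to DEMO_MAX_DELAY_USEC.
 *
 * Returns the number of sleeps that were needed, or -1 if the lock was
 * still held after max_delays sleeps (a give-up cap for this sketch).
 */
static int
demo_lock(demo_spinlock *lock, int spins_per_delay, int max_delays)
{
    int  spins = 0;
    int  delays = 0;
    long cur_delay_usec = DEMO_MIN_DELAY_USEC;

    assert(spins_per_delay > 0);

    while (atomic_flag_test_and_set_explicit(&lock->locked,
                                             memory_order_acquire))
    {
        if (++spins >= spins_per_delay)
        {
            if (delays >= max_delays)
                return -1;              /* give up: lock appears stuck */

            demo_sleep_usec(cur_delay_usec);
            delays++;
            spins = 0;

            cur_delay_usec *= 2;
            if (cur_delay_usec > DEMO_MAX_DELAY_USEC)
                cur_delay_usec = DEMO_MAX_DELAY_USEC;
        }
    }
    return delays;
}

static void
demo_unlock(demo_spinlock *lock)
{
    atomic_flag_clear_explicit(&lock->locked, memory_order_release);
}
```

In this sketch the trade-off described in the post shows up directly in
spins_per_delay: a low value reduces traffic on the lock's cache line when
many waiters pile up, but it also puts waiters to sleep in cases where a
few more spins would have acquired the lock, which is the kind of cost one
would expect to hurt on smaller machines.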

The patch is meant to demonstrate this effect only. It has a negative
performance impact on smaller machines and client counts < #cores, so
the real solution will probably look much different. But I thought it
would be good to share this and start the discussion about reevaluating
the spinlock code before PGCon.

Regards, Jan

--
Jan Wieck
Senior Software Engineer
http://slony.info

Attachment: spins_per_delay.diff (text/x-patch, 1.4 KB)
