Re: why roll-your-own s_lock? / improving scalability

From:	Merlin Moncure <mmoncure(at)gmail(dot)com>
To:	Nils Goroll <slink(at)schokola(dot)de>
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: why roll-your-own s_lock? / improving scalability
Date:	2012-06-26 18:46:06
Message-ID:	CAHyXU0zKwJGbGw_oDUKimxYZoNc7q5+fbtiOynM3Cq5ZqzKH9A@mail.gmail.com
Lists:	pgsql-hackers

On Tue, Jun 26, 2012 at 12:02 PM, Nils Goroll <slink(at)schokola(dot)de> wrote:
> Hi,
>
> I am currently trying to understand what looks like really bad scalability of
> 9.1.3 on a 64core 512GB RAM system: the system runs OK when at 30% usr, but only
> marginal amounts of additional load seem to push it to 70% and the application
> becomes highly unresponsive.
>
> My current understanding basically matches the issues being addressed by various
> 9.2 improvements, well summarized in
> http://wiki.postgresql.org/images/e/e8/FOSDEM2012-Multi-CPU-performance-in-9.2.pdf
>
> An additional aspect is that, in order to address the latent risk of data loss &
> corruption with WBCs and async replication, we have deliberately moved the db
> from a similar system with WB cached storage to ssd based storage without a WBC,
> which, by design, has (in the best WBC case) approx. 100x higher latencies, but
> much higher sustained throughput.
>
>
> On the new system, even with 30% user "acceptable" load, oprofile makes apparent
> significant lock contention:
>
> opreport --symbols --merge tgid -l /mnt/db1/hdd/pgsql-9.1/bin/postgres
>
>
> Profiling through timer interrupt
> samples  %        image name    symbol name
> 30240    27.9720  postgres      s_lock
> 5069     4.6888   postgres      GetSnapshotData
> 3743     3.4623   postgres      AllocSetAlloc
> 3167     2.9295   libc-2.12.so  strcoll_l
> 2662     2.4624   postgres      SearchCatCache
> 2495     2.3079   postgres      hash_search_with_hash_value
> 2143     1.9823   postgres      nocachegetattr
> 1860     1.7205   postgres      LWLockAcquire
> 1642     1.5189   postgres      base_yyparse
> 1604     1.4837   libc-2.12.so  __strcmp_sse42
> 1543     1.4273   libc-2.12.so  __strlen_sse42
> 1156     1.0693   libc-2.12.so  memcpy
>
> Unfortunately I don't have profiling data for the high-load / contention
> condition yet, but I fear the picture will be worse and pointing in the same
> direction.
>
> <pure speculation>
> In particular, the _impression_ is that lock contention could also be related to
> I/O latencies making me fear that cases could exist where spin locks are being
> held while blocking on IO.
> </pure speculation>
>
>
> Looking at the code, it appears to me that the roll-your-own s_lock code cannot
> handle a couple of cases, for instance it will also spin when the lock holder is
> not running at all or blocking on IO (which could even be implicit, e.g. for a
> page flush). These issues have long been addressed by adaptive mutexes and futexes.
>
> Also, the s_lock code tries to be somehow adaptive using spins_per_delay (when
> having spun for long (but not blocked), spin even longer in future), which
> appears to me to have the potential of becoming highly counter-productive.
>
>
> Now that the scene is set, here's the simple question: Why all this? Why not
> simply use posix mutexes which, on modern platforms, will map to efficient
> implementations like adaptive mutexes or futexes?

Well, that would introduce a backend dependency on pthreads, which is
unpleasant. Also you'd need to feature test via
_POSIX_THREAD_PROCESS_SHARED to make sure you can mutex between
processes (and configure your mutexes as such when you do). There are
probably other reasons why this can't be done, but I personally don't
know of any.
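
To make the feature test and the "configure your mutexes as such" step
concrete, here is a minimal sketch in C. It is not PostgreSQL code: the
shared_region struct and run_shared_counter() are hypothetical names, the
runtime check via sysconf(_SC_THREAD_PROCESS_SHARED) stands in for whatever
configure-time test a real port would use, and an anonymous shared mmap
stands in for the backend's shared memory segment.

```c
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS on glibc in strict -std modes */
#include <assert.h>
#include <pthread.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical shared region: one mutex plus the counter it protects. */
typedef struct {
    pthread_mutex_t lock;
    long counter;
} shared_region;

/*
 * Fork nprocs children that each increment a shared counter iters times
 * under a process-shared mutex. Returns the final counter value, or -1 if
 * process-shared mutexes are unavailable or any setup step fails.
 */
static long run_shared_counter(int nprocs, long iters)
{
    assert(nprocs > 0 && iters >= 0);

    /* Feature test: can mutexes be shared between processes at all? */
    if (sysconf(_SC_THREAD_PROCESS_SHARED) <= 0)
        return -1;

    /* The mutex must live in memory that every process maps. */
    shared_region *r = mmap(NULL, sizeof *r, PROT_READ | PROT_WRITE,
                            MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (r == MAP_FAILED)
        return -1;
    r->counter = 0;

    /* Mutexes default to process-private, so request sharing explicitly. */
    pthread_mutexattr_t attr;
    if (pthread_mutexattr_init(&attr) != 0) {
        munmap(r, sizeof *r);
        return -1;
    }
    if (pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED) != 0 ||
        pthread_mutex_init(&r->lock, &attr) != 0) {
        pthread_mutexattr_destroy(&attr);
        munmap(r, sizeof *r);
        return -1;
    }
    pthread_mutexattr_destroy(&attr);

    int started = 0;
    for (int i = 0; i < nprocs; i++) {
        pid_t pid = fork();
        if (pid == 0) {
            /* Child: contend for the lock with its siblings. */
            for (long n = 0; n < iters; n++) {
                pthread_mutex_lock(&r->lock);
                r->counter++;
                pthread_mutex_unlock(&r->lock);
            }
            _exit(0);
        }
        if (pid > 0)
            started++;
    }

    /* Parent: reap every child before reading the result. */
    while (wait(NULL) > 0)
        ;

    long result = (started == nprocs) ? r->counter : -1;
    pthread_mutex_destroy(&r->lock);
    munmap(r, sizeof *r);
    return result;
}
```

Calling run_shared_counter(4, 10000) should return 40000 on a platform
that supports process-shared mutexes. Older glibc versions need -pthread
at link time.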

Also, it's forbidden to do things like invoke i/o in the backend while
holding only a spinlock. As to your larger point, it's an interesting
assertion -- some data to back it up would help.
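
For readers following along, the adaptive spinning heuristic described in
the quoted message (spin up to spins_per_delay times, sleep, then adjust
the spin budget depending on whether a sleep was needed) can be sketched
roughly as below. This is a simplified illustration using C11 atomics, not
the actual s_lock.c code: the constants, the fixed 1 ms sleep, and the
names spin_t, spin_acquire, spin_release and next_spins_per_delay are all
assumptions made for the example.

```c
#define _POSIX_C_SOURCE 200809L   /* for nanosleep */
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <time.h>

/* Assumed bounds and step sizes for the spin budget (illustrative only). */
#define MIN_SPINS 10
#define MAX_SPINS 1000

typedef atomic_flag spin_t;

/* Per-process spin budget, adapted after every acquisition. */
static int spins_per_delay = 100;

/*
 * Pure adaptation rule: if the lock was obtained without sleeping, raise
 * the budget quickly; if a sleep was needed, lower it slowly.
 */
static int next_spins_per_delay(int current, int delays)
{
    if (delays == 0) {
        int raised = current + 100;
        return raised > MAX_SPINS ? MAX_SPINS : raised;
    }
    int lowered = current - 1;
    return lowered < MIN_SPINS ? MIN_SPINS : lowered;
}

/* Acquire the lock; returns how many times the caller had to sleep. */
static int spin_acquire(spin_t *lock)
{
    assert(lock != NULL);
    int spins = 0;
    int delays = 0;

    /* Test-and-set returns the previous value: true means already held. */
    while (atomic_flag_test_and_set_explicit(lock, memory_order_acquire)) {
        if (++spins >= spins_per_delay) {
            /* Budget exhausted: back off for 1 ms, then spin again. */
            struct timespec ts = { 0, 1000000L };
            nanosleep(&ts, NULL);
            delays++;
            spins = 0;
        }
    }

    spins_per_delay = next_spins_per_delay(spins_per_delay, delays);
    return delays;
}

static void spin_release(spin_t *lock)
{
    atomic_flag_clear_explicit(lock, memory_order_release);
}
```

Note that nothing in this loop can tell whether the holder is actually
running on a CPU, which is the case the quoted message is worried about.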

merlin
