From: Jesper Pedersen <jesper(dot)pedersen(at)redhat(dot)com>
To: Sokolov Yura <funny(dot)falcon(at)postgrespro(dot)ru>
Cc: pgsql-hackers(at)postgresql(dot)org, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: Fix performance degradation of contended LWLock on NUMA
Date: 2017-09-08 15:33:35
Message-ID: be2d7380-e80d-f8f3-8eab-7088c11eb5d2@redhat.com
Lists: pgsql-hackers

Hi,

On 07/18/2017 01:20 PM, Sokolov Yura wrote:
> I'm sending rebased version with couple of one-line tweaks.
> (less skip_wait_list on shared lock, and don't update spin-stat on
> aquiring)
>

I have been running this patch on a 2S/28C/56T/256GB machine w/ 2 x RAID10
SSD setup (1 to 375 clients on logged tables).

I'm seeing

-M prepared: Up to 11% improvement
-M prepared -S: No improvement, no regression ("noise")
-M prepared -N: Up to 12% improvement

For all runs the improvement shows up as you get close to the number of
CPU threads, or above it. Although I'm not seeing the same improvements
as you at very large client counts, there are definitely improvements :)

Some comments:
==============

lwlock.c:
---------
+ * This race is avoiding by taking lock for wait list using CAS with a old
+ * state value, so it could succeed only if lock is still held. Necessary flag
+ * is set with same CAS.
+ *
+ * Inside LWLockConflictsWithVar wait list lock is reused to protect variable
+ * value. So first it is acquired to check variable value, then flags are
+ * set with second CAS before queueing to wait list to ensure lock were not
+ * released yet.

* This race is avoided by taking the wait list lock using CAS with the old
* state value, so it can succeed only if the lock is still held. The
* necessary flag is set using the same CAS.
*
* Inside LWLockConflictsWithVar the wait list lock is reused to protect the
* variable value. First the lock is acquired to check the variable value,
* then flags are set with a second CAS before queuing to the wait list, to
* ensure that the lock has not been released yet.

+static void
+add_lwlock_stats_spin_stat(LWLock *lock, SpinDelayStatus *delayStatus)

Add a method description.

+ /*
+ * This barrier is never needed for correctness, and it is no-op
+ * on x86. But probably first iteration of cas loop in
+ * ProcArrayGroupClearXid will succeed oftener with it.
+ */

* "more often"

+static inline bool
+LWLockAttemptLockOrQueueSelf(LWLock *lock, LWLockMode mode, LWLockMode waitmode)

I'll leave it to the Committer to decide if this method is too big to be
"inline".

+ /*
+ * We intentionally do not call finish_spin_delay here, cause loop above
+ * usually finished in queueing into wait list on contention, and doesn't
+ * reach spins_per_delay (and so, doesn't sleep inside of
+ * perform_spin_delay). Also, different LWLocks has very different
+ * contention pattern, and it is wrong to update spin-lock statistic based
+ * on LWLock contention.
+ */

/*
 * We intentionally do not call finish_spin_delay here, because the loop
 * above usually finishes by queuing into the wait list on contention, and
 * doesn't reach spins_per_delay (and therefore doesn't sleep inside
 * perform_spin_delay). Also, different LWLocks have very different
 * contention patterns, and it is wrong to update spin-lock statistics
 * based on LWLock contention.
 */

s_lock.c:
---------
+ if (status->spins == 0)
+ /* but we didn't spin either, so ignore */
+ return;

Use { } for the if, or move the comment out of the nesting, for readability.

Open questions:
---------------
* spins_per_delay as extern
* Calculation of skip_wait_list

You could run the patch through pgindent too.

Passes make check-world.

Status: Waiting on Author

Best regards,
Jesper
