Re: BUG #13493: pl/pgsql doesn't scale with cpus (PG9.3, 9.4)

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: grb(at)skogoglandskap(dot)no, pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #13493: pl/pgsql doesn't scale with cpus (PG9.3, 9.4)
Date: 2015-07-08 14:22:41
Message-ID: 20150708142241.GQ10242@alap3.anarazel.de
Lists: pgsql-bugs

On 2015-07-08 09:56:51 -0400, Tom Lane wrote:
> Andres Freund <andres(at)anarazel(dot)de> writes:
> > So there's an interesting "dip" between 4 and 8 clients. A perf profile
> > doesn't show any actual lock contention on master. Not that surprising,
> > there shouldn't be any exclusive locks here.
>
> What size of machine are you testing on?

2xE5520 (=> 2 sockets x 4 cores, 8 threads per socket); numa.

(note that I intentionally did not fix the volatility of the function)

> I ran Graeme's tests on a 2-socket, 4-core-per-socket, no-hyperthreading
> machine, which has separate NUMA zones for the 2 sockets. What I saw
> (after fixing the "stable" issue) was that all the 8-client and 16-client
> cases were about 8x faster than 1-client, and 2-client was generally
> within hailing distance of 2x faster, but 4-client was often noticeably
> worse than the expected 4x faster.
>
> I figured this was likely some weird NUMA effect, possibly compounded
> by brutally stupid scheduling on the part of my kernel. But I didn't
> have time to look closer.
>
> You might be seeing the same kind of effect, or something different.
> It's hard to tell without knowing more about your machine.

I think it's likely to be some scheduler effect. The number of cpu
migrations between 4 and 8 is very different:

4:
64,599 context-switches # 0.003 M/sec (100.00%)
172 cpu-migrations # 0.007 K/sec (100.00%)
537 page-faults # 0.023 K/sec

8:
381,383 context-switches # 0.002 M/sec (100.00%)
1,279 cpu-migrations # 0.008 K/sec (100.00%)
3,869 page-faults # 0.024 K/sec

16:
514,426 context-switches # 0.003 M/sec (100.00%)
1,166 cpu-migrations # 0.007 K/sec (100.00%)
6,308 page-faults # 0.039 K/sec

There's a pretty large increase in the number of migrations between 4
and 8, but none between 8 and 16.
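
To put numbers on that comparison, here is a minimal Python sketch that pulls the cpu-migrations counts out of the perf stat lines quoted above and prints the ratios. The helper name parse_counter and the dict layout are illustrative assumptions, not part of perf or of the original test setup, and only raw counts are compared since the run lengths are not stated.

```python
import re

# Counter lines copied from the perf stat output above, keyed by client count.
PERF_LINES = {
    4: "172 cpu-migrations # 0.007 K/sec (100.00%)",
    8: "1,279 cpu-migrations # 0.008 K/sec (100.00%)",
    16: "1,166 cpu-migrations # 0.007 K/sec (100.00%)",
}


def parse_counter(line, event):
    """Return the integer count that precedes `event` on one perf stat line."""
    match = re.match(r"\s*([\d,]+)\s+" + re.escape(event) + r"\b", line)
    if match is None:
        raise ValueError(f"no {event} counter in {line!r}")
    # perf prints thousands separators; strip them before converting.
    return int(match.group(1).replace(",", ""))


migrations = {
    clients: parse_counter(line, "cpu-migrations")
    for clients, line in PERF_LINES.items()
}

ratio_4_to_8 = migrations[8] / migrations[4]
ratio_8_to_16 = migrations[16] / migrations[8]

print(f"4 -> 8 clients:  {ratio_4_to_8:.1f}x")
print(f"8 -> 16 clients: {ratio_8_to_16:.1f}x")
```

With the figures above, the count grows by roughly 7.4x going from 4 to 8 clients and is slightly lower at 16 clients than at 8.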

My guess is that the kernel tries to move around processes to idle nodes
too aggressively.

second-by-second pgbench is quite interesting:
progress: 1.0 s, 22915.3 tps, lat 0.346 ms stddev 0.078
progress: 2.0 s, 15596.8 tps, lat 0.512 ms stddev 0.185
progress: 3.0 s, 15519.2 tps, lat 0.514 ms stddev 0.499
progress: 4.0 s, 15535.7 tps, lat 0.512 ms stddev 0.306
progress: 5.0 s, 15494.3 tps, lat 0.515 ms stddev 0.162

so at -j8 we're routinely much faster in the first second than later.
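
A minimal Python sketch of how the size of that drop could be read off the progress lines. It parses only the five lines quoted above; the regular expression and the function name parse_tps are assumptions made for illustration, not anything pgbench provides.

```python
import re
from statistics import mean

# The per-second progress lines quoted above.
PROGRESS = """\
progress: 1.0 s, 22915.3 tps, lat 0.346 ms stddev 0.078
progress: 2.0 s, 15596.8 tps, lat 0.512 ms stddev 0.185
progress: 3.0 s, 15519.2 tps, lat 0.514 ms stddev 0.499
progress: 4.0 s, 15535.7 tps, lat 0.512 ms stddev 0.306
progress: 5.0 s, 15494.3 tps, lat 0.515 ms stddev 0.162
"""


def parse_tps(text):
    """Return the tps value of every 'progress:' line, in order."""
    pattern = re.compile(r"progress: [\d.]+ s, ([\d.]+) tps")
    return [float(m.group(1)) for m in pattern.finditer(text)]


tps = parse_tps(PROGRESS)
first_second = tps[0]
later_mean = mean(tps[1:])          # average over seconds 2 through 5
drop = 1 - later_mean / first_second

print(f"first second: {first_second:.1f} tps")
print(f"later mean:   {later_mean:.1f} tps")
print(f"drop:         {drop:.1%}")
```

On this sample the later seconds average about 15536.5 tps, roughly a third below the first second.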

Comparing perf stat pgbench -j8 -T 1 and -T 8:
-T 1
46 cpu-migrations
-T 8
534 cpu-migrations
so indeed the number of migrations rises noticeably after the first
second...
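
As a rough back-of-the-envelope check of that, here is a small Python sketch that turns the two counts into per-second rates. It rests on two assumptions that the measurements do not guarantee: that the first second of the -T 8 run behaved like the separate -T 1 run, and that the perf stat wall time is close to the pgbench -T value.

```python
# Counts from the two perf stat runs above (pgbench -j8).
MIGRATIONS_T1 = 46    # -T 1
MIGRATIONS_T8 = 534   # -T 8

# Rate during the first second, taken from the 1-second run.
first_second_rate = MIGRATIONS_T1 / 1

# Average rate over the whole 8-second run.
overall_rate = MIGRATIONS_T8 / 8

# ASSUMPTION: the first second of the 8-second run saw about as many
# migrations as the standalone 1-second run, so the rest is spread
# over the remaining 7 seconds.
later_rate = (MIGRATIONS_T8 - MIGRATIONS_T1) / (8 - 1)

print(f"first second:      {first_second_rate:.1f} migrations/s")
print(f"8 s average:       {overall_rate:.1f} migrations/s")
print(f"seconds 2-8 (est): {later_rate:.1f} migrations/s")
```

Under those assumptions the estimated rate after the first second is about 70 migrations/s, against 46 in the first second.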
