From: | Andrea Suisani <sickpig(at)opinioni(dot)net> |
---|---|
To: | Kevin Grittner <kgrittn(at)gmail(dot)com> |
Cc: | "<pgsql-hackers(at)postgresql(dot)org>" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: OS scheduler bugs affecting high-concurrency contention |
Date: | 2016-04-19 09:25:48 |
Message-ID: | 5715F99C.4080000@opinioni.net |
Lists: | pgsql-hackers |
Hi,
On 04/16/2016 04:15 PM, Kevin Grittner wrote:
> There is a paper that anyone interested in performance at high
> concurrency, especially in Linux, should read[1]. While doing
> other work, a group of researchers saw behavior that they suspected
> was due to scheduler bugs in Linux. There were no tools that made
> proving that practical, so they developed such a tool set and used
> it to find four bugs in the Linux kernel which were introduced in
> these releases, have not yet been fixed, and have the following
> maximum impact when running NAS benchmarks, based on running with
> and without the researchers' fixes for the bugs:
>
> 2.6.32: 22%
> 2.6.38: 13x
> 3.9: 27x
> 3.19: 138x
>
> That's right -- one of these OS scheduler bugs in production
> versions of Linux can make one of NASA's benchmarks run for 138
> times as long as it does without the bug. I don't feel that I can
> interpret the results of any high-concurrency benchmarks in a
> meaningful way without knowing which of these bugs were present in
> the OS used for the benchmark. Just as an example, it is helpful
> to know that the benchmarks Andres presented were run on 3.16, so
> it would have three of these OS bugs affecting results, but not the
> most severe one. I encourage you to read the paper and draw your
> own conclusions.
>
> Anyway, please don't confuse this thread with the one on the
> "snapshot too old" patch -- I am still working on that and will
> post results there when they are ready.
Thanks for the link, appreciated.
On a slightly related topic, Jens Axboe proposed a patchset [1]
to improve the performance of background buffered writeback.
LWN.net recently published an article about the issue [2].
Maybe this work could mitigate the problem PostgreSQL users experience
while the checkpoint process flushes all pending changes to disk and
recycles the transaction logs.
--
Andrea Suisani
suisani(at)opinioni(dot)net
Demetra opinioni.net srl
[1] "[PATCHSET v3][RFC] Make background writeback not suck"
http://thread.gmane.org/gmane.linux.kernel/2186732
[2] "Toward less-annoying background writeback"
https://lwn.net/SubscriberLink/682582/93d9e5b6bed03a32/