Re: kernel version impact on PostgreSQL performance

From: Greg Smith <greg(at)2ndquadrant(dot)com>
To: rodger(at)diaspora(dot)gen(dot)nz
Cc: Cyril Scetbon <cyril(dot)scetbon(at)free(dot)fr>, pgsql-general(at)postgresql(dot)org
Subject: Re: kernel version impact on PostgreSQL performance
Date: 2010-03-09 16:58:58
Message-ID: 4B967E52.8020504@2ndquadrant.com
Lists: pgsql-general

Rodger Donaldson wrote:
> Cyril Scetbon wrote:
>> Does anyone know what can be the differences between linux kernels
>> 2.6.29 and 2.6.30 that can cause this big difference (TPS x 7 !)
>> http://www.phoronix.com/scan.php?page=article&item=linux_2624_2633&num=2
>
> http://www.csamuel.org/2009/04/11/default-ext3-mode-changing-in-2630

Yeah, I realized I answered the wrong question--Cyril wanted to know
"why was 2.6.30 so much faster?", not "why did 2.6.33 get so much
slower?", which is what I was focusing on. There's a good intro to what
happened to speed up 2.6.30 at http://lwn.net/Articles/328363/ , with
the short version being "the kernel stopped caring about data integrity
at all in 2.6.30 by switching to writeback as its default".

To give you an idea how wacky this is, less than a year ago Linus
himself was ranting about how terrible that specific implementation
was: http://lkml.org/lkml/2009/3/24/415 and
http://lkml.org/lkml/2009/3/24/460 . Making it the default exposes a
regression to bad behavior to everyone who upgrades to a newer kernel.

I'm just patiently waiting for Chris Mason (who works for Oracle--they
care about doing the right thing here too) to replace Ted Tso as the
person driving filesystem development in Linux land. That his
"data=guarded" implementation was only partially merged into 2.6.30, and
instead combined with this awful default change, speaks volumes about
how far the Linux development priorities are out of sync (pun intended)
with what database users expect. See
http://www.h-online.com/open/news/item/Kernel-Log-What-s-coming-in-2-6-30-File-systems-New-and-revamped-file-systems-741319.html
for a summary of how that drama played out. I let out a howling laugh
when reading that this was because "The rest have been put on hold, with
the development cycle already entering the stabilisation phase." Linux
kernel development hasn't had a stabilization phase in years.

It's interesting that we have pgbench available as a lens to watch all
this through, because in its TPC-B-like default mode it has an
interesting property: if performance on regular hardware gets too fast,
it means data integrity must be broken, because regular drives can't do
physical commits very often. What Phoronix should be doing is testing
the simple fsync rate using something like sysbench first[1], and if those
numbers come back higher than the drive's rotation rate (RPM / 60 per
second, so about 120 for a 7200 RPM disk), declare the combination
unusable for PostgreSQL purposes rather than reporting on the fake numbers.

[1]
http://www.westnet.com/~gsmith/content/postgresql/pg-benchmarking.pdf ,
page 26
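
As a rough illustration of that sanity check, here is a minimal Python
sketch. It is not sysbench and not what Phoronix runs; the function names,
the 8192-byte block size and the 200-write loop are all assumptions made up
for this example. It also assumes a single client and a plain rotating
drive with no battery-backed write cache, where each commit has to wait
for the platter to come around, so the ceiling is about RPM / 60 commits
per second.

```python
import os
import tempfile
import time


def max_commits_per_second(rpm):
    # Assumption: at best one physical commit per platter rotation.
    return rpm / 60.0


def measure_fsync_rate(directory, writes=200):
    """Write a small block and fsync it `writes` times.

    Returns the observed number of fsync calls per second. The file is
    created in `directory`, so point it at the filesystem under test.
    """
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        block = b"x" * 8192  # hypothetical block size, one 8 kB page
        start = time.perf_counter()
        for _ in range(writes):
            os.write(fd, block)
            os.fsync(fd)
        elapsed = max(time.perf_counter() - start, 1e-9)
    finally:
        os.close(fd)
        os.unlink(path)
    return writes / elapsed


def looks_trustworthy(fsync_rate, rpm):
    # A rate above the rotation ceiling suggests fsync is not really
    # reaching the platters, so benchmark numbers would be suspect.
    return fsync_rate <= max_commits_per_second(rpm)


if __name__ == "__main__":
    rate = measure_fsync_rate(".")
    print("fsync/sec: %.0f" % rate)
    print("plausible for a 7200 RPM drive:", looks_trustworthy(rate, 7200))
```

Run it from a directory on the filesystem being tested. A result far above
the ceiling does not say which layer is lying (filesystem mount options,
drive write cache, or a caching controller), only that the commit numbers
should not be taken at face value.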

--
Greg Smith 2ndQuadrant US Baltimore, MD
PostgreSQL Training, Services and Support
greg(at)2ndQuadrant(dot)com www.2ndQuadrant.us
