
ext4 finally doing the right thing

From: Greg Smith <greg(at)2ndquadrant(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: ext4 finally doing the right thing
Date: 2010-01-16 03:05:49
Message-ID: 4B512D0D.4030909@2ndquadrant.com
Lists: pgsql-performance
A few months ago the worst of the bugs in the ext4 fsync code started 
clearing up, with the fix in 
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=5f3481e9a80c240f169b36ea886e2325b9aeb745 
addressing a particularly painful one.  That fix made it into the 2.6.32 kernel 
released last month.  Some interesting benchmark news today suggests a 
version of ext4 that might actually work for databases is showing up in 
early packaged distributions:

http://www.phoronix.com/scan.php?page=article&item=ubuntu_lucid_alpha2&num=3

Along with it comes the massive performance drop that a working fsync 
brings.  See 
http://www.phoronix.com/scan.php?page=article&item=linux_perf_regressions&num=2 
for background about this topic from when the issue was discovered:

"[This change] is required for safe behavior with volatile write caches 
on drives.  You could mount with -o nobarrier and [the performance drop] 
would go away, but a sequence like write->fsync->lose power->reboot may 
well find your file without the data that you synced, if the drive had 
write caches enabled.  If you know you have no write cache, or that it 
is safely battery backed, then you can mount with -o nobarrier, and not 
incur this penalty."
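
For anyone who wants to see the write->fsync sequence from that quote 
spelled out, here is a minimal sketch in Python.  The function name 
durable_append and the file it writes to are hypothetical, made up for 
illustration; os.write and os.fsync are the standard library calls.  
Whether the data really survives a power loss once fsync returns still 
depends on the filesystem and the drive honoring the flush, which is the 
whole point of the quote above.

```python
import os


def durable_append(path, payload):
    """Append payload to path, then ask the OS to flush it to stable storage.

    Hypothetical helper for illustration only. Returns the number of
    bytes written.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        view = memoryview(payload)
        total = 0
        # os.write may write fewer bytes than requested, so loop until done.
        while total < len(view):
            total += os.write(fd, view[total:])
        # Block until the kernel reports the data as flushed. With barriers
        # disabled and a volatile drive cache, this can return before the
        # data is actually on the platter.
        os.fsync(fd)
        return total
    finally:
        os.close(fd)
```

A database commit is roughly this pattern applied to the write-ahead log: 
the commit is not acknowledged until the fsync call returns.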

The pgbench TPS figure Phoronix has been reporting has always been a 
fictitious one resulting from unsafe write caching.  With 2.6.32 
released with ext4 defaulting to proper behavior on fsync, that's going 
to make for a very interesting change.  On one side, we might finally be 
able to use regular drives with their caches turned on safely, taking 
advantage of the cache for other writes while doing the right thing with 
the database writes.  On the other, anyone who believed the fictitious 
numbers before is going to be in for a rude surprise and think there's a 
massive regression here.  There's some potential for this to show 
PostgreSQL in a bad light, when people discover they can really only get 
~100 commits/second out of cheap hard drives and assume the database is 
to blame.  Interesting times.
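
The ~100 commits/second figure follows from rotation speed: with a single 
client committing, each commit has to wait for its write to reach the 
platter, and the drive passes over a given spot at most once per 
rotation.  Below is a rough sketch of that arithmetic, plus a small 
timing loop for checking what a given disk actually does.  Both function 
names are hypothetical, and the one-commit-per-rotation model is a 
simplifying assumption that ignores seeks and commits from several 
clients being grouped together.

```python
import os
import time


def max_commits_per_second(rpm):
    """Upper bound for one client, assuming one commit per platter rotation."""
    return rpm / 60.0


def measure_fsync_rate(path, count=200, block=b"x" * 8192):
    """Rewrite the same block and fsync it count times; return fsyncs/second.

    Hypothetical probe for illustration; it creates or truncates path.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        start = time.perf_counter()
        for _ in range(count):
            # Overwrite the same location each time, then force it out.
            os.lseek(fd, 0, os.SEEK_SET)
            os.write(fd, block)
            os.fsync(fd)
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    # Guard against a zero interval on very coarse clocks.
    return count / elapsed if elapsed > 0 else float("inf")
```

For a 7200 RPM drive max_commits_per_second gives 120.  If 
measure_fsync_rate reports thousands per second on a single rotating 
drive with no battery-backed cache, that suggests a write cache is 
absorbing the fsync rather than the data reaching the platter.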

-- 
Greg Smith    2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
greg(at)2ndQuadrant(dot)com  www.2ndQuadrant.co

