Re: SSD + RAID

From: Ron Mayer <rm_pg(at)cheapcomplexdevices(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Ron Mayer <rm_pg(at)cheapcomplexdevices(dot)com>, Greg Smith <greg(at)2ndquadrant(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Brad Nicholson <bnichols(at)ca(dot)afilias(dot)info>, Karl Denninger <karl(at)denninger(dot)net>, Laszlo Nagy <gandalf(at)shopzeus(dot)com>, pgsql-performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: SSD + RAID
Date: 2009-11-30 15:48:32
Message-ID: 4B13E950.1060001@cheapcomplexdevices.com
Lists: pgsql-performance

Bruce Momjian wrote:
>> For example, ext3 fsync() will issue write barrier commands
>> if the inode was modified; but not if the inode wasn't.
>>
>> See test program here:
>> http://www.mail-archive.com/linux-kernel(at)vger(dot)kernel(dot)org/msg272253.html
>> and read two paragraphs further to see how touching
>> the inode makes ext3 fsync behave differently.
>
> I thought our only problem was testing the I/O subsystem --- I never
> suspected the file system might lie too. That email indicates that a
> large percentage of our install base is running on unreliable file
> systems --- why have I not heard about this before?

It came up on these lists a few times in the past. Here's one example:
http://archives.postgresql.org/pgsql-performance/2008-08/msg00159.php

As far as I can tell, most of the threads ended with people still
suspecting lying hard drives. But to the best of my ability I can't
find any drives that actually lie when sent the commands to flush
their caches. What I do find are various combinations of ext3 & Linux
MD that decide not to send IDE FLUSH_CACHE_EXT (nor the similar
SCSI SYNCHRONIZE CACHE command) in various situations.

I wonder if there are enough ext3 users out there that postgres should
touch the inodes before doing a fsync.
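
To make the "touch the inode, then fsync" idea concrete, here is a minimal
sketch in C. The helper name touch_then_fsync is hypothetical and this is
not PostgreSQL code. It assumes that updating the file's timestamps with
futimens() is enough to mark the inode as modified; whether that actually
makes a given ext3 kernel issue a barrier is an assumption that would need
checking with something like the test program linked above. On a file
system with coarse timestamp granularity, two touches within the same
tick may leave the inode unchanged.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for futimens() and fsync() */
#endif
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper (illustration only, not PostgreSQL code):
 * modify the inode before fsync() so that a file system which only
 * flushes the drive cache for a modified inode has a reason to do so.
 * Returns 0 on success, -1 on error with errno set.
 */
int touch_then_fsync(int fd)
{
    /* A NULL times argument sets atime and mtime to the current time. */
    if (futimens(fd, NULL) != 0)
        return -1;

    /* Flush file data and the (now modified) inode to stable storage. */
    return fsync(fd);
}
```

A caller would use this in place of a bare fsync(fd) on the files it
needs to be durable. The cost is an extra inode update on every sync.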

> Do the write barriers allow data loss but prevent data inconsistency?

If I understand right, data inconsistency could occur too. One
aspect of the write barriers is flushing a hard drive's caches.

> It sound like they are effectively running with synchronous_commit = off.

And with the (mythical?) hard drive with lying caches.
