Re: SSD + RAID

From: Ron Mayer <rm_pg(at)cheapcomplexdevices(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Greg Smith <greg(at)2ndquadrant(dot)com>, Ron Mayer <rm_pg(at)cheapcomplexdevices(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Brad Nicholson <bnichols(at)ca(dot)afilias(dot)info>, Karl Denninger <karl(at)denninger(dot)net>, Laszlo Nagy <gandalf(at)shopzeus(dot)com>, pgsql-performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: SSD + RAID
Date: 2009-11-30 16:32:34
Message-ID: 4B13F3A2.6040300@cheapcomplexdevices.com
Lists: pgsql-performance

Bruce Momjian wrote:
> Greg Smith wrote:
>> Bruce Momjian wrote:
>>> I thought our only problem was testing the I/O subsystem --- I never
>>> suspected the file system might lie too. That email indicates that a
>>> large percentage of our install base is running on unreliable file
>>> systems --- why have I not heard about this before?
>>>
>> The reason why it
>> doesn't bite more people is that most Linux systems don't turn on write
>> barrier support by default, and there's a number of situations that can
>> disable barriers even if you did try to enable them. It's still pretty
>> unusual to have a working system with barriers turned on nowadays; I
>> really doubt it's "a large percentage of our install base".
>
> Ah, so it is only when write barriers are enabled, and they are not
> enabled by default --- OK, that makes sense.

The test program I linked up-thread shows that, on an out-of-the-box
Ubuntu 9.10 install using ext3 on a straight-from-Dell system, fsync
does nothing unless the inode is touched.

Surely that's a common config, no?

If I uncomment the fchmod line below, I can see that even with ext3
and write caches enabled on my drives, fsync does indeed wait.
Note that ext4 doesn't show the problem on the same system.

Here's a slightly modified test program that's a bit easier to run.
If you run the program and it exits right away, your system isn't
waiting for platters to spin.

////////////////////////////////////////////////////////////////////
/*
** based on http://article.gmane.org/gmane.linux.file-systems/21373
** http://thread.gmane.org/gmane.linux.kernel/646040
** If this program returns instantly, the fsync() lied.
** If it takes a second or so, fsync() probably works.
** On ext3 and drives that cache writes, you probably need
** to uncomment the fchmod's to make fsync work right.
*/
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>

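int main(int argc, char *argv[]) {
    if (argc < 2) {
        printf("usage: fs <filename>\n");
        exit(1);
    }
    int fd = open(argv[1], O_RDWR | O_CREAT | O_TRUNC, 0666);
    if (fd < 0) {
        perror("open");
        exit(1);
    }
    char byte = 0;  /* initialized so we never write an indeterminate value */
    int i;
    for (i = 0; i < 100; i++) {
        pwrite(fd, &byte, 1, 0);
        /* Uncomment to touch the inode before each fsync: */
        // fchmod (fd, 0644); fchmod (fd, 0664);
        fsync(fd);
    }
    close(fd);
    return 0;
}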
////////////////////////////////////////////////////////////////////
ron(at)ron-desktop:/tmp$ /usr/bin/time ./a.out foo
0.00user 0.00system 0:00.01elapsed 21%CPU (0avgtext+0avgdata 0maxresident)k
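
For anyone who would rather not depend on /usr/bin/time, here is a rough
sketch of the same loop wrapped in a function that measures its own
elapsed time. The function name time_fsync_loop and its parameters are
made up for illustration and are not part of the program above; it uses
gettimeofday(), so the result is wall-clock time and only approximate.

```c
/* Hypothetical helper (not from the original test program): times
** `iterations` write+fsync cycles on `path` and returns the elapsed
** wall-clock seconds, or -1.0 if open, pwrite or fsync fails.
** If touch_inode is nonzero, the file mode is flipped before each
** fsync, which is the fchmod workaround described above. */
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/time.h>
#include <fcntl.h>
#include <unistd.h>

double time_fsync_loop(const char *path, int iterations, int touch_inode) {
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0666);
    if (fd < 0)
        return -1.0;

    struct timeval start, end;
    char byte = 0;
    int i;

    gettimeofday(&start, NULL);
    for (i = 0; i < iterations; i++) {
        if (pwrite(fd, &byte, 1, 0) != 1) {
            close(fd);
            return -1.0;
        }
        if (touch_inode) {
            /* Change the mode and change it back so the inode is dirtied. */
            fchmod(fd, 0644);
            fchmod(fd, 0664);
        }
        if (fsync(fd) != 0) {
            close(fd);
            return -1.0;
        }
    }
    gettimeofday(&end, NULL);
    close(fd);

    return (double)(end.tv_sec - start.tv_sec)
         + (double)(end.tv_usec - start.tv_usec) / 1e6;
}
```

Calling it once with touch_inode set to 0 and once with 1 on the same
file gives the two timings to compare. If the first returns almost
immediately and the second takes noticeably longer, that would match
the behaviour I described for ext3.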
