Re: Maximum transaction rate

From: Ron Mayer <rm_pg(at)cheapcomplexdevices(dot)com>
To: Marco Colombo <pgsql(at)esiway(dot)net>
Cc: Ron Mayer <rm_pg(at)cheapcomplexdevices(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: Maximum transaction rate
Date: 2009-03-19 19:10:14
Message-ID: 49C29896.1020607@cheapcomplexdevices.com
Lists: pgsql-general

Marco Colombo wrote:
> Yes, but we knew it already, didn't we? It's always been like
> that, with IDE disks and write-back cache enabled, fsync just
> waits for the disk reporting completion and disks lie about

I've looked hard, and I have yet to see a disk that lies.

ext3, OTOH, seems to lie.

IDE drives happily report whether they support write barriers
or not, which you can see with the command:
%hdparm -I /dev/hdf | grep FLUSH_CACHE_EXT
I've tested about a dozen drives, and I've never seen one
that claims to support flushing but doesn't. And I haven't
seen one lacking support that was made less than half a
decade ago. IIRC, ATA-5 specs from 2000 made supporting
this mandatory.

Linux kernels since 2005 or so check for this feature. It'll
happily tell you which of your devices don't support it.
%dmesg | grep 'disabling barriers'
JBD: barrier-based sync failed on md1 - disabling barriers
And for devices that do, it will happily send IDE FLUSH CACHE
commands to IDE drives that support the feature. At the same
time, Linux kernels started sending the very similar SCSI
SYNCHRONIZE CACHE commands.

> Anyway, it's the block device job to control disk caches. A
> filesystem is just a client to the block device, it posts a
> flush request, what happens depends on the block device code.
> The FS doesn't talk to disks directly. And a write barrier is
> not a flush request, is a "please do not reorder" request.
> On fsync(), ext3 issues a flush request to the block device,
> that's all it's expected to do.

But AFAICT ext3's fsync() only tells the block device to
flush disk caches if the inode was changed.

Or, at least empirically, if I modify a file and call
fsync(fd) on ext3, it does not wait for the platter to
rotate to where the data has to land. But if I put
a couple of fchmod()s right before the fsync(), it does.
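
The comparison described above could be reproduced with something
like the following. This is only a rough sketch, not the exact test
that was run: the function name time_fsync, the file name, the block
sizes and the round count are all made up for illustration, and it
assumes a Unix system with Python 3 (os.pwrite and os.fchmod are
Unix-only). It overwrites the same block in place, optionally issues
two fchmod() calls so the inode is dirtied, then calls fsync() and
reports the average time per round.

```python
import os
import tempfile
import time


def time_fsync(path, rounds=50, touch_inode=False):
    """Return the average seconds per overwrite+fsync round on `path`.

    With touch_inode=True, two fchmod() calls precede each fsync()
    so that the inode itself is modified before the sync.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)
    try:
        # Write one 4 KiB block up front so the timed overwrites below
        # do not change the file size.
        os.write(fd, b"\0" * 4096)
        os.fsync(fd)

        start = time.perf_counter()
        for _ in range(rounds):
            os.pwrite(fd, b"x" * 512, 0)   # overwrite in place at offset 0
            if touch_inode:
                os.fchmod(fd, 0o600)       # change the mode ...
                os.fchmod(fd, 0o644)       # ... and change it back
            os.fsync(fd)
        return (time.perf_counter() - start) / rounds
    finally:
        os.close(fd)


if __name__ == "__main__":
    # Hypothetical file name; put it on the filesystem you want to test.
    target = os.path.join(tempfile.gettempdir(), "fsync_probe.dat")
    plain = time_fsync(target)
    touched = time_fsync(target, touch_inode=True)
    print("fsync only:         %.3f ms/round" % (plain * 1e3))
    print("fchmod x2 + fsync:  %.3f ms/round" % (touched * 1e3))
    os.remove(target)
```

For scale, a 7200 rpm disk turns 120 times a second, so one
rotation takes about 8.3 ms. If the plain fsync() rounds come in
far below that on a single spinning disk with write cache enabled,
the data is probably not reaching the platter on every call.
Results will of course vary with the kernel, filesystem, mount
options and hardware.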
