Re: Fwd: Apple Darwin disabled fsync?

From: Peter Bierman <bierman(at)apple(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Fwd: Apple Darwin disabled fsync?
Date: 2005-02-21 02:12:25
Message-ID: a06010200be3eebfde545@[17.202.21.231]
Lists: pgsql-hackers

At 12:38 AM -0500 2/20/05, Tom Lane wrote:
>Dominic Giampaolo <dbg(at)apple(dot)com> writes:
>>> I believe that what the above comment refers to is the fact that
>>> fsync() is not sufficient to guarantee that your data is on stable
>>> storage and on MacOS X we provide a fcntl(), called F_FULLFSYNC,
>>> to ask the drive to flush all buffered data to stable storage.
>
>I've been looking for documentation on this without a lot of luck
>("man fcntl" on OS X 10.3.8 has certainly never heard of it).
>It's not completely clear whether this subsumes fsync() or whether
>you're supposed to fsync() and then use the fcntl.

My understanding is that you're supposed to fsync() and then use the
fcntl, but I'm not the filesystems expert. (Dominic, who wrote the
original message that I forwarded, is.)
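
As a rough illustration of that ordering, here is a minimal C sketch. The helper name flush_to_stable_storage() is hypothetical, and calling fsync() before the fcntl reflects only my understanding stated above, not a documented requirement. The #ifdef guard is an assumption that lets the same code build on systems that do not define F_FULLFSYNC, where it falls back to plain fsync().

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of any Apple or PostgreSQL API.
 * Returns 0 on success, -1 on failure (errno is set by the failing call).
 */
int flush_to_stable_storage(int fd)
{
    /* Step 1: hand this file's dirty kernel buffers to the drive. */
    if (fsync(fd) != 0)
        return -1;

#ifdef F_FULLFSYNC
    /*
     * Step 2 (only where F_FULLFSYNC is defined, e.g. Mac OS X):
     * ask the drive to flush its buffered data to stable storage.
     */
    if (fcntl(fd, F_FULLFSYNC, 0) != 0)
        return -1;
#endif

    return 0;
}
```

A caller would invoke this after write() at the points where it would otherwise call fsync(), and treat a -1 return the same way as an fsync() failure.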

I've filed a bug report asking for better documentation about this to
be placed in the fsync man page. <radar://4012378>

>Also, isn't it fundamentally at the wrong level? One would suppose that
>the drive flush operation is going to affect everything the drive
>currently has queued, not just the one file. That makes it difficult
>if not impossible to use efficiently.

I think the intent is to make the fcntl more precise over time, as
hardware gains the ability to flush more selectively.

One of the advantages Apple has is the ability to set very specific
requirements for our hardware. So if a block-specific flush command
becomes part of the ATA spec, Apple can require vendors to support
it, and support it correctly, before using those drives.

On the other hand, as Dominic described, once the hardware is
external (like a FireWire enclosure), we lose that leverage.

At 12:42 PM -0500 2/20/05, Greg Stark wrote:
>Dominic Giampaolo <dbg(at)apple(dot)com> writes:
>
>> > In most cases you do not need such a heavy handed operation and fsync() is
>> > good enough.
>
>Really? Can you think of a single application for which this definition of
>fsync is useful?
>
>Kernel buffers are transparent to the application, just as the disk buffer is.
>It doesn't matter to an application whether the data is sitting in a kernel
>buffer, or a buffer in the disk, it's equivalent. If fsync doesn't guarantee
>the writes actually end up on non-volatile disk then as far as the application
>is concerned it's just an expensive noop.

I think the intent of fsync() is closer to what you describe, but the
convention is that fsync() hands responsibility to the disk hardware.
That's how every other Unix seems to handle fsync() too. This gives
you good performance, and if you combine a smart fsync()ing
application with reliable storage hardware (like an Xserve RAID that
battery-backs its own write caches), you get the best combination.

If you know you have unreliable hardware and critical reliability
requirements, then you can use the fcntl, which seems to offer more
control than other OSes give.

-pmb
