Re: IDE Drives and fsync

From: Manfred Spraul <manfred(at)colorfullife(dot)com>
To: "scott(dot)marlowe" <scott(dot)marlowe(at)ihs(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: IDE Drives and fsync
Date: 2003-10-08 16:50:21
Message-ID: 3F84404D.8040303@colorfullife.com
Lists: pgsql-hackers

scott.marlowe wrote:

>OK, I've done some more testing on our IDE drive machine.
>
>First, some background. The hard drives we're using are Seagate
>drives, model number ST380023A. Firmware version is 3.33. The machine
>they are in is running RH9. The setup string I'm feeding them on startup
>right now is: hdparm -c3 -f -W1 /dev/hdx
>
>where:
>
>-c3 sets I/O to 32 bit w/sync (uh huh, sure...)
>
The "sync" here has nothing to do with syncing to disk. It means reading from
three magic I/O ports before transferring data to or from the device.

>-f sets the drive to flush buffer cache on exit
>
-f shouldn't have any effect: it means that the buffer cache in the OS
is flushed when hdparm exits; it has no long-term effect on the disk.

>-W1 turns on write caching
>
That's the problem: turning on write caching causes corruption.
What's needed is partial write caching: the write cache stays on, fsync()
sends a barrier to the disk, and fsync() returns only after the disk
reports that the barrier has completed.
I consider that an OS/driver problem, not a problem for postgres.
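
Seen from the application side, the contract described above looks roughly
like the following. This is a minimal Python sketch, not code from postgres;
the helper name durable_write is hypothetical. It shows all an application
can do: write, call fsync(), and trust the result. Whether the data is really
on the platters when os.fsync() returns depends on the OS and the drive
honoring the flush, which is the point at issue here.

```python
import os


def durable_write(path: str, data: bytes) -> None:
    """Hypothetical helper: write data, then ask the OS to flush it.

    os.fsync() only returns once the kernel reports the flush as done.
    If the drive acknowledges writes that still sit in its volatile
    cache, this call cannot detect that.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:
            # os.write may write fewer bytes than requested; loop until done.
            written = os.write(fd, view)
            view = view[written:]
        os.fsync(fd)  # blocks until the kernel considers the file flushed
    finally:
        os.close(fd)
```

With write caching enabled and no barrier support, the fsync() call above can
return while the data is still only in the drive's cache.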

>The drives come up using DMA. Turning unmask IRQ on / off has no effect
>on the tests I've been performing.
>
>
Of course. IRQ unmasking is about interrupt latency when DMA is not used:
with DMA off and IRQ unmasking off, bytes get dropped on serial links.

>Without the -f switch, data corruption due to sudden power down is
>almost certain.
>
It's odd that adding -f reduces the corruption. It probably changes the
available memory, and thus the writeback of data from the kernel to disk.

>Tom, you had mentioned adding a delay of some kind to the fsync logic, and
>I'd be more than willing to try out any patch you'd like to toss out to me
>to see if we can get a semi-stable behaviour out of IDE drives with the
>-W1 and -f switches turned on.
>
I'm not aware that there is any safe delay. Disks with write caches
reorder I/O operations, and some hold back write operations indefinitely.
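
To illustrate why reordering is the real problem, here is a toy Python model,
not a description of any actual drive firmware. The function names and the
write labels "wal_record" and "data_page" are made up for the example. It
enumerates which sets of writes can be on disk if power fails partway through
a flush, first for a cache that commits in submission order and then for one
that may reorder.

```python
from itertools import permutations


def in_order_crash_states(writes):
    """On-disk states possible if the cache commits strictly in order."""
    return {frozenset(writes[:cut]) for cut in range(len(writes) + 1)}


def reordering_crash_states(writes):
    """On-disk states possible if the cache may commit in any order."""
    states = set()
    for order in permutations(writes):
        for cut in range(len(order) + 1):
            # Power fails after `cut` of the queued writes were committed.
            states.add(frozenset(order[:cut]))
    return states


# Illustrative labels: the log record is submitted before the data page.
writes = ["wal_record", "data_page"]
bad_state = frozenset({"data_page"})  # data page on disk, its log record lost

print(bad_state in in_order_crash_states(writes))
print(bad_state in reordering_crash_states(writes))
```

Only the reordering cache can leave the data page on disk without the log
record that was written before it. A fixed delay in fsync() does not remove
that state; it only makes it less likely.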

Unfortunately Linux doesn't implement write barriers, and the support in
some IDE disks is missing, too :-(

--
Manfred
