Re: File Systems Compared

From: Ron Mayer <rm_pg(at)cheapcomplexdevices(dot)com>
To: Bruno Wolff III <bruno(at)wolff(dot)to>
Subject: Re: File Systems Compared
Date: 2006-12-14 21:21:11
Message-ID: 4581C047.8010208@cheapcomplexdevices.com
Lists: pgsql-performance

Bruno Wolff III wrote:
> On Thu, Dec 14, 2006 at 01:39:00 -0500,
> Jim Nasby <decibel(at)decibel(dot)org> wrote:
>> On Dec 11, 2006, at 12:54 PM, Bruno Wolff III wrote:
>>> This appears to be changing under Linux. Recent kernels have write
>>> barriers implemented using cache flush commands (which
>>> some drives ignore, so you need to be careful).

Is it true that some drives ignore this, or is it mostly
an urban legend started by testers whose kernels lacked
write barrier support? I'd be especially interested in
knowing whether any currently available drives ignore
those commands.

>>> In very recent kernels, software raid using raid 1 will also
>>> handle write barriers. To get this feature, you are supposed to
>>> mount ext3 file systems with the barrier=1 option. For other file
>>> systems, the parameter may need to be different.

With XFS the default is apparently to enable write barrier
support unless you explicitly disable it with the nobarrier mount option.
XFS will also warn you in the system log if the underlying device
doesn't support write barriers.

SGI recommends that you use the "nobarrier" mount option if you
have a persistent (battery-backed) write cache on your RAID device.

http://oss.sgi.com/projects/xfs/faq.html#wcache
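
To see which of the mount options discussed above are actually in effect,
something like the following rough sketch could be used. It only looks for
the option strings named in this thread (barrier=N for ext3, nobarrier for
XFS) in text in the /proc/mounts format. The function name
barrier_settings is made up for illustration, and whether a given kernel
lists these options in /proc/mounts at all is an assumption, so a missing
option just means the filesystem default applies.

```python
def barrier_settings(mounts_text):
    """Return {mount_point: explicit barrier option or None} for ext3/xfs.

    mounts_text is expected to look like /proc/mounts:
    device mount_point fstype options dump pass
    """
    result = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mount_point, fstype, options = fields[1], fields[2], fields[3]
        if fstype not in ("ext3", "xfs"):
            continue  # only the filesystems discussed in this thread
        explicit = [
            opt for opt in options.split(",")
            if opt in ("barrier", "nobarrier") or opt.startswith("barrier=")
        ]
        # None means no explicit option was listed, so the default applies.
        result[mount_point] = explicit[0] if explicit else None
    return result
```

On a Linux box it could be called as
barrier_settings(open("/proc/mounts").read()).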

>> But would that actually provide a meaningful benefit? When you
>> COMMIT, the WAL data must hit non-volatile storage of some kind,
>> which without a BBU or something similar, means hitting the platter.
>> So I don't see how enabling the disk cache will help, unless of
>> course it's ignoring fsync.

With write barriers, fsync() waits for the physical disk, but I believe
the background writeback that pdflush does for data passed to write()
doesn't have to. So it's kind of like disabling the drive's cache only for
the WAL files and the filesystem's journal, while leaving it enabled for the
rest of your write activity (the tables, except at checkpoints? the log file?).
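
The write() versus fsync() distinction can be seen from user space with a
small sketch like the one below. It is not how PostgreSQL writes its WAL;
the helper name timed_write and the file name in the usage note are
hypothetical. It just writes a buffer and optionally calls fsync() before
closing, returning the elapsed time. Whether the fsync() call really waits
for the platter depends on the barrier and drive cache behaviour discussed
in this thread.

```python
import os
import time


def timed_write(path, payload, sync):
    """Write payload to path, optionally fsync, and return elapsed seconds."""
    start = time.perf_counter()
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        view = memoryview(payload)
        while len(view) > 0:
            # write() only hands the data to the kernel's page cache;
            # it may also accept fewer bytes than requested.
            written = os.write(fd, view)
            view = view[written:]
        if sync:
            # fsync() asks the kernel to push this file's data to the device.
            os.fsync(fd)
    finally:
        os.close(fd)
    return time.perf_counter() - start
```

Comparing timed_write("wal_like.bin", b"x" * 8192, sync=False) against the
same call with sync=True on the filesystem in question gives a rough idea
of how much the sync is costing.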

> Note the use case for this is more for hobbyists or development boxes. You can
> only use it on software raid (md) 1, which rules out most "real" systems.
>

Ugh. I'm looking for where that's documented, and hoping it works, or soon
will work, on software RAID 1+0 as well.
