Re: Configuration Recommendations

From: Scott Carey <scott(at)richrelevance(dot)com>
To: Craig James <cjames(at)emolecules(dot)com>, Jan Nielsen <jan(dot)sture(dot)nielsen(at)gmail(dot)com>
Cc: John Lister <john(dot)lister(at)kickstone(dot)co(dot)uk>, "sthomas(at)peak6(dot)com" <sthomas(at)peak6(dot)com>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Configuration Recommendations
Date: 2012-05-03 22:09:11
Message-ID: CBC84F7F.9B0A8%scott@richrelevance.com
Lists: pgsql-performance

On 5/3/12 8:46 AM, "Craig James" <cjames(at)emolecules(dot)com> wrote:

>On Thu, May 3, 2012 at 6:42 AM, Jan Nielsen <jan(dot)sture(dot)nielsen(at)gmail(dot)com> wrote:
>> Hi John,
>>
>> On Thu, May 3, 2012 at 12:54 AM, John Lister <john(dot)lister(at)kickstone(dot)co(dot)uk> wrote:
>>>
>>> On 03/05/2012 03:10, Jan Nielsen wrote:
>>>
>>>
>>> 300GB RAID10 2x15k drive for OS on local storage
>>> */dev/sda1 RA* 4096
>>> */dev/sda1 FS* ext4
>>> */dev/sda1 MO*
>>>
>>> 600GB RAID 10 8x15k drive for $PGDATA on SAN
>>> *IO Scheduler sda* noop anticipatory deadline [cfq]
>>> */dev/sdb1 RA* 4096
>>> */dev/sdb1 FS* xfs
>>> */dev/sdb1 MO*
>>> allocsize=256m,attr2,logbufs=8,logbsize=256k,noatime
>>>
>>> 300GB RAID 10 2x15k drive for $PGDATA/pg_xlog on SAN
>>> *IO Scheduler sdb* noop anticipatory deadline [cfq]
>>> */dev/sde1 RA* 4096
>>> */dev/sde1 FS* xfs
>>> */dev/sde1 MO*
>>> allocsize=256m,attr2,logbufs=8,logbsize=256k,noatime
>>>
>>>
>>> I was wondering if it would be better to put the xlog on the same disk as
>>> the OS? Apart from the occasional log writes I'd have thought most OS data
>>> is loaded into cache at the beginning, so you effectively have an unused
>>> disk. This gives you another spindle (mirrored) for your data.
>>>
>>> Or have I missed something fundamental?
>>
>>
>> I followed Gregory Smith's arguments from PostgreSQL 9.0 High Performance,
>> wherein he notes that WAL is sequential with constant cache flushes whereas
>> OS is a mix of sequential and random with rare cache flushes. This (might)
>> lead one to conclude that separating these would be good for at least the
>> WAL and likely both. Regardless, separating these very different
>> use-patterns seems like a "Good Thing" if tuning is ever needed for either.
>
>Another consideration is journaling vs. non-journaling file systems.

Not really. ext4 with journaling on is faster than ext2 with it off.
ext2 should never be used if ext4 is available.

If you absolutely refuse to have a journal, turn the journal in ext4 off
and have a faster and safer file system than ext2.
ext2 should never be used if ext4 is available.
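
A minimal sketch of checking whether an ext filesystem currently has a
journal, assuming the usual `tune2fs -l <device>` output with a
"Filesystem features:" line. The function name `has_journal` and the sample
text are made up for illustration and are not output from any system in this
thread. The journal itself would be removed from an unmounted ext4 filesystem
with `tune2fs -O ^has_journal <device>`.

```python
def has_journal(tune2fs_output: str) -> bool:
    """Return True if the 'Filesystem features:' line lists has_journal."""
    for line in tune2fs_output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Filesystem features":
            return "has_journal" in value.split()
    raise ValueError("no 'Filesystem features:' line found")


# Hypothetical, abbreviated samples of `tune2fs -l` output (assumed format).
JOURNALED = """\
Filesystem volume name:   <none>
Filesystem features:      has_journal ext_attr dir_index filetype extent
"""
NO_JOURNAL = """\
Filesystem volume name:   <none>
Filesystem features:      ext_attr dir_index filetype extent
"""

if __name__ == "__main__":
    print(has_journal(JOURNALED))
    print(has_journal(NO_JOURNAL))
```

In practice the text would come from running `tune2fs -l` on the WAL
partition; parsing a string keeps the sketch runnable without root access.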

>If the WAL is on its own file system (not necessarily its own
>spindle), you can use a non-journaling file system like ext2. The WAL
>is actually quite small and is itself a journal, so there's no reason
>to use a journaling file system. On the other hand, you don't want
>the operating system on ext2 because it takes a long time to recover
>from a crash.
>
>I think you're right about the OS: once it starts, there is very
>little disk activity. I'd say put both on the same disk but on
>different partitions. The OS can use ext4 or some other modern
>journaling file system, and the WAL can use ext2. This also means you
>can put the WAL on the outer (fastest) part of the disk and leave the
>slow inner tracks for the OS.
>
>Craig
