Re: Configuration Recommendations

From: Craig James <cjames(at)emolecules(dot)com>
To: Jan Nielsen <jan(dot)sture(dot)nielsen(at)gmail(dot)com>
Cc: John Lister <john(dot)lister(at)kickstone(dot)co(dot)uk>, sthomas(at)peak6(dot)com, pgsql-performance(at)postgresql(dot)org
Subject: Re: Configuration Recommendations
Date: 2012-05-03 15:46:25
Message-ID: CAFwQ8rf=4mPwDv04bRJzq=VTYKAg7xbFiY3+0_BpkFyya=Db6Q@mail.gmail.com
Lists: pgsql-performance

On Thu, May 3, 2012 at 6:42 AM, Jan Nielsen <jan(dot)sture(dot)nielsen(at)gmail(dot)com> wrote:
> Hi John,
>
> On Thu, May 3, 2012 at 12:54 AM, John Lister <john(dot)lister(at)kickstone(dot)co(dot)uk>
> wrote:
>>
>> On 03/05/2012 03:10, Jan Nielsen wrote:
>>
>>
>> 300GB RAID10 2x15k drive for OS on local storage
>> */dev/sda1 RA*                                            4096
>> */dev/sda1 FS*                                            ext4
>> */dev/sda1 MO*
>>
>> 600GB RAID 10 8x15k drive for $PGDATA on SAN
>> *IO Scheduler sda*            noop anticipatory deadline [cfq]
>> */dev/sdb1 RA*                                            4096
>> */dev/sdb1 FS*                                             xfs
>> */dev/sdb1 MO*        allocsize=256m,attr2,logbufs=8,logbsize=256k,noatime
>>
>> 300GB RAID 10 2x15k drive for $PGDATA/pg_xlog on SAN
>> *IO Scheduler sdb*            noop anticipatory deadline [cfq]
>> */dev/sde1 RA*                                            4096
>> */dev/sde1 FS*                                             xfs
>> */dev/sde1 MO*        allocsize=256m,attr2,logbufs=8,logbsize=256k,noatime
>>
>>
>> I was wondering if it would be better to put the xlog on the same disk as
>> the OS? Apart from the occasional log writes I'd have thought most OS data
>> is loaded into cache at the beginning, so you effectively have an unused
>> disk. This gives you another spindle (mirrored) for your data.
>>
>> Or have I missed something fundamental?
>
>
> I followed Gregory Smith's arguments from PostgreSQL 9.0 High Performance,
> wherein he notes that WAL is sequential with constant cache flushes whereas
> OS is a mix of sequential and random with rare cache flushes. This (might)
> lead one to conclude that separating these would be good for at least the
> WAL and likely both. Regardless, separating these very different
> use-patterns seems like a "Good Thing" if tuning is ever needed for either.

Another consideration is journaling vs. non-journaling file systems.
If the WAL is on its own file system (not necessarily its own
spindle), you can use a non-journaling file system like ext2. The WAL
is actually quite small and is itself a journal, so there's no reason
to use a journaling file system. On the other hand, you don't want
the operating system on ext2, because without a journal it needs a
full fsck after a crash, and that takes a long time.

I think you're right about the OS: once it starts, there is very
little disk activity. I'd say put both on the same disk but on
different partitions. The OS can use ext4 or some other modern
journaling file system, and the WAL can use ext2. This also means you
can put the WAL on the outer (fastest) part of the disk and leave the
slow inner tracks for the OS.
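
To check that a layout like this is actually in effect, something along
these lines could be used. It is only a rough sketch: the mount points,
device names and the SAMPLE text are made up for illustration, and the
list of journaling file systems is a short assumed one, not exhaustive.
It finds which mounted file system holds a given path (longest matching
mount point wins) and reports whether that file system journals.

```python
import posixpath

# Assumed, non-exhaustive set of journaling file systems.
JOURNALING = {"ext3", "ext4", "xfs", "jfs", "reiserfs"}

# Hypothetical /proc/mounts-style text: OS on ext4, WAL on ext2 (same disk,
# different partitions), data on xfs. Devices and paths are illustrative.
SAMPLE = """\
/dev/sda1 / ext4 rw,relatime 0 0
/dev/sda2 /var/lib/pgsql/data/pg_xlog ext2 rw,noatime 0 0
/dev/sdb1 /var/lib/pgsql/data xfs rw,noatime 0 0
"""


def parse_mounts(text):
    """Return (mount_point, fstype) pairs from /proc/mounts-style text."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            entries.append((fields[1], fields[2]))
    return entries


def fs_for_path(path, entries):
    """Return (mount_point, fstype) of the longest mount point containing path."""
    path = posixpath.normpath(path)
    best = None
    for mount_point, fstype in entries:
        mp = posixpath.normpath(mount_point)
        # Match the mount point itself or anything below it.
        if path == mp or path.startswith(mp.rstrip("/") + "/"):
            if best is None or len(mp) > len(best[0]):
                best = (mp, fstype)
    return best


def is_journaling(fstype):
    """True if fstype is in the assumed set of journaling file systems."""
    return fstype in JOURNALING


if __name__ == "__main__":
    entries = parse_mounts(SAMPLE)
    for label, path in [("OS", "/"),
                        ("WAL", "/var/lib/pgsql/data/pg_xlog"),
                        ("data", "/var/lib/pgsql/data/base")]:
        mount_point, fstype = fs_for_path(path, entries)
        print(label, mount_point, fstype,
              "journaling" if is_journaling(fstype) else "non-journaling")
```

On a real machine you would pass in the contents of /proc/mounts
instead of SAMPLE, and your actual pg_xlog path.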

Craig
