
Re: new server I/O setup

From: "Fernando Hevia" <fhevia(at)ip-tel(dot)com(dot)ar>
To: "'Greg Smith'" <greg(at)2ndquadrant(dot)com>
Cc: <pgsql-performance(at)postgresql(dot)org>
Subject: Re: new server I/O setup
Date: 2010-01-15 16:51:09
Lists: pgsql-performance

> -----Original Message-----
> From: Greg Smith
>> Fernando Hevia wrote: 
>> 	I justified my first choice in that WAL writes are 
>> sequential and OS writes pretty much are too, so a RAID 1 probably 
>> would hold its ground against a 12 disc RAID 10 with random writes.
> The problem with this theory is that when PostgreSQL does WAL 
> writes and asks to sync the data, you'll probably discover 
> all of the open OS writes that were sitting in the Linux 
> write cache getting flushed before that happens.  And that 
> could lead to horrible performance--good luck if the database 
> tries to do something after cron kicks off updatedb each 
> night for example.

I actually hadn't considered such a scenario. It probably won't hit us
because our real-time activity drops off abruptly overnight, when the
maintenance routines kick in.
But in case this proves to be an issue, disabling synchronous_commit should
help out, and thanks to the BBU cache the risk of lost transactions should
be very low. In any case I would leave it on till the issue arises. Do you

In our business, the worst case could translate to losing a couple of
seconds' worth of call records, all recoverable from secondary storage.
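
To put a rough number on that worst case: the PostgreSQL documentation on
asynchronous commit gives the risk window with synchronous_commit off as at
most three times wal_writer_delay. A minimal sketch of the arithmetic,
assuming the default wal_writer_delay of 200 ms and a purely hypothetical
insert rate (the rate below is an illustration, not a figure from our system):

```python
# Sketch: upper bound on records at risk with synchronous_commit = off.
# Per the PostgreSQL docs, the risk window is at most 3 x wal_writer_delay.

WAL_WRITER_DELAY_MS = 200  # PostgreSQL default; check your own postgresql.conf


def max_records_at_risk(records_per_second, wal_writer_delay_ms=WAL_WRITER_DELAY_MS):
    """Worst-case number of committed records lost if the server crashes."""
    # Multiply before dividing so whole-number inputs give an exact result.
    return records_per_second * 3 * wal_writer_delay_ms / 1000


# HYPOTHETICAL load of 500 call records per second (assumed for illustration).
print(max_records_at_risk(500))
```

With the default delay the window is 0.6 seconds, so "a couple of seconds"
should be a conservative estimate unless wal_writer_delay has been raised.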

> I think there are two viable configurations you should be 
> considering that you haven't thought about, but neither is quite 
> what you're looking at:
> 2 discs in RAID 1 for OS
> 2 discs in RAID 1 for pg_xlog
> 10 discs in RAID 10 for postgres, ext3
> 2 spares.
>
> 14 discs in RAID 10 for everything
> 2 spares.
>
> Impossible to say which of the four possibilities here will 
> work out better.  I tend to lean toward the first one I 
> listed above because it makes it very easy to monitor the 
> pg_xlog activity (and the non-database activity) separately 
> from everything else, and having no other writes going on 
> makes it very unlikely that the pg_xlog will ever become a 
> bottleneck.  But if you've got 14 disks in there, it's 
> unlikely to be a bottleneck anyway.  The second config above 
> will get you slightly better random I/O though, so for 
> workloads that are really limited on that there's a good 
> reason to prefer it.

Besides the random writes, we have quite intensive random reads too. I need
to maximize throughput on the RAID 10 array, and the thought of taking 2
more disks from it makes me rather uneasy.
I did consider the 14-disk RAID 10 for everything, since it's very attractive
for read I/O. But with 12 spindles read I/O should already be very fast for
us, considering our current production server has a meager 4-disk RAID 10.
I still think the 2-disk RAID 1 + 12-disk RAID 10 will be the best
combination for write throughput, provided the RAID 1 can keep pace with the
RAID 10, which Scott has already confirmed matches his experience.
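
For comparison, here is the spindle arithmetic behind that reasoning as a
small sketch. It assumes the usual rule of thumb (not a measurement) that
RAID 10 random-read capacity scales with the number of disks and random-write
capacity with the number of mirrored pairs. The function name is just for
illustration.

```python
# Rough spindle arithmetic for the layouts discussed in this thread.
# Assumption (rule of thumb, not a benchmark): RAID 10 random reads scale
# with the disk count, random writes with the number of mirrored pairs.

CURRENT_DISKS = 4  # current production server: 4-disk RAID 10

# Number of disks in the RAID 10 array holding the database, per layout.
layouts = {
    "2d RAID 1 (OS + pg_xlog) + 12d RAID 10": 12,
    "2d RAID 1 (OS) + 2d RAID 1 (pg_xlog) + 10d RAID 10": 10,
    "14d RAID 10 for everything": 14,
}


def raid10_scaling(disks, baseline=CURRENT_DISKS):
    """Return (read spindles, mirrored pairs, size ratio vs. baseline array)."""
    if disks % 2:
        raise ValueError("RAID 10 needs an even number of disks")
    return disks, disks // 2, disks / baseline


for name, disks in layouts.items():
    reads, pairs, ratio = raid10_scaling(disks)
    print(f"{name}: {reads} read spindles, {pairs} write pairs, "
          f"{ratio:.1f}x the current array")
```

Note that in the 14-disk layout the array also carries the OS and pg_xlog
traffic, so its larger spindle count is not all available to the database.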

> Also:  the whole "use ext2 for the pg_xlog" idea is overrated 
> as far as I'm concerned.  I start with ext3, and only if I get 
> evidence that the drive is a bottleneck do I ever think of 
> reverting to unjournaled writes just to get a little speed 
> boost.  In practice I suspect you'll see no benchmark 
> difference, and will instead curse the decision the first 
> time your server is restarted badly and it gets stuck at fsck.

This advice could be interpreted as "start safe and take risks only if
I think you are right and will follow it.

>> 	PS: any clue if hdparm works to deactivate the disks' 
>> write cache even if they are behind the 3ware controller?
> You don't use hdparm for that sort of thing; you need to use 
> 3ware's tw_cli utility.  I believe that the individual drive 
> caches are always disabled, but whether the controller cache 
> is turned on or not depends on whether the card has a 
> battery.  The behavior here is kind of weird though--it 
> changes if you're in RAID mode vs. JBOD mode, so be careful 
> to look at what all the settings are.  Some of these 3ware 
> cards also default to extremely aggressive background scanning 
> for bad blocks; you might have to tweak that downward.

It has a battery and it is working in RAID mode. 
It's also my first experience with a hardware controller. I'm installing
tw_cli at this moment.

Greg, I hold your knowledge in this area in very high regard. 
Your comments are much appreciated.

