Re: new server I/O setup

From: "Fernando Hevia" <fhevia(at)ip-tel(dot)com(dot)ar>
To: "'Greg Smith'" <greg(at)2ndquadrant(dot)com>
Cc: <pgsql-performance(at)postgresql(dot)org>
Subject: Re: new server I/O setup
Date: 2010-01-15 16:51:09
Message-ID: BAF02C60ADD0459AA71D73482310FA6D@iptel.com.ar
Lists: pgsql-performance

> -----Original Message-----
> From: Greg Smith
>
>> Fernando Hevia wrote:
>>
>> I justified my first choice in that WAL writes are
>> sequential and OS writes pretty much are too, so a RAID 1 would
>> probably hold its ground against a 12-disc RAID 10 with random writes.
>>
>
> The problem with this theory is that when PostgreSQL does WAL
> writes and asks to sync the data, you'll probably discover
> all of the open OS writes that were sitting in the Linux
> write cache getting flushed before that happens. And that
> could lead to horrible performance--good luck if the database
> tries to do something after cron kicks off updatedb each
> night for example.
>

I actually hadn't considered such a scenario. It probably won't hit us
because our real-time activity drops off abruptly overnight, when the
maintenance routines kick in.
But in case this proves to be an issue, disabling synchronous_commit should
help, and thanks to the BBU cache the risk of lost transactions should
be very low. In any case I would leave it on until the issue arises. Do you
agree?

In our business, the worst case would translate to losing a couple of
seconds' worth of call records, all recoverable from secondary storage.
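
A rough sketch of that worst case, assuming the documented bound that with
synchronous_commit off the loss window is at most three times
wal_writer_delay (200 ms by default). The function name and the rate of 50
call records per second are hypothetical, used only for illustration:

```python
def worst_case_lost_records(records_per_second, wal_writer_delay_ms=200):
    """Upper bound on records lost in a crash with synchronous_commit = off.

    Assumes the loss window is at most 3 * wal_writer_delay, as stated in
    the PostgreSQL documentation. The record rate is a made-up input.
    """
    if records_per_second < 0 or wal_writer_delay_ms <= 0:
        raise ValueError("rate must be >= 0 and delay must be > 0")
    # Window in seconds is 3 * delay_ms / 1000; multiply by the rate first
    # so that integer inputs stay exact until the final division.
    return records_per_second * 3 * wal_writer_delay_ms / 1000


if __name__ == "__main__":
    # Hypothetical load of 50 call records per second, default delay.
    print(worst_case_lost_records(50))
```

With the default delay the window is well under a second, so "a couple of
seconds" is a conservative estimate unless wal_writer_delay is raised.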

> I think there are two viable configurations you should be
> considering you haven't thought about, but neither is quite
> what you're looking at:
>
> 2 discs in RAID 1 for OS
> 2 discs in RAID 1 for pg_xlog
> 10 discs in RAID 10 for postgres, ext3
> 2 spares.
>
> 14 discs in RAID 10 for everything
> 2 spares.
>
> Impossible to say which of the four possibilities here will
> work out better. I tend to lean toward the first one I
> listed above because it makes it very easy to monitor the
> pg_xlog activity (and the non-database activity) separately
> from everything else, and having no other writes going on
> makes it very unlikely that the pg_xlog will ever become a
> bottleneck. But if you've got 14 disks in there, it's
> unlikely to be a bottleneck anyway. The second config above
> will get you slightly better random I/O though, so for
> workloads that are really limited on that there's a good
> reason to prefer it.
>

Besides the random writes, we have quite intensive random reads too. I need
to maximize throughput on the RAID 10 array, and the thought of taking 2
more disks from it makes me rather uneasy.
I did consider the 14-disk RAID 10 for everything, since it's very
attractive for read I/O. But with 12 spindles read I/O should already be
very fast for us, considering our current production server has a meager
4-disk RAID 10.
I still think the 2-disk RAID 1 + 12-disk RAID 10 will be the best
combination for write throughput, provided the RAID 1 can keep pace with the
RAID 10, something Scott already confirmed to be his experience.
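
To put numbers on the trade-off, here is a minimal sketch under an idealized
model: an n-disk RAID 10 can serve reads from all n spindles but spreads
writes over only n/2 mirror pairs. The function and layout names are just
for illustration, and the model ignores the controller cache and the fact
that in the 14-disk layout the same array also carries the OS and pg_xlog
traffic:

```python
def raid10_profile(disks):
    """Idealized spindle counts for an n-disk RAID 10 array."""
    if disks < 4 or disks % 2 != 0:
        raise ValueError("RAID 10 needs an even number of disks, at least 4")
    return {
        "read_spindles": disks,     # either side of a mirror can serve a read
        "write_pairs": disks // 2,  # each write lands on both disks of a pair
    }


# Size of the RAID 10 array in each layout under discussion (2 spares each).
layouts = {
    "RAID 1 (OS + pg_xlog) + 12-disk RAID 10": 12,
    "RAID 1 (OS) + RAID 1 (pg_xlog) + 10-disk RAID 10": 10,
    "14-disk RAID 10 for everything": 14,
}

for name, disks in layouts.items():
    profile = raid10_profile(disks)
    print(f"{name}: {profile['read_spindles']} read spindles, "
          f"{profile['write_pairs']} write pairs")
```

By this count, giving pg_xlog its own RAID 1 costs the data array one mirror
pair (6 down to 5), which is the loss I would rather avoid.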

> Also: the whole "use ext2 for the pg_xlog" idea is overrated
> as far as I'm concerned. I start with ext3, and only if I get
> evidence that the drive is a bottleneck do I ever think of
> reverting to unjournaled writes just to get a little speed
> boost. In practice I suspect you'll see no benchmark
> difference, and will instead curse the decision the first
> time your server is restarted badly and it gets stuck at fsck.
>

This advice could be summed up as "start safe and take risks only if
needed".
I think you are right and will follow it.

>> PS: any clue if hdparm works to deactivate the disks'
>> write cache even if they are behind the 3ware controller?
>>
>
> You don't use hdparm for that sort of thing; you need to use
> 3ware's tw_cli utility. I believe that the individual drive
> caches are always disabled, but whether the controller cache
> is turned on or not depends on whether the card has a
> battery. The behavior here is kind of weird though--it
> changes if you're in RAID mode vs. JBOD mode, so be careful
> to look at what all the settings are. Some of these 3ware
> cards default to extremely aggressive background scanning for
> bad blocks too, you might have to tweak that downward too.
>

It has a battery and it is working in RAID mode.
It's also my first experience with a hardware controller. I'm installing
tw_cli right now.

Greg, I hold your knowledge in this area in very high regard.
Your comments are much appreciated.

Thanks,
Fernando
