Skip site navigation (1) Skip section navigation (2)

Re: 12 disks raid setup

From: Greg Smith <gsmith(at)gregsmith(dot)com>
To: Franck Routier <franck(dot)routier(at)axege(dot)com>
Cc: pgsql-performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: 12 disks raid setup
Date: 2008-02-29 15:41:08
Message-ID: Pine.GSO.4.64.0802291019570.28452@westnet.com (view raw or flat)
Thread:
Lists: pgsql-performance
On Fri, 29 Feb 2008, Franck Routier wrote:

> my Raid controller is an Adaptec 31205 SAS/RAID controller. The battery
> was an option, but I didn't know it at purchase time. So I have no
> battery, but the whole system is on an UPS.

The UPS is of no help here.  The problem is that PostgreSQL forces the 
disk controller to commit WAL writes to disk after every transaction.  If 
you have a controller with a battery-backed cache, you can use that cache 
to buffer those writes and dramatically increase write performance.  The 
USP doesn't give you the same write guarantees.  Let's say someone trips 
over the server power cord (simplest example of a whole class of 
failures).  With the BBC controller, the cached writes will get committed 
when you plug the server back in.  If all you've got is a UPS, writes that 
didn't make it to disk before the outage are lost.  That means you can't 
buffer those writes without risking database corruption.

The general guideline here is that if you don't have a battery-backed 
cache on your controller, based on disk rotation speed you'll be limited 
to around 100 (7200 RPM) to 200 (15K RPM) commits/second per single 
client, with each commit facing around a 2-4ms delay.  That rises to 
perhaps 500/s total with lots of clients.  BBC configurations can easily 
clear 3000/s total and individual commits don't have that couple of ms 
delay.

> So in fact I think I will use md raid, but still don't know with which 
> layout (3x4 or 1x12).

The only real downside of md RAID is that if you lose the boot device it 
can be tricky to get the system to start again; hardware RAID hides that 
little detail from the BIOS.  Make sure you simulate a failure of the 
primary boot drive and are comfortable with recovering from that situation 
before you go into production with md.

The only way to know which layout will work better is to have a lot of 
knowledge of this application and how it bottlenecks under load.  If you 
know, for example, that there's a particular set of tables/indexes that 
are critical to real-time users, whereas others are only used by batch 
operations, things like that can be used to figure out how to optimize 
disk layout.  If you don't know your database to that level, put 
everything into one big array and forget about it; you won't do any better 
than that.

> What would you suggest as a benchmarking method ? Simply issue a few big
> queries that I expect to be usual and see how long it last, or is there
> a more convinient and or "scientific" method ?

Benchmarking is hard and you have to use a whole array of tests if you 
want to quantify the many aspects of performance.  You're doing the right 
thing using bonnie++ to quantify disk speed.  If you've got some typical 
queries, using those to fine-tune postgresql.conf parameters is a good 
idea; just make sure to set shared_buffers, estimated_cache_size, and run 
ANALYZE on your tables.  Be careful to note performance differences when 
the cache is already filled with data from previous runs.  Measuring 
write/commit performance is probably easiest using pgbench.

--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD

In response to

Responses

pgsql-performance by date

Next:From: Jignesh K. ShahDate: 2008-02-29 17:17:29
Subject: Re: 12 disks raid setup
Previous:From: Franck RoutierDate: 2008-02-29 14:51:45
Subject: Re: 12 disks raid setup

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group