Re: RAID controllers for Postgresql on large setups

From: Marinos Yannikos <mjy(at)geizhals(dot)at>
To: Pgsql performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: RAID controllers for Postgresql on large setups
Date: 2008-05-26 22:44:05
Message-ID: 483B3D35.2070701@geizhals.at
Lists: pgsql-performance

PFC wrote:
>     PCI limits you to 133 MB/s (theoretical), actual speed being around 
> 100-110 MB/s.

"Current" PCI 2.1+ and PCI-X implementations have allowed 533MB/s 
(64-bit, 66MHz) to 1066MB/s (64-bit PCI-X at 133MHz) for 6-7 years or so.
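
As a rough check on these figures: theoretical peak bandwidth of a parallel bus is just its width in bytes times its clock rate. The sketch below does that arithmetic for a few common PCI and PCI-X configurations. The function name and the list of configurations are illustrative only, and real throughput is lower because of protocol overhead and because all devices share the bus.

```python
def peak_mb_per_s(width_bits: int, clock_mhz: float) -> float:
    """Theoretical peak of a parallel bus: bytes per transfer * million transfers/s."""
    return width_bits / 8 * clock_mhz


# (label, bus width in bits, nominal clock in MHz); illustrative selection
configs = [
    ("PCI   32-bit /  33 MHz", 32, 33.33),
    ("PCI   32-bit /  66 MHz", 32, 66.66),
    ("PCI   64-bit /  66 MHz", 64, 66.66),
    ("PCI-X 64-bit / 133 MHz", 64, 133.33),
]

for label, width, clock in configs:
    print(f"{label}: ~{peak_mb_per_s(width, clock):.0f} MB/s")
```

The first row is the 133 MB/s limit PFC refers to; the wider and faster variants are what you find in server boards.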

>     For instance, here I have a box with PCI, Gigabit Ethernet and a 
> software RAID5; reading from the RAID5 reaches about 110 MB/s (actual 
> disk bandwidth is closer to 250 MB/s, but it's wasted). However, when 
> using the Gigabit Ethernet to copy a large file over a LAN, disk and 
> Ethernet have to share the PCI bus, so throughput falls to 50 MB/s.  
> Crummy, eh?

Sounds like a slow Gigabit Ethernet NIC...

>     Let me repeat this: at the current state of SATA drives, just TWO 
> of them are enough to saturate a PCI bus. I'm speaking of desktop SATA 
> drives, not high-end SCSI! (which is not necessarily faster for pure 
> throughput anyway).
>     Adding more drives will help random reads/writes but do nothing for 
> throughput since the tiny PCI pipe is choking.

In my experience, SATA drives are very slow for typical database work 
(which is heavy on random writes). They often have very slow access 
times and bad or missing NCQ implementations (controllers / SANs as 
well), and while I am not very familiar with the protocol differences, 
they seem to add a hell of a lot more latency than even old U320 SCSI 
drives.

Sequential transfer performance is a nice indicator, but not very 
useful, since most serious RAID arrays will have bottlenecks other than 
the theoretical cumulative transfer rate of all the drives (from 
controller cache speed to SCSI bus to fibre channel). Thus, drives with 
a lower sequential transfer rate but lower access times scale much 
better.

>> Whether a SAN or just an external enclosure, are 12 disks enough to 
>> sustain 5K inserts/updates per second on rows in the 30 to 90 byte 
>> range? At 5K/second, inserting/updating 100 million records would 
>> take 5.5 hours. That is fairly reasonable if we can achieve it. Faster 
>> would be better, but it depends on what it would cost to achieve.
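
The quoted estimate is easy to check: total rows divided by the sustained rate gives seconds, and dividing by 3600 gives hours. A minimal sketch of that arithmetic follows; the function name is made up for illustration.

```python
def hours_needed(total_rows: int, rows_per_second: float) -> float:
    """Wall-clock hours to process total_rows at a sustained rows_per_second."""
    if rows_per_second <= 0:
        raise ValueError("rows_per_second must be positive")
    return total_rows / rows_per_second / 3600.0


# 100 million rows at 5K rows/s, the figures from the quoted question
print(f"{hours_needed(100_000_000, 5_000):.2f} hours")
```

This assumes the rate holds steady for the whole run, which it rarely does once indexes grow and checkpoints kick in.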

5K/s inserts (with no indexes) are easy with PostgreSQL and typical 
(current) hardware. We are copying about 175K rows/s with our current 
server (Quad core Xeon 2.93GHz, lots of RAM, meagre performance SATA SAN 
with RAID-5 but 2GB writeback cache). Rows are around 570 bytes each on 
average. With a typical number of indexes on the table, performance is 
CPU-bound and much lower than 175K/s, though. For single row updates we 
get about 9K/s per thread (=5.6MB/s) and that's 100% CPU-bound on the 
server. If we had to max this out, we'd thus use several clients in 
parallel and/or collect inserts in text files and make bulk updates 
using COPY. The slow SAN isn't a problem now.
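
To illustrate the "collect inserts in text files" idea, here is a small sketch that turns pending rows into PostgreSQL's default COPY text format (tab-separated columns, \N for NULL, backslash escapes for tab, newline and carriage return). The function names and sample rows are invented for the example; this is not the code we run, just the general shape of it.

```python
import io


def copy_escape(value) -> str:
    """Render one column value in COPY text format."""
    if value is None:
        return r"\N"  # COPY's default NULL marker
    text = str(value)
    # Escape the backslash first so the escapes added below are not doubled.
    return (
        text.replace("\\", "\\\\")
        .replace("\t", "\\t")
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def rows_to_copy_text(rows) -> str:
    """Join rows into one tab-separated, newline-terminated batch."""
    buf = io.StringIO()
    for row in rows:
        buf.write("\t".join(copy_escape(v) for v in row))
        buf.write("\n")
    return buf.getvalue()


if __name__ == "__main__":
    # Hypothetical pending rows; in practice these accumulate over some interval.
    pending = [(1, "widget", 9.99), (2, None, 0.5), (3, "tab\there", 1.0)]
    print(rows_to_copy_text(pending), end="")
```

The resulting text can be fed to a single COPY ... FROM STDIN (or \copy in psql) instead of issuing one INSERT per row. For updates, one common approach is to COPY into a staging table and then apply a single UPDATE ... FROM against the target table.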

Our SATA SAN suffers greatly when reads are interspersed with writes; 
for that kind of workload you want more spindles and faster disks.

To the OP I have one hearty recommendation: if you are using the 
RAID functionality of the 2120, get rid of it. If you can wipe the 
disks, try using Linux software RAID (yes, it's an admin's nightmare 
etc., but it should give much better performance even though the 2120's 
plain SCSI won't be hot either) and then start tuning your PostgreSQL 
installation (there's much to gain here). Your setup otherwise looks 
decent for what you are trying to do (but you need a fast CPU). Your 
cheapest upgrade path would be a decent RAID controller, or at least 
a decent non-RAID SCSI controller for software RAID (at least 2 ports 
for 12 disks), although the plain PCI market is dead.

-mjy
