Re: What exactly is postgres doing during INSERT/UPDATE ?

From: Luke Koops <luke(dot)koops(at)entrust(dot)com>
To: 'Joseph S' <jks(at)selectacast(dot)net>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: What exactly is postgres doing during INSERT/UPDATE ?
Date: 2009-08-29 04:20:30
Message-ID: A3144629B5AC714A8BF27806EBFA70575146234D@sottexch7.corp.ad.entrust.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Joseph S Wrote
> If I have 14 drives in a RAID 10 to split between data tables
> and indexes what would be the best way to allocate the drives
> for performance?

RAID-5 can be much faster than RAID-10 for random reads and writes. It is much slower than RAID-10 for sequential writes, but about the same for sequential reads. For typical access patterns, I would put the data and indexes on RAID-5 unless you expect there to be lots of sequential scans.

If you do this, you can drop the random_page_cost from the default 4.0 to 1.0. That should also encourage postgres to use the index more often. I think the default costs for postgres assume that the data is on a RAID-1 array. Either that, or they are a compromise that isn't quite right for any system. On a plain old disk the random_page_cost should be 8.0 or 10.0.

The division of the drives into two arrays would depend on how much space will be occupied by the tables vs the indexes. This is very specific to your database. For example, if indexes take half as much space as tables, then you want 2/3rds for tables and 1/3rd for indexes. 8 drives for tables, 5 drives for indexes, and 1 for a hot standby. The smaller array may be a bit slower for some operations due to reduced parallelism. This also depends on the intelligence of your RAID controller.

Always put the transaction logs (WAL Files) on RAID-10 (or RAID-1 if you don't want to dedicate so many drives to the logs). The only significant performance difference between RAID-10 and RAID-1 is that RAID-1 is much slower (factor of 4 or 5) for random reads. I think the ratio of random reads from the transaction logs would typically be quite low. They are written sequentially and during checkpoint they are read sequentially. In the interim, the data is probably still in shared memory if it needs to be read.

You don't want your transaction logs or any swapfiles on RAID-5. The slow sequential write performance can be a killer.

-Luke

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Greg Stark 2009-08-29 08:46:15 Re: What exactly is postgres doing during INSERT/UPDATE ?
Previous Message Jeff Janes 2009-08-29 00:19:17 Re: What exactly is postgres doing during INSERT/UPDATE ?