Re: What exactly is postgres doing during INSERT/UPDATE ?

From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>
Cc: Greg Stark <gsstark(at)mit(dot)edu>, Luke Koops <luke(dot)koops(at)entrust(dot)com>, Joseph S <jks(at)selectacast(dot)net>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: What exactly is postgres doing during INSERT/UPDATE ?
Date: 2009-08-30 15:40:01
Message-ID: b42b73150908300840o354b345fje4adf973f94a1ec2@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Sat, Aug 29, 2009 at 9:59 AM, Scott Marlowe<scott(dot)marlowe(at)gmail(dot)com> wrote:
> On Sat, Aug 29, 2009 at 2:46 AM, Greg Stark<gsstark(at)mit(dot)edu> wrote:
>> On Sat, Aug 29, 2009 at 5:20 AM, Luke Koops<luke(dot)koops(at)entrust(dot)com> wrote:
>>> Joseph S Wrote
>>>> If I have 14 drives in a RAID 10 to split between data tables
>>>> and indexes what would be the best way to allocate the drives
>>>> for performance?
>>>
>>> RAID-5 can be much faster than RAID-10 for random reads and writes.  It is much slower than RAID-10 for sequential writes, but about the same for sequential reads.  For typical access patterns, I would put the data and indexes on RAID-5 unless you expect there to be lots of sequential scans.
>>
>> That's pretty much exactly backwards. RAID-5 will at best slightly
>> slower than RAID-0 or RAID-10 for sequential reads or random reads.
>> For sequential writes it performs *terribly*, especially for random
>> writes. The only write pattern where it performs ok sometimes is
>> sequential writes of large chunks.
>
> Note that while RAID-10 is theoretically always better than RAID-5,
> I've run into quite a few cheapie controllers that were heavily
> optimised for RAID-5 and de-optimised for RAID-10.  However, if it's
> got battery backed cache and can run in JBOD mode, linux software
> RAID-10 or hybrid RAID-1 in hardware RAID-0 in software will almost
> always beat hardware RAID-5 on the same controller.

raid 5 can outperform raid 10 on sequential writes in theory. if you
are writing 100mb of actual data on, say, a 8 drive array, the raid 10
system has to write 200mb data and the raid 5 system has to write 100
* (8/7) or about 114mb. Of course, the raid 5 system has to do
parity, etc.

For random writes, raid 5 has to write a minimum of two drives, the
data being written and parity. Raid 10 also has to write two drives
minimum. A lot of people think parity is a big deal in terms of raid
5 performance penalty, but I don't -- relative to the what's going on
in the drive, xor calculation costs (one of the fastest operations in
computing) are basically zero, and off-lined if you have a hardware
raid controller.

I bet part of the problem with raid 5 is actually contention. since
your write to a stripe can conflict with other writes to a different
stripe. The other problem with raid 5 that I see is that you don't
get very much extra protection -- it's pretty scary doing a rebuild
even with a hot spare (and then you should probably be doing raid 6).
On read performance RAID 10 wins all day long because more drives can
be involved.

merlin

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Greg Stark 2009-08-30 15:52:07 Re: What exactly is postgres doing during INSERT/UPDATE ?
Previous Message Jeff Janes 2009-08-29 22:01:03 Re: What exactly is postgres doing during INSERT/UPDATE ?