Re: What exactly is postgres doing during INSERT/UPDATE ?

From: David Rees <drees76(at)gmail(dot)com>
To: Greg Stark <gsstark(at)mit(dot)edu>
Cc: Luke Koops <luke(dot)koops(at)entrust(dot)com>, Joseph S <jks(at)selectacast(dot)net>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: What exactly is postgres doing during INSERT/UPDATE ?
Date: 2009-08-29 19:52:07
Message-ID: 72dbd3150908291252k373f7aabo9f60009580b8b868@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Sat, Aug 29, 2009 at 1:46 AM, Greg Stark<gsstark(at)mit(dot)edu> wrote:
> On Sat, Aug 29, 2009 at 5:20 AM, Luke Koops<luke(dot)koops(at)entrust(dot)com> wrote:
>> RAID-5 can be much faster than RAID-10 for random reads and writes.  It is much slower than
>> RAID-10 for sequential writes, but about the same for sequential reads.  For typical access
>> patterns, I would put the data and indexes on RAID-5 unless you expect there to be lots of
>> sequential scans.
>
> That's pretty much exactly backwards. RAID-5 will at best slightly
> slower than RAID-0 or RAID-10 for sequential reads or random reads.
> For sequential writes it performs *terribly*, especially for random
> writes. The only write pattern where it performs ok sometimes is
> sequential writes of large chunks.

Also note that how terribly RAID5 performs on those small random
writes depends on a LOT on the implementation. A good controller with
a large BBU cache will be able to mitigate the performance penalty of
having to read stripes before small writes to calculate parity (of
course, if the writes are really random enough, it's still not going
to help much).

>> Always put the transaction logs (WAL Files) on RAID-10 (or RAID-1 if you don't want to dedicate
>> so many drives to the logs).  The only significant performance difference between RAID-10 and
>> RAID-1 is that RAID-1 is much slower (factor of 4 or 5) for random reads.
>
> no, RAID-10 and RAID-1 should perform the same for reads. RAID-10 will
> be slower at writes by about a factor equal to the number of mirror
> sides.

Let's keep in mind that a 2-disk RAID-10 is really the same as a
2-disk RAID-1, it just doesn't have any mirrors to stripe over. So
since you really need 4-disks for a "true" RAID-10, the performance of
a RAID-10 array compared to a RAID1 array is pretty much proportional
to the number of disks in the array (more disks = more performance).

The "far" RAID-10 layout that is available when using Linux software
raid is interesting. It will lay the data out on the disks so that
you can get the streaming read performance of a RAID-0 array, but
streaming write performance will suffer a bit since now the disk will
have to seek to perform those writes. You can also use this layout
with just 2 disks instead of RAID1. Some claim that the performance
hit isn't noticeable due to write caching/IO ordering, but I have not
tested it's performance using PostgreSQL. Might be a nice thing for
someone to try.

http://en.wikipedia.org/wiki/Non-standard_RAID_levels#Linux_MD_RAID_10

-Dave

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Jean-Michel Pouré 2009-08-29 20:59:46 Re: PostgreSQL does CAST implicitely between int and a domain derived from int
Previous Message Tom Lane 2009-08-29 17:44:08 Re: PostgreSQL does CAST implicitely between int and a domain derived from int