Re: Peformance Tuning Opterons/ Hard Disk Layout

From: John Arbash Meinel <john(at)arbash-meinel(dot)com>
To: John Allgood <john(at)turbocorp(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Bruno Almeida do Lago <teolupus(at)gmail(dot)com>, "'Michael Adler'" <adler(at)pobox(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Peformance Tuning Opterons/ Hard Disk Layout
Date: 2005-02-23 19:46:11
Message-ID: 421CDD83.3070609@arbash-meinel.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

John Allgood wrote:

> I think maybe I didn't explain myself well enough. At most we will
> service 200-250 connections across all the 9 databases mentioned. The
> database we are building is for a trucking company. Each of the
> databases represents a different division. With one master database
> that everything is updated to. Most of the access to the database is
> by simple queries. Most of the IO intensive stuff is done when revenue
> reports are generated and when we have our month/year end processing.
> All the trucking loads that are mark as delivered are transferred to
> our master database during night time processing. All that will be
> handled using custom scripts. Maybe I have given a better explanation
> of the application. my biggest concern is how to partition the shared
> storage for maximum performance. Is there a real benifit to having
> more that one raid5 partition or am I wasting my time.
>
> Thanks
>
> John Allgood - ESC

If you read the general advice statements, it's actually better to not
use raid5, but to use raid10 (striping and mirroring). Simply because
raid5 writing is quite poor.

Also, if you have the disks, the next best improvement is to move
pg_xlog onto it's own set of disks. I think that gets as much as 25%
improvement by itself. pg_xlog is an append process, which must complete
before the actual data gets updated, so giving it it's own set of
spindles reduces seek time, and lets the log be written quickly.
I think there is even some benefit to making pg_xlog be a solid state
disk, as it doesn't have to be huge, but having high I/O rates can
remove it as a bottleneck. (I'm not positive how large pg_xlog gets, but
it is probably small compared with the total db size, and I think it can
be flushed periodically as transactions are completed.)

I'm not sure what you are considering "shared storage". Are you thinking
that all the machines will be mounting a remote drive for writing the
DB? They should all have their own local copy (multiple masters on the
same database is not supported).

I think it is possible to get better performance by having more raid
systems. But it is application dependent. If you know that you have 2
tables that are being updated often and independently, then having each
one on it's own raid would allow better concurrency.

But it sounds like in your app you get concurrency by having a bunch of
remote databases, which then do bulk updates on the master database. I
think if you are careful to have each of the remote dbs update the
master at a slightly different time, you could probably get very good
transaction rates.

John
=:->

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Michael Adler 2005-02-23 19:50:59 Re: Peformance Tuning Opterons/ Hard Disk Layout
Previous Message Matthew T. O'Connor 2005-02-23 19:18:08 Re: is pg_autovacuum so effective ?