Re: SAN performance mystery

From: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>
To: Tim Allen <tim(at)proximity(dot)com(dot)au>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: SAN performance mystery
Date: 2006-06-16 07:48:00
Message-ID: 44926230.6060304@kaltenbrunner.cc
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Tim Allen wrote:
> We have a customer who are having performance problems. They have a
> large (36G+) postgres 8.1.3 database installed on an 8-way opteron with
> 8G RAM, attached to an EMC SAN via fibre-channel (I don't have details
> of the EMC SAN model, or the type of fibre-channel card at the moment).
> They're running RedHat ES3 (which means a 2.4.something Linux kernel).
>
> They are unhappy about their query performance. We've been doing various
> things to try to work out what we can do. One thing that has been
> apparent is that autovacuum has not been able to keep the database
> sufficiently tamed. A pg_dump/pg_restore cycle reduced the total
> database size from 81G to 36G. Performing the restore took about 23 hours.

Hi Tim!

to give you some comparision - we have a similiar sized database here
(~38GB after a fresh restore and ~76GB after some months into
production). the server is a 4 core Opteron @2,4Ghz with 16GB RAM,
connected via 2 QLogic 2Gbit HBA's to the SAN (IBM DS4300 Turbo).

It took us quite a while to get this combination up to speed but a full
dump&restore cycle (via a pg_dump | psql pipe over the net) now takes
only about an hour.
23 hours or even 5 hours sounds really excessive - I'm wondering about
some basic issues with the SAN.
If you are using any kind of multipathing (most likely the one in the
QLA-drivers) I would at first assume that you are playing ping-pong
between the controllers (ie the FC-cards do send IO to more than one
SAN-head causing those to failover constantly completely destroying
performance).
ES3 is rather old too and I don't think that even their hacked up kernel
is very good at driving a large Opteron SMP box (2.6 should be MUCH
better in that regard).

Other than that - how well is your postgresql instance tuned to your
hardware ?

Stefan

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message David Leangen 2006-06-16 08:12:22 Re: Delete operation VERY slow...
Previous Message Gourish Singbal 2006-06-16 07:22:49 Re: Delete operation VERY slow...