Possible explanations for catastrophic performace deterioration?

From: Carlos Moreno <moreno_pg(at)mochima(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Possible explanations for catastrophic performace deterioration?
Date: 2007-09-23 16:45:15
Message-ID: 46F6981B.1010600@mochima.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance


I recently had a puzzling experience (performace related).

Had a DB running presumably smoothly, on a server with Dual-Core
Opteron and 4GB of RAM (and SATA2 drives with Hardware RAID-1).
(PostgreSQL 8.2.4 installed from source, on a FC4 system --- databases
with no encoding --- initdb -E SQL_ASCII --no-locale, and all the
databases created with encoding SQL_ASCII)

We thought that performance had gone little by little down, but the
evidence now suggests that something must have triggered a big step
down in the performance of the server.

Thinking that it was simply a bottleneck with the hardware, we moved
to a different machine (lower performance CPU-wise, but with dual hard
disk, so I configured the pg_xlog directory on a partition on a separate
hard disk, estimating that this would take precedence over the lower CPU
power and the 2GB of RAM instead of 4).

Not only the performance was faster --- a query like:

select count(*) from customer

was *instantaneous* on the new machine (just after populating it,
without having even analyzed it!), and would take over a minute on
the old machine (the first time). Then, the second time, it would
take a little over two seconds on the old machine (at this point, both
machines had *zero* activity --- they were both essentially disconnected
from the outside world; serving exclusively my psql connection).

Funny thing, I dropped the database (on the old machine) and re-created
it with the backup I had just created, and now the performance on the
old one was again normal (the above query now gives me a result in
essentially no time --- same as on the new machine).

In retrospect, I'm now wondering if a vacuum full would have solved
the issue? (we do run vacuumdb -z --- vacuum analyze --- daily)

Any comments?? I'm worried that three months down the road we'll
face the same issue with this new server (that's about the time it took
since we had started running the other server until we noticed the
poor performance level) --- and we can not afford to completely stop
the system to drop-and-recreate the db on a regular basis.

Thanks,

Carlos
--

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Jonah H. Harris 2007-09-23 16:56:35 Re: Possible explanations for catastrophic performace deterioration?
Previous Message Denes Daniel 2007-09-22 12:33:21 Re: Query planner unaware of possibly best plan