Re: Logarithmic change (decrease) in performance

From: Ron Peacetree <rjpeace(at)earthlink(dot)net>
To: Matthew Nuzum <mattnuzum(at)gmail(dot)com>, newz(at)bearfruit(dot)org, Postgresql Performance list <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Logarithmic change (decrease) in performance
Date: 2005-09-28 22:03:03
Message-ID: 12381712.1127944983296.JavaMail.root@elwamui-polski.atl.sa.earthlink.net
Lists: pgsql-performance

>From: Matthew Nuzum <mattnuzum(at)gmail(dot)com>
>Sent: Sep 28, 2005 4:02 PM
>Subject: [PERFORM] Logarithmic change (decrease) in performance
>
Small nit-pick: A "logarithmic decrease" in performance would be
a relatively good thing, being better than either a linear or
exponential decrease in performance. What you are describing is
the worst kind: an _exponential_ decrease in performance.

>Something interesting is going on. I wish I could show you the graphs,
>but I'm sure this will not be a surprise to the seasoned veterans.
>
>A particular application server I have has been running for over a
>year now. I've been logging CPU load since mid-April.
>
>It took 8 months or more to fall from excellent performance to
>"acceptable." Then, over the course of about 5 weeks it fell from
>"acceptable" to "so-so." Then, in the last four weeks it's gone from
>"so-so" to alarming.
>
>I've been working on this performance drop since Friday but it wasn't
>until I replied to Arnau's post earlier today that I remembered I'd
>been logging the server load. I grabbed the data and charted it in
>Excel and to my surprise, the graph of the server's load average looks
>kind of like the graph of y=x^2.
>
>I've got to make a recommendation for a solution to the PHB and my
>analysis is showing that as the dataset becomes larger, the amount of
>time the disk spends seeking is increasing. This causes processes to
>take longer to finish, which causes more processes to pile up, which
>causes processes to take longer to finish, which causes more processes
>to pile up etc. It is this growing dataset that seems to be the source
>of the sharp decrease in performance.
>
>I knew this day would come, but I'm actually quite surprised that when
>it came, there was little time between the warning and the grande
>finale. I guess this message is being sent to the list to serve as a
>warning to other data warehouse admins that when you reach your
>capacity, the downward spiral happens rather quickly.
>
Yep, definitely been where you are. Bottom line: you have to reduce
the sequential seeking behavior of the system to within an acceptable
window and then keep it there.

1= keep more of the data set in RAM
2= increase the size of your HD IO buffers
3= make your RAID sets wider (more parallel vs sequential IO)
4= reduce the atomic latency of your RAID sets
   (time for Fibre Channel 15Krpm HD's vs 7.2Krpm SATA ones?)
5= make sure your data is as unfragmented as possible
6= change your DB schema to minimize the problem
   a= overall good schema design
   b= partitioning the data so that the system only has to manipulate a
      reasonable chunk of it at a time.
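
To make point 6b concrete, here is a minimal sketch of the idea in Python rather than in the database itself. The rows, the month-based partition key, and the function names are all hypothetical, chosen only for illustration; a real system would do this with its own table partitioning.

```python
from collections import defaultdict
from datetime import date

# Hypothetical rows of (event_date, cpu_load); not data from this thread.
rows = [
    (date(2005, 7, 14), 0.8),
    (date(2005, 8, 2), 1.4),
    (date(2005, 9, 27), 3.9),
    (date(2005, 9, 28), 4.2),
]


def partition_by_month(rows):
    """Group rows into one bucket per (year, month)."""
    parts = defaultdict(list)
    for event_date, load in rows:
        parts[(event_date.year, event_date.month)].append((event_date, load))
    return parts


def loads_for_month(parts, year, month):
    """Scan only the matching partition; other months are never touched."""
    return [load for _, load in parts.get((year, month), [])]


parts = partition_by_month(rows)
september = loads_for_month(parts, 2005, 9)
```

The point is that the work done by a query for one month depends on the size of that month's bucket, not on the size of the whole data set.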

In many cases, there are a number of ways to accomplish the above.
Unfortunately, most of them require CapEx.

Also, ITRW (in the real world) such systems tend to have this as a
chronic problem. This is not a "fix it once and it goes away forever"
situation. It is part of the regular maintenance and upgrade plan(s).
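
The pile-up described in the quoted message (slower requests cause more queued processes, which cause slower requests) is roughly what a textbook queueing model predicts. Below is a minimal sketch, assuming the disk behaves like an M/M/1 queue, where mean response time is 1 / (service rate - arrival rate). The model choice and the rates are assumptions for illustration, not measurements from this system.

```python
# Assumption: an M/M/1 queue stands in for the disk subsystem.
# SERVICE_RATE is a made-up figure, not a measurement from this thread.
SERVICE_RATE = 100.0  # requests per second the disk can complete


def mean_response_time(service_rate, arrival_rate):
    """Mean time in system (waiting plus service) for an M/M/1 queue."""
    if arrival_rate >= service_rate:
        return float("inf")  # past capacity the queue grows without bound
    return 1.0 / (service_rate - arrival_rate)


def response_curve(utilizations):
    """Pair each utilization (0..1) with its mean response time in seconds."""
    return [
        (u, mean_response_time(SERVICE_RATE, u * SERVICE_RATE))
        for u in utilizations
    ]


if __name__ == "__main__":
    for u, t in response_curve([0.5, 0.8, 0.9, 0.95, 0.99]):
        print(f"utilization {u:.0%}: mean response {t * 1000:.0f} ms")
```

Under this model, going from 50% to 90% utilization multiplies the mean response time by five, and going from 90% to 99% multiplies it by another ten. If growing seek times also lower the service rate as the data set grows, the two effects compound, which fits the short gap between "so-so" and "alarming".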

Good Luck,
Ron
