Re: Strange performance degradation

From: Guillaume Cottenceau <gc(at)mnc(dot)ch>
To: Lorenzo Allegrucci <lorenzo(dot)allegrucci(at)forinicom(dot)it>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Strange performance degradation
Date: 2009-11-20 11:17:07
Message-ID: 878we1igy4.fsf@meuh.mnc.lan
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-performance

Lorenzo Allegrucci <lorenzo.allegrucci 'at' forinicom.it> writes:

> A. Kretschmer wrote:
>> In response to Lorenzo Allegrucci :
>>> Hi all,
>>>
>>> I'm experiencing a strange behavior with my postgresql 8.3:
>>> performance is degrading after 3/4 days of running time but if I
>>> just restart it performance returns back to it's normal value..
>>> In normal conditions the postgres process uses about 3% of cpu time
>>> but when is in "degraded" conditions it can use up to 25% of cpu time.
>>> The load of my server is composed of many INSERTs on a table, and
>>> many UPDATEs and SELECT on another table, no DELETEs.
>>> I tried to run vacuum by the pg_maintenance script (Debian Lenny)
>>> but it doesn't help. (I have autovacuum off).
>>
>> Bad idea. Really.
>
> Why running vacuum by hand is a bad idea?

It's rather turning autovacuum off which is a bad idea.

> vacuum doesn't solve anyway, it seems only a plain restart stops the
> performance degradation.

Notice: normally, restarting doesn't help for vacuum-related
problems.

Your degradation might come from a big request being intensive on
PG's and OS's caches, resulting in data useful to other requests
getting farther (but it should get back to normal if the big
request is not performed again). And btw, 25% is far from 100% so
response time should be the same if there are no other factors;
you should rather have a look at IOs (top, vmstat, iostat) during
problematic time. How do you measure your degradation, btw?

>>> So, my main question is.. how can just a plain simple restart of postgres
>>> restore the original performance (3% cpu time)?
>>
>> You should enable autovacuum.
>>
>> And you should run vacuum verbose manually and see the output.
>
> below is the output of vacuum analyze verbose
> (NOTE: I've already run vacuum this morning, this is a second run)
>
> DETAIL: A total of 58224 page slots are in use (including overhead).
> 58224 page slots are required to track all free space.
> Current limits are: 2000000 page slots, 1000 relations, using 11784 kB.

Which means your FSM settings look fine; but doesn't mean your
database is not bloated (and with many UPDATEs and no correct
vacuuming, it should be bloated). One way to know is to restore a
recent backup, issue VACUUM VERBOSE on a table known to be large
and regularly UPDATE's/DELETE'd on both databases (in production,
and on the restore) and compare the reported number of pages
needed. The difference is the potential benefit of running VACUUM
FULL (or CLUSTER) in production (once your DB is bloated, a
normal VACUUM doesn't remove the bloat).

db_production=# VACUUM VERBOSE table;
[...]
INFO: "table": found 408 removable, 64994 nonremovable row versions in 4395 pages

db_restored=# VACUUM VERBOSE table;
[...]
INFO: "table": found 0 removable, 64977 nonremovable row versions in 628 pages

In that 628/4395 example, we have 85% bloat in production.

--
Guillaume Cottenceau

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Thomas Kellerer 2009-11-20 11:21:11 Re: using window-functions to get freshest value - how?
Previous Message hubert depesz lubaczewski 2009-11-20 11:15:35 Re: using window-functions to get freshest value - how?

Browse pgsql-performance by date

  From Date Subject
Next Message Matthew Wakeling 2009-11-20 11:54:58 Re: SSD + RAID
Previous Message Axel Rau 2009-11-20 11:06:05 Re: SSD + RAID