Skip site navigation (1) Skip section navigation (2)

Re: Random performance hit, unknown cause.

From: Brian Fehrle <brianf(at)consistentstate(dot)com>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: Claudio Freire <klaussfreire(at)gmail(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Random performance hit, unknown cause.
Date: 2012-04-12 21:31:21
Message-ID: 4F8749A9.9080309@consistentstate.com (view raw or flat)
Thread:
Lists: pgsql-performance
Interesting, that is very likely.

In this system I have a table that is extremely active. On a 'normal' 
day, the autovacuum process takes about 7 hours to complete on this 
table, and once it's complete, the system performs an autoanalyze on the 
table, finding that we have millions of new dead rows. Once this 
happens, it kicks off the autovacuum again, so we basically always have 
a vacuum running on this table at any given time.

If I were to tweak the autovacuum_vacuum_cost_delay parameter, what 
would that be doing? Would it be limiting what the current autovacuum is 
allowed to do? Or does it simply space out the time between autovacuum 
runs? In my case, with 7 hour long autovacuums (sometimes 14 hours), a 
few milliseconds between each vacuum wouldn't mean anything to me.

If that parameter does limit the amount of work autovacuum can do, It 
may cause the system to perform better at that time, but would prolong 
the length of the autovacuum right? That's an issue I'm already having 
issue with, and wouldn't want to make the autovacuum any longer if I 
don't need to.

- Brian F


On 04/12/2012 01:52 PM, Kevin Grittner wrote:
> Claudio Freire<klaussfreire(at)gmail(dot)com>  wrote:
>> On Thu, Apr 12, 2012 at 3:41 PM, Brian Fehrle
>> <brianf(at)consistentstate(dot)com>  wrote:
>>> This morning, during our nightly backup process (where we grab a
>>> copy of the data directory), we started having this same issue.
>>> The main thing that I see in all of these is a high disk wait on
>>> the system. When we are performing 'well', the %wa from top is
>>> usually around 30%, and our load is around 12 - 15. This morning
>>> we saw a load  21 - 23, and an %wa jumping between 60% and 75%.
>>>
>>> The top process pretty much at all times is the WAL Sender
>>> Process, is this normal?
>> Sounds like vacuum to me.
>
> More particularly, it seems consistent with autovacuum finding a
> large number of tuples which had reached their freeze threshold.
> Rewriting the tuple in place with a frozen xmin is a WAL-logged
> operation.
>
> -Kevin


In response to

Responses

pgsql-performance by date

Next:From: Kevin GrittnerDate: 2012-04-12 22:00:13
Subject: Re: Random performance hit, unknown cause.
Previous:From: Claudio FreireDate: 2012-04-12 19:59:10
Subject: Re: Random performance hit, unknown cause.

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group