Re: scheduling autovacuum at lean hours only.

From: Ben Chobot <bench(at)silentmedia(dot)com>
To: Scott Carey <scott(at)richrelevance(dot)com>
Cc: Rajesh Kumar Mallah <mallah(dot)rajesh(at)gmail(dot)com>, PostgreSQL Performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: scheduling autovacuum at lean hours only.
Date: 2009-02-11 17:57:43
Message-ID: Pine.LNX.4.64.0902110955110.10022@localhost.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Wed, 11 Feb 2009, Scott Carey wrote:

> On a large partitioned database, ordinary vacuum is a very very difficult option.
>
> Most of the time on such a system, most tables are dormant with respect to writes and never need to be vacuumed. A 'daily vacuum' is likely to take a full day to run on larger systems. Since ordinary vacuum can't be run on subsets of tables without explicitly naming them one at a time (can't just vacuum a schema, tablespace, or use a regex to match table names), good luck using it effectively if you have a few thousand tables in partitions. You'll have to have application code or a script with knowledge of all the partition names and which are in need of an analyze/vacuum.
>
> Autovacuum is good enough in recent versions to be tuned to have very low impact though. If you have to, rather than stop and start it, just turn the delay or cost settings up and down during different parts of the day. More than likely however, it will be able to keep up with a single set of settings.
> In particular, rather than making the delay longer, make the appropriate cost larger -- page miss or page dirty affect how much I/O it will do, and page hit will mostly affect how much CPU it uses.
>
> Perhaps a feature request is to have a version of the manual vacuum command that doesn't bother running on tables that autovacuum wouldn't touch due to insufficient data churn. This way, at lean hours one can manually vacuum to help an autovacuum that was tuned for very low impact 'catch up'.
> Also, if there was some way to make vacuum not stink so badly on tables that were just loaded with pg_load, where it causes huge write activity for tables that clearly have no bloat (I believe this is a hint bits thing?).

Oh, I agree with everything you say. I'm just pointing out that if you
really do want control over when things get vacuumed (e.g. you have a
mostly-read database 20 hours a day and then 4 hours of heavy churn at
night) then you can still do that if you want.

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Brad Nicholson 2009-02-11 18:00:03 Re: scheduling autovacuum at lean hours only.
Previous Message Rajesh Kumar Mallah 2009-02-11 17:27:27 Re: scheduling autovacuum at lean hours only.