Re: scheduling autovacuum at lean hours only.

From: Scott Carey <scott(at)richrelevance(dot)com>
To: Ben <bench(at)silentmedia(dot)com>, Rajesh Kumar Mallah <mallah(dot)rajesh(at)gmail(dot)com>
Cc: PostgreSQL Performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: scheduling autovacuum at lean hours only.
Date: 2009-02-11 17:23:25
Message-ID: BDFBB77C9E07BE4A984DAAE981D19F961ADD4C5209@EXVMBX018-1.exch018.msoutlookonline.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On a large partitioned database, ordinary vacuum is a very very difficult option.

Most of the time on such a system, most tables are dormant with respect to writes and never need to be vacuumed. A 'daily vacuum' is likely to take a full day to run on larger systems. Since ordinary vacuum can't be run on subsets of tables without explicitly naming them one at a time (can't just vacuum a schema, tablespace, or use a regex to match table names), good luck using it effectively if you have a few thousand tables in partitions. You'll have to have application code or a script with knowledge of all the partition names and which are in need of an analyze/vacuum.

Autovacuum is good enough in recent versions to be tuned to have very low impact though. If you have to, rather than stop and start it, just turn the delay or cost settings up and down during different parts of the day. More than likely however, it will be able to keep up with a single set of settings.
In particular, rather than making the delay longer, make the appropriate cost larger -- page miss or page dirty affect how much I/O it will do, and page hit will mostly affect how much CPU it uses.

Perhaps a feature request is to have a version of the manual vacuum command that doesn't bother running on tables that autovacuum wouldn't touch due to insufficient data churn. This way, at lean hours one can manually vacuum to help an autovacuum that was tuned for very low impact 'catch up'.
Also, if there was some way to make vacuum not stink so badly on tables that were just loaded with pg_load, where it causes huge write activity for tables that clearly have no bloat (I believe this is a hint bits thing?).
________________________________________
From: pgsql-performance-owner(at)postgresql(dot)org [pgsql-performance-owner(at)postgresql(dot)org] On Behalf Of Ben [bench(at)silentmedia(dot)com]
Sent: Wednesday, February 11, 2009 8:46 AM
To: Rajesh Kumar Mallah
Cc: PostgreSQL Performance
Subject: Re: [PERFORM] scheduling autovacuum at lean hours only.

On Feb 11, 2009, at 6:57 AM, Rajesh Kumar Mallah wrote:

> why is it not a good idea to give end users control over when they
> want to run it ?

There's nothing stopping you from just turning off autovacuum and
running vacuum manually. The point of autovacuum is to vacuum "as
needed."

--
Sent via pgsql-performance mailing list (pgsql-performance(at)postgresql(dot)org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Rajesh Kumar Mallah 2009-02-11 17:27:27 Re: scheduling autovacuum at lean hours only.
Previous Message Ben 2009-02-11 16:46:54 Re: scheduling autovacuum at lean hours only.