Re: scheduling autovacuum at lean hours only.

From: Rajesh Kumar Mallah <mallah(dot)rajesh(at)gmail(dot)com>
To: Chris Browne <cbbrowne(at)acm(dot)org>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: scheduling autovacuum at lean hours only.
Date: 2009-02-14 11:58:32
Message-ID: a97c77030902140358o3e723ea4h14ccdcff747d0da5@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Its nice to know the evolution of autovacuum and i understand that
the suggestion/requirement of "autovacuum at lean hours only"
was defeating the whole idea.

regds
--rajesh kumar mallah.

On Fri, Feb 13, 2009 at 11:07 PM, Chris Browne <cbbrowne(at)acm(dot)org> wrote:
> mallah(dot)rajesh(at)gmail(dot)com (Rajesh Kumar Mallah) writes:
>> why is it not a good idea to give end users control over when they
>> want to run it ?
>
> It's not a particularly good idea to give end users things that they
> are likely then to *immediately* use to shoot themselves in the foot.
>
> Turning off vacuuming "all day" is the sort of thing that is indeed
> pretty certain to hurt you when you imagined it was going to help you.
>
> In particular, if you shut off autovac all day, heavily updated tables
> with certain sorts of (pretty common!) update patterns are certain to
> "bloat up," to the point that you'll need to do CLUSTER/VACUUM FULL on
> them.
>
> In effect, the practical effect of "autovacuum at lean hours only" is
> more reasonably described as "cancel autovacuum and revert to the
> elder policy of requiring users to do manual vacuuming."
>
> It's worth looking at how autovacuum has been evolving over time...
>
> - When it was introduced, 8.0-ish (maybe it was 8.1 when it became
> "official"), it was pretty primitive.
>
> Autovac was a single process, where you had three controls over
> behaviour:
>
> - You could run autovac, or not;
>
> - You could exclude specific tables from being processed by autovac
>
> - There is a capability to vacuum less aggressively by using
> delays to reduce autovac I/O usage
>
> - In 8.3, it was enhanced to add the notion of having multiple vacuum
> workers
>
> There was discussion about having one of those workers restrict
> itself to small tables, so that you'd never have the scenario where
> the workers were all busy and a small table that needed vacuuming
> was left unvacuumed for a long time. It appears that didn't happen,
> which seems unfortunate, but that's life...
>
> You should look at all the "knobs" that *are* offered before deciding
> a policy that may be harmful to performance. As things stand now,
> there are a couple of ways I could see tuning this:
>
> - You might check on the GUC variables autovacuum_vacuum_cost_delay
> and autovacuum_vacuum_cost_limit, which would allow you to restrict
> the I/O cost.
>
> This might allow you to run autovacuum all the time without
> adversely affecting performance.
>
> - You might come up with a list of the *LARGE* tables that you don't
> want vacuumed during the day, and set up a cron job that adds/drops
> them from the pg_autovacuum table at the appropriate times.
>
> This is definitely going to be more finicky, and requires a great
> deal more awareness of the tables being updated by your
> applications. It makes "autovacuum" a whole lot less "automatic."
>
> There are known enhancements coming up:
>
> - In 8.4, there is a capability for VACUUM to only process the
> portions of the table known to have been altered.
>
> That ought to be a better answer than *any* of the fiddling
> suggested, to date. Right now, a VACUUM on "public.my_huge_table",
> a table 18GB in size, will walk through the entire table, even
> though there were only a handful of pages where tuples were
> invalidated.
>
> This is almost certainly the single best improvement possible to
> resolve your issue; it seems likely to *directly* address the
> problem, and has the considerable merit of not requiring much if
> any configuration/reconfiguration/scheduling.
> --
> (reverse (concatenate 'string "gro.mca" "@" "enworbbc"))
> http://linuxdatabases.info/info/
> "what would we do without C? we would have PASAL, BASI, OBOL, and
> Ommon Lisp." -- #Erik
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance
>

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Peter G. 2009-02-14 16:11:43 Retrieving data from PostgreSQL to .NET application – performance test – surprising results
Previous Message Craig Ringer 2009-02-14 08:49:05 Re: I/O increase after upgrading to 8.3.5