Re: Autovacuum Improvements

From: Christopher Browne <cbbrowne(at)acm(dot)org>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Autovacuum Improvements
Date: 2006-12-24 02:03:11
Message-ID: 87wt4ijbj4.fsf@wolfe.cbbrowne.com
Lists: pgsql-general pgsql-hackers

After takin a swig o' Arrakan spice grog, nagy(at)ecircle-ag(dot)com (Csaba Nagy) belched out:
> On Thu, 2006-12-21 at 18:41, Alvaro Herrera wrote:
>> > From all the discussion here I think the most benefit would result from
>> > a means to assign tables to different categories, and set up separate
>> > autovacuum rules per category (be it time window when vacuuming is
>> > allowed, autovacuum processes assigned, cost settings, etc). I doubt you
>> > can really define upfront all the vacuum strategies you would need in
>> > real life, so why not let the user define it ? Define the categories by
>> > assigning tables to them, and the rules per category. Then you can
>> > decide what rules to implement, and what should be the defaults...
>>
>> Hmm, yeah, I think this is more or less what I have in mind.
>
> Cool :-)
>
> Can I suggest to also consider the idea of some kind of autovacuum
> process group, with settings like:
>
> - number of processes running in parallel;
> - time windows when they are allowed to run;
>
> Then have the table categories with all the rest of the
> threshold/cost/delay settings.
>
> Then have the possibility to assign tables to categories, and to assign
> categories to processing groups.
>
> I think this would allow the most flexibility with the minimum of
> repetition in settings (from the user perspective).

Seems to me that you could get ~80% of the way by having the simplest
"2 queue" implementation, where tables with size < some threshold get
thrown at the "little table" queue, and tables above that size go to
the "big table" queue.

That should keep any small tables from getting "vacuum-starved."
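
A rough sketch of that two-queue split, in Python purely for illustration. The 10 MB cutoff, the table names and sizes, and the split_by_size helper are all hypothetical; none of this reflects how autovacuum is actually implemented.

```python
from collections import deque

# Hypothetical cutoff; the right value would depend on the installation.
SIZE_THRESHOLD_BYTES = 10 * 1024 * 1024


def split_by_size(tables, threshold=SIZE_THRESHOLD_BYTES):
    """Route (name, size_in_bytes) pairs into a little and a big queue."""
    little, big = deque(), deque()
    for name, size in tables:
        # Tables below the threshold go to the "little table" queue.
        (little if size < threshold else big).append(name)
    return little, big


# Made-up table sizes, for illustration only.
tables = [
    ("sessions", 200 * 1024),
    ("orders", 5 * 1024**3),
    ("config", 8 * 1024),
]
little_queue, big_queue = split_by_size(tables)
```

If each queue were drained by its own worker, a long-running vacuum of "orders" would never hold up "sessions" or "config".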

I'd think the next step would be to increase the number of queues,
perhaps in a time-based fashion. There might be times when it's
acceptable to vacuum 5 tables at once, so you burn thru little tables
"like the blazes," and handle larger ones fairly promptly. And other
times when you don't want to do *any* big tables, and limit a single
queue to just the itty bitty ones.
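
The time-based variant might look something like the following sketch. The schedule, the worker counts, and the workers_at helper are assumptions made up for illustration, not a proposed configuration format.

```python
from datetime import time

# Hypothetical schedule: (window start, little-table workers, big-table workers).
# Each window runs until the next start time; the last one runs to midnight.
SCHEDULE = [
    (time(0, 0), 4, 1),   # quiet hours: five vacuums at once
    (time(6, 0), 1, 0),   # busy hours: one queue, itty bitty tables only
    (time(22, 0), 2, 1),  # evening: ease back into the big tables
]


def workers_at(now, schedule=SCHEDULE):
    """Return (little_workers, big_workers) allowed at wall-clock time `now`."""
    allowed = schedule[0][1:]
    for start, little, big in schedule:
        # Keep the latest window whose start time has already passed.
        if start <= now:
            allowed = (little, big)
    return allowed
```

The only knobs here are a few start times and worker counts, which is about as much policy description as the heuristic approach should need.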

This approach allows you to stay mostly heuristic-based, as opposed to
having to describe policies in gratuitous detail.

Having a mechanism that requires enormous DBA effort and where there
is considerable risk of simple configuration errors that will be hard
to notice may not be the best kind of "feature" :-).
--
let name="cbbrowne" and tld="gmail.com" in name ^ "@" ^ tld;;
http://linuxdatabases.info/info/slony.html
"You can measure a programmer's perspective by noting his attitude on
the continuing vitality of FORTRAN." -- Alan Perlis
