Re: autovacuum next steps, take 2

From: "Matthew T(dot) O'Connor" <matthew(at)zeut(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Jim C(dot) Nasby" <jim(at)nasby(dot)net>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Hackers <pgsql-hackers(at)postgresql(dot)org>, Ron Mayer <rm_pg(at)cheapcomplexdevices(dot)com>, Gregory Stark <stark(at)enterprisedb(dot)com>
Subject: Re: autovacuum next steps, take 2
Date: 2007-02-27 03:26:49
Message-ID: 45E3A4F9.10700@zeut.net
Lists: pgsql-hackers

Tom Lane wrote:
> "Jim C. Nasby" <jim(at)nasby(dot)net> writes:
>> The real problem is trying to set that up in such a fashion that keeps
>> hot tables frequently vacuumed;
>
> Are we assuming that no single worker instance will vacuum a given table
> more than once? (That's not a necessary assumption, certainly, but
> without it there are so many degrees of freedom that I'm not sure how
> it should act.) Given that assumption, the maximum vacuuming rate for
> any table is once per autovacuum_naptime, and most of the magic lies in
> the launcher's algorithm for deciding which databases to launch workers
> into.

Yes, I have been working under the assumption that a worker goes through
the list of tables once and exits, and yes, the maximum vacuuming rate
for any table would be once per autovacuum_naptime. We can lower the
default if necessary; as far as I'm concerned it is (or should be) fairly
cheap to fire off a worker and have it find that there isn't anything
to do and exit.

> I'm inclined to propose an even simpler algorithm in which every worker
> acts alike; its behavior is
> 1. On startup, generate a to-do list of tables to process, sorted in
> priority order.
> 2. For each table in the list, if the table is still around and has not
> been vacuumed by someone else since you started (including the case of
> a vacuum-in-progress), then vacuum it.

That is what I'm proposing, with one difference: when you catch up
to an older worker, exit. This has the benefit of reducing the number of
workers concurrently working on big tables, which I think is a good thing.
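
To make the difference concrete, here is a rough sketch of the worker loop
with the "exit when you catch up to an older worker" rule added. It is only
an illustration, not backend code: the catalog dict, the busy map (table name
to the start time of the worker vacuuming it) and the vacuum callback are all
made-up names, and how a worker would actually learn these things is left open.

```python
def run_worker(start_time, catalog, busy, vacuum):
    """Process tables once in priority order, then exit.

    catalog: dict of table name -> {"priority": int, "last_vacuum": float}
             (hypothetical stand-in for whatever the worker can see)
    busy:    dict of table name -> start time of the worker vacuuming it
             (hypothetical; assumes in-progress vacuums are visible)
    vacuum:  callable(name) -> completion timestamp (hypothetical)
    Returns the list of tables this worker vacuumed.
    """
    # Step 1: build the to-do list once, highest priority first.
    todo = sorted(catalog, key=lambda n: catalog[n]["priority"], reverse=True)
    done = []
    for name in todo:
        info = catalog.get(name)
        if info is None:
            continue  # table was dropped after the list was built
        if name in busy:
            if busy[name] < start_time:
                break  # caught up to an older worker: exit
            continue  # a vacuum is in progress elsewhere: skip the table
        if info["last_vacuum"] >= start_time:
            continue  # someone else vacuumed it since we started
        info["last_vacuum"] = vacuum(name)
        done.append(name)
    return done
```

With an older worker busy on the second-priority table, this worker would
vacuum only the first table and then exit instead of moving on to the rest
of the list.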

> Detecting "already vacuumed since you started" is a bit tricky; you
> can't really rely on the stats collector since its info isn't very
> up-to-date. That's why I was thinking of exposing the to-do lists
> explicitly; comparing those with an advertised current-table would
> allow accurate determination of what had just gotten done.

Sounds good, but I have very little insight into how we would implement
"already vacuumed since you started" or "have I caught up to another
worker".
