Re: autovacuum next steps, take 2

From: "Matthew T(dot) O'Connor" <matthew(at)zeut(dot)net>
To: "Jim C(dot) Nasby" <jim(at)nasby(dot)net>, "Matthew T(dot) O'Connor" <matthew(at)zeut(dot)net>, Hackers <pgsql-hackers(at)postgresql(dot)org>, Ron Mayer <rm_pg(at)cheapcomplexdevices(dot)com>, Gregory Stark <stark(at)enterprisedb(dot)com>
Subject: Re: autovacuum next steps, take 2
Date: 2007-02-26 21:16:01
Message-ID: 45E34E11.9000702@zeut.net
Lists: pgsql-hackers

Alvaro Herrera wrote:
> Jim C. Nasby wrote:
>
>> That's why I'm thinking it would be best to keep the maximum size of
>> stuff for the second worker small. It probably also makes sense to tie
>> it to time and not size, since the key factor is that you want it to hit
>> the high-update tables every X number of seconds.
>>
>> If we wanted to get fancy, we could factor in how far over the vacuum
>> threshold a table is, so even if the table is on the larger size, if
>> it's way over the threshold the second vacuum will hit it.
>
> Ok, I think we may be actually getting somewhere.

Me too.

> I propose to have two different algorithms for choosing the tables to
> work on. The worker would behave differently, depending on whether
> there is one or more workers on the database already or not.
>
> The first algorithm is the plain threshold equation stuff we use today.
> If a worker connects and determines that no other worker is in the
> database, it uses the "plain worker" mode. A worker in this mode would
> examine pgstats, determine what tables to vacuum/analyze, sort them by
> size (smaller to larger), and goes about its work. This kind of worker
> can take a long time to vacuum the whole database -- we don't impose any
> time limit or table size limit to what it can do.

Right, I like this.

> The second mode is the "hot table worker" mode, enabled when the worker
> detects that there's already a worker in the database. In this mode,
> the worker is limited to those tables that can be vacuumed in less than
> autovacuum_naptime, so large tables are not considered. Because of
> this, it'll generally not compete with the first mode above -- the
> tables in plain worker were sorted by size, so the small tables were
> among the first vacuumed by the plain worker. The estimated time to
> vacuum may be calculated according to autovacuum_vacuum_delay settings,
> assuming that all pages constitute cache misses.

How can you determine what tables can be vacuumed within
autovacuum_naptime? I agree that large tables should be excluded, but I
don't know how we can do that calculation based on autovacuum_naptime.
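
For illustration only, one way the estimate described in the quoted paragraph might be computed is sketched below. It assumes the existing cost-based vacuum delay model (every page charged as a miss, one sleep each time the accumulated cost reaches the limit) and counts only the sleep time, so it is a lower bound that ignores the actual I/O. The function names and the default values of 10 and 200 are assumptions for this sketch, not anything proposed in the thread.

```python
def estimated_vacuum_seconds(rel_pages, cost_delay_ms,
                             cost_limit=200, cost_page_miss=10):
    """Lower-bound estimate: time spent sleeping in cost-based delay only.

    Assumes every page is a cache miss. cost_limit and cost_page_miss
    are illustrative stand-ins for the vacuum cost settings.
    """
    if cost_delay_ms <= 0 or cost_limit <= 0:
        # Throttling disabled: this model predicts no sleep time at all.
        return 0.0
    total_cost = rel_pages * cost_page_miss
    sleeps = total_cost // cost_limit      # one sleep per cost_limit of work
    return sleeps * cost_delay_ms / 1000.0


def fits_in_naptime(rel_pages, naptime_s, cost_delay_ms,
                    cost_limit=200, cost_page_miss=10):
    """True if the estimated vacuum time is shorter than autovacuum_naptime."""
    est = estimated_vacuum_seconds(rel_pages, cost_delay_ms,
                                   cost_limit, cost_page_miss)
    return est < naptime_s
```

Under these assumptions a 2000-page table with a 20 ms delay is estimated at 2 seconds, while a 100000-page table is estimated at 100 seconds and would be excluded with a 60-second naptime. Note that with the delay set to 0 the estimate is 0 for every table, so this model alone cannot tell large tables from small ones in that case.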

So at:
t=0*autovacuum_naptime: worker1 gets started on DBX
t=1*autovacuum_naptime: worker2 gets started on DBX
    worker2 determines all tables that need to be vacuumed,
    worker2 excludes tables that are too big from its to-do list,
    worker2 gets started working,
    worker2 exits when it either:
        a) Finishes its entire to-do list.
        b) Catches up to worker1.
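
The worker2 steps above could be sketched roughly as follows. This is a toy model, not backend code: `Table`, `max_pages`, `worker1_current` and `vacuum` are all hypothetical names, the "too big" test is reduced to a plain page cutoff, and processing worker2's list smallest-first is an assumption borrowed from the plain worker's ordering.

```python
from dataclasses import dataclass


@dataclass
class Table:
    name: str
    pages: int
    needs_vacuum: bool


def run_hot_table_worker(tables, max_pages, worker1_current, vacuum):
    """Toy model of worker2; returns the names of the tables it vacuumed.

    worker1_current: callable returning the name of the table worker1
                     is vacuuming right now (hypothetical hook).
    vacuum:          callable that vacuums one table (hypothetical hook).
    """
    # Determine tables needing vacuum, excluding those that are too big.
    todo = sorted(
        (t for t in tables if t.needs_vacuum and t.pages <= max_pages),
        key=lambda t: t.pages,
    )
    done = []
    for t in todo:
        if t.name == worker1_current():
            break              # exit case (b): caught up to worker1
        vacuum(t)
        done.append(t.name)
    return done                # loop ran to the end: exit case (a)
```

For example, if worker1 is currently on a 500-page table and the cutoff is 1000 pages, worker2 would vacuum the smaller tables that need it and stop on reaching the table worker1 holds.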

I think the questions are 1) What is the exact math you are planning on
using to determine which tables are too big? 2) Do we want worker2 to
exit when it catches worker1 or does the fact that we have excluded
tables that are "too big" mean that we don't have to worry about this?
