Re: Autovacuum integration

From: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: Autovacuum integration
Date: 2005-07-12 23:34:55
Message-ID: 20050712233455.GB15464@alvh.no-ip.org
Lists: pgsql-patches

On Fri, Jul 08, 2005 at 03:56:25PM -0400, Tom Lane wrote:
> Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> writes:
> > Here is a second attempt at autovacuum integration.
>
> A few comments:

Ok, here is an updated patch. Hopefully I have covered most of your
more important observations. In particular, I changed the shutdown
sequence per your comments, and the pg_autovacuum tuple is optional.

Additional comments:

> * I see you have an autovac_init function to "annoy the user", but
> shouldn't this be checked every time we are about to spawn an autovac
> process?

I didn't do anything about this (i.e. it only happens once). Note that
if we annoy the user because of this, the autovacuum process is
disabled "forever."

> * I don't see any special checks for shared catalogs, which means they
> are probably going to be over-vacuumed; or possibly under-vacuumed if
> you fail to track the update stats for them in a single place rather
> than in each database.

I'm still not doing anything special about shared relations. I think it
would be easy to treat them in a special way.

> * I have no objection to adding extra entry points to vacuum.c to
> simplify the calls to it.

I didn't do it, because it uglified the code. Rather, I added a relid
member to VacuumStmt.
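
To illustrate the idea of carrying the target relation by OID inside the
statement node, here is a minimal sketch. It is not the code from the
patch: the struct, the enum, and every name prefixed with "Sketch" or
"sketch_" are hypothetical, and the real VacuumStmt has more fields than
shown here.

```c
#include <stddef.h>

/* Hypothetical stand-ins; the real backend has its own Oid type. */
typedef unsigned int SketchOid;
#define SKETCH_INVALID_OID ((SketchOid) 0)

/* Simplified, assumed shape of a vacuum statement node. */
typedef struct VacuumStmtSketch
{
    const char *relname;    /* set by a user-issued command, else NULL */
    SketchOid   relid;      /* set by autovacuum, else SKETCH_INVALID_OID */
} VacuumStmtSketch;

typedef enum
{
    SKETCH_ALL_TABLES,      /* no target given: process every table */
    SKETCH_BY_NAME,         /* look the table up by name */
    SKETCH_BY_OID           /* table already identified by OID */
} SketchTarget;

/* Decide how the target table should be located; the OID wins if set. */
static SketchTarget
sketch_vacuum_target(const VacuumStmtSketch *stmt)
{
    if (stmt->relid != SKETCH_INVALID_OID)
        return SKETCH_BY_OID;
    if (stmt->relname != NULL)
        return SKETCH_BY_NAME;
    return SKETCH_ALL_TABLES;
}
```

The point of the extra member is that the caller that already knows the
OID does not need a separate entry point, nor a name lookup.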

> If ANALYZE needs to send something to the stats system, make it do
> so.

It does, as does VACUUM. I still think we should do something special
on TRUNCATE, maybe send a special message.

Note that I keep track of dead tuples directly in the stats for each
table, rather than keeping track of "last vacuum tuples", which was a
strange concept anyway. It came out being much simpler this way. The
only consideration is that it makes the vacuum case different from
analyze, but I don't see that as a problem.
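
A rough sketch of what tracking dead tuples directly looks like follows.
This is an illustration under assumptions, not the code in the patch: the
struct, the function names, and the threshold formula (a base value plus
a scale factor times the live tuple count) are all hypothetical, and the
assumption that ANALYZE leaves the dead-tuple count alone is mine.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-table counters; not the real stats entry layout. */
typedef struct TableStatsSketch
{
    int64_t n_live_tuples;
    int64_t n_dead_tuples;
} TableStatsSketch;

static void
sketch_count_insert(TableStatsSketch *s)
{
    s->n_live_tuples++;
}

/* An update leaves the old row version behind as a dead tuple. */
static void
sketch_count_update(TableStatsSketch *s)
{
    s->n_dead_tuples++;
}

/* A delete turns one live tuple into a dead one. */
static void
sketch_count_delete(TableStatsSketch *s)
{
    s->n_live_tuples--;
    s->n_dead_tuples++;
}

/* VACUUM reports in: dead tuples are gone, live count is refreshed. */
static void
sketch_report_vacuum(TableStatsSketch *s, int64_t live_after)
{
    s->n_live_tuples = live_after;
    s->n_dead_tuples = 0;
}

/* ANALYZE reports in: assumed here to refresh only the live count. */
static void
sketch_report_analyze(TableStatsSketch *s, int64_t live_estimate)
{
    s->n_live_tuples = live_estimate;
}

/* Assumed threshold test: vacuum once dead tuples exceed base + scale * live. */
static bool
sketch_needs_vacuum(const TableStatsSketch *s, int64_t base, double scale)
{
    double threshold = (double) base + scale * (double) s->n_live_tuples;

    return (double) s->n_dead_tuples > threshold;
}
```

With the dead-tuple count kept directly, the vacuum decision needs only
the current counters, with no "tuples at last vacuum" baseline to
subtract.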

Also, there are tables for which analyze refuses to run. I'm not sure
what to do about them. The problem is that since ANALYZE doesn't run to
completion, it doesn't emit the stat message, so we try to analyze it
the next time around. I considered sending special messages to the
stats, telling it not to analyze the table in the future (vacuum still
works as expected). However, I don't see how we would re-enable the
auto-analyze feature in case something happens to the table. There are
two cases: pg_statistic is one, and the other is tables that don't have
any analyzable columns (there is at least one table of this kind in the
regression tests, comprising one "box" column). This may turn out not to
be a problem, since ANALYZE will return very quickly in this case, but
it annoys me anyway.

Finally: I didn't do anything special about TOAST tables yet. I think
this is a separate problem.

--
Alvaro Herrera (<alvherre[a]alvh.no-ip.org>)
Thou shalt study thy libraries and strive not to reinvent them without
cause, that thy code may be short and readable and thy days pleasant
and productive. (7th Commandment for C Programmers)

Attachment Content-Type Size
autovacuum-6.patch text/plain 73.3 KB
