Re: Autovacuum loose ends

From: "Matthew T(dot) O'Connor" <matthew(at)zeut(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: Autovacuum loose ends
Date: 2005-07-14 16:57:39
Message-ID: 42D69983.4020908@zeut.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-patches

This is great news! I will do what I can to continue improving the code
and address these concerns as best I can. Many of the items below will
need to be addressed by Alvaro, but I will comment where I think I have
something useful to say :-)

Tom Lane wrote:

>I've applied Alvaro's latest integrated-autovacuum patch. There are
>still a number of loose ends to be dealt with before beta, though:
>
>* Not all the functionality of the current contrib code seems to have
>made it in. In particular I noted the "sleep scaling factor" is
>missing, as well as the options to use nondefault vacuum_cost_delay
>settings. (I'm not sure how important the sleep scale factor is,
>but the vacuum cost options seem pretty critical for practical use.)
>There may be other stuff to move over; Matthew or someone more familiar
>than I with the contrib version needs to take a look. (I have refrained
>from removing the contrib module until we're sure we have extracted
>everything from it.)
>
>

I will take a look for missing features, thanks for not removing it from
contrib yet.

As for the sleep factor I'm not sure it makes sense. It was initially
put in as a way to prevent autovacuum from running more than X% of the
time. However, I think the better answer these days is to use the vacuum
delay settings.

Speaking of which, I think I mentioned this to Alvaro, but I guess it
just didn't make it in. The pg_autovacuum table should have a few
additional columns that allow setting vacuum delay settings on a per
table basis. I also think that there should be GUC settings for the
default autovacuum delay settings which an admin might want to be
separate from the system wide default vacuum delay settings.

>* The code does not make a provision to ignore temporary tables.
>Although vacuum.c and analyze.c will disregard the request to touch
>such tables, it'd probably be better to recognize the situation further
>upstream. In particular it seems that autovacuum will continually throw
>ANALYZE requests for a temp table due to lack of stats.
>
>

Does the stats system track data about temp tables? If it doesn't then
autovacuum won't try to vacuum them. Will take a look.

>* ANALYZE also refuses to do anything with pg_statistic itself, which
>is another case that may need special treatment to avoid useless cycles.
>
>

Should be easy enough to tell autovacuum to ignore this table specifically.

>* For that matter I'm unconvinced that it's a good idea to try to force
>the pgstat DB to pick up every table in every database. If there's no
>entry it's because the table is not getting modified, and therefore it
>seems to me that we can just leave well enough alone. The code really
>is not very good about doing nothing where nothing is called for ;-)
>
>

I think in a production environment, this won't be an issue, but in a
development situation where the postmaster is getting stopped and
started fairly often, it could be an issue. Actually, if the stats
system doesn't reset it's data on postmaster restart, this shouldn't be
a problem. Any thoughts on changing this default?

>* The code ignores datallowconn and therefore will periodically vacuum
>template0. I've got mixed emotions about this --- it could save
>someone's bacon if they failed to properly VACUUM FREEZE a template
>database, but in 99.99% of installations it's just wasted cycles.
>Maybe it'd make sense to perform XID-wraparound-prevention vacuuming,
>but not anything more, in a template DB. Thoughts?
>
>

Sounds like a good idea. Bacon conservation is clearly one of the goals
of autovacuum.

>* Or actually, it would vacuum template0, except that since no regular
>backend ever connects to template0, there will be no stats DB entry for
>it and so the loop in AutoVacMain will ignore it. This is definitely
>BAD as it means that a database that's not been touched since postmaster
>start will never be vacuumed, not even for XID wraparound prevention.
>That test needs to be weakened.
>
>* I'm still pretty concerned about the handling of shared catalogs.
>AFAICS the current pgstats infrastructure simply gets this wrong,
>meaning that shared catalogs will not get adequate vacuuming. We need
>to fix that.
>
>

This was handled in the contrib version by only vacuuming shared
catalogs inside template1, however it would then analyze those tables in
each and every database. Is there a reason this solution is not
adequate? Or perhaps this concept doesn't translate to the integrated
version?

>* As Alvaro noted, the default parameter settings need a lookover.
>What is in the patch is not what was the default in the contrib module,
>but the contrib defaults seem awfully passive.
>
>

Alvaro and I talked about this. I suggested these as the new defaults
as there seemed to be a consensus that the defaults in the contrib
version were not very useful for most people. Hopefully these defaults
still a bit conservative, but useful.

>* The documentation badly needs work. I committed some minimal
>additions to runtime.sgml and catalogs.sgml, but the chapter about
>routine maintenance needs a section added about how to use autovac.
>

I promised Alvaro that I would do all the documentation. I will work on
it in the next few days now that the patch has been applied.

Thanks!

Matthew O'Connor

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2005-07-14 17:06:20 Re: Autovacuum loose ends
Previous Message Alvaro Herrera 2005-07-14 16:30:41 Re: Question about convert the binary WAL file

Browse pgsql-patches by date

  From Date Subject
Next Message Alvaro Herrera 2005-07-14 17:06:20 Re: Autovacuum loose ends
Previous Message Titus von Boxberg 2005-07-14 14:54:07 Re: PL/PGSQL: Dynamic Record Introspection