Re: First steps with 8.3 and autovacuum launcher

From: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
To: "Simon Riggs" <simon(at)2ndquadrant(dot)com>
Cc: "Michael Paesold" <mpaesold(at)gmx(dot)at>, "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Gregory Stark" <stark(at)enterprisedb(dot)com>, "Guillaume Smet" <guillaume(dot)smet(at)gmail(dot)com>, "Matthew T(dot) O'Connor" <matthew(at)zeut(dot)net>, "Stefan Kaltenbrunner" <stefan(at)kaltenbrunner(dot)cc>, "PostgreSQL-development" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: First steps with 8.3 and autovacuum launcher
Date: 2007-10-10 10:17:43
Message-ID: 470CA6C7.7070504@enterprisedb.com
Lists: pgsql-hackers

Simon Riggs wrote:
> My thoughts are that it doesn't need to. Typically we create objects and
> then fill them. It isn't that frequent that we would load data, then
> delete or update more than 20% of it, then attempt other DDL.

One scenario that comes to mind is a table that's used in OLTP fashion
during the day, but is taken offline for data loading at night. To
speed up the data loading, indexes are dropped before the load and
recreated afterwards.

Even if there are no dead rows in a table, autovacuum will still kick in
to freeze it at some point.

> If a COPY fails it will create dead rows, which should be cleared up by
> an autoVACUUM. If a COPY fails, the user knows to run a VACUUM or a
> re-TRUNCATE before re-attempting a modified COPY. So there is potential
> for more than one VACUUM to be attempted in that case.

I wish the user didn't have to know to do that.

> So there could be an argument for TRUNCATE causing a cancellation of a
> VACUUM, but I don't see the use case for other DDL. Maybe it would be
> easier to make all conflicting lock requestors cancel VACUUM.

Any VACUUM, or just autovacuum?

The only danger I can see is that the autovacuum is always killed and
never gets to finish, leading to degrading performance at first and
shutdown to prevent xid wraparound at the extreme. Doesn't seem likely
under normal circumstances, though. A scenario that comes to mind is
having very lazy autovacuum settings, so that vacuum of the table takes
longer than 24h, and a daily cron job to run REINDEX.
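
To make that scenario concrete, here is a toy model in Python. It is
not PostgreSQL code: the function name and the numbers are made up for
illustration, and it assumes that a cancelled vacuum gets no credit for
partial work and that the conflicting job arrives at a fixed interval.

```python
def vacuum_ever_finishes(vacuum_hours, cancel_interval_hours, max_days=30):
    """Toy model (hypothetical): a vacuum that needs `vacuum_hours` of
    uninterrupted work is cancelled every `cancel_interval_hours` and
    restarts from zero. Returns True if it completes within `max_days`."""
    elapsed = 0
    limit = max_days * 24
    while elapsed < limit:
        # The vacuum only completes if it fits into one uninterrupted window.
        if vacuum_hours <= cancel_interval_hours:
            return True
        # Otherwise it is cancelled and all progress is assumed lost.
        elapsed += cancel_interval_hours
    return False


# A 30-hour vacuum against a daily REINDEX never completes in this model;
# a 6-hour vacuum does.
slow_settings = vacuum_ever_finishes(30, 24)
normal_settings = vacuum_ever_finishes(6, 24)
```

Under those assumptions the outcome depends only on whether the vacuum
fits between two cancellations, which is why the problem needs very lazy
settings to show up at all.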

The "priority inheritance" scheme I proposed earlier would work well
with that: instead of killing the autovacuum, set its cost delay to zero
so that it finishes and gets out of the way ASAP. It has its own set of
problems, though. An innocent-looking DROP INDEX would cause the
autovacuum to go full steam ahead, hurting performance for others.
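
A rough sketch of the trade-off, again as a toy Python model rather than
anything resembling the real cost-based delay code. The function name,
the time units and the one-unit-per-page cost are all assumptions made
up for illustration.

```python
def vacuum_finish_time(pages, delay_per_page, blocker_arrives_at):
    """Toy model (hypothetical): each page costs 1 time unit of work plus
    `delay_per_page` units of throttling sleep. Once a conflicting lock
    requester is waiting (at `blocker_arrives_at`), the delay drops to zero.
    Returns (finish_time, time_the_blocker_waited)."""
    now = 0.0
    work_per_page = 1.0  # assumed cost of processing one page
    for _ in range(pages):
        # "Priority inheritance": stop throttling while someone is waiting.
        delay = 0.0 if now >= blocker_arrives_at else delay_per_page
        now += work_per_page + delay
    blocker_wait = max(0.0, now - blocker_arrives_at)
    return now, blocker_wait


# 10 pages, 4 units of delay per page, DDL shows up at t=10.
with_boost = vacuum_finish_time(10, 4.0, 10.0)
# Same vacuum when nobody ever asks for a conflicting lock.
undisturbed = vacuum_finish_time(10, 4.0, float("inf"))
```

In this model the boosted vacuum finishes much earlier than the
undisturbed one, but all of the remaining pages are processed with no
throttling at all, which is the "full steam ahead" cost to everyone else.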

> I think it would be helpful if user-initiated VACUUMs waited behind
> another VACUUM that was already in progress on the table and then
> returned immediately as successful when the first VACUUM finishes. That
> would seem better than queuing up behind the first VACUUM and then
> repeating the process.

I don't think that's a good idea. The second VACUUM wouldn't be a no-op,
it would clean up any dead rows accumulated during the first VACUUM.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
