Re: autovacuum not prioritising for-wraparound tables

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Christopher Browne <cbbrowne(at)gmail(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: autovacuum not prioritising for-wraparound tables
Date: 2013-01-25 17:19:25
Message-ID: CA+TgmobbmVWOHRqEDZ8VAq67tLB1J_5wpA9HaWLR81Cb-9n5Hw@mail.gmail.com
Lists: pgsql-hackers

On Fri, Jan 25, 2013 at 11:51 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> The floor(log(size)) part seems like it will have rather arbitrary
> behavioral shifts when a table grows just past a log boundary. Also,
> I'm not exactly sure whether you're proposing smaller tables first or
> bigger tables first, nor that either of those orderings is a good thing.
>
> I think sorting by just age(relfrozenxid) for for-wraparound tables, and
> just the n_dead_tuples measurement for others, is probably reasonable
> for now. If we find out that has bad behaviors then we can look at how
> to fix them, but I don't think we have enough understanding yet of what
> the bad behaviors might be.

Which is exactly why back-patching this is not a good idea, IMHO. We
could easily run across a system where pg_class order happens to be
better than anything else we come up with. Such changes are expected
in new major versions, but not in maintenance releases.

I think that to do this right, we need to consider not only the status
quo but the trajectory. For example, suppose we have two tables to
process, one of which needs a wraparound vacuum and the other one of
which needs dead tuples removed. If the table needing the wraparound
vacuum is small and just barely over the threshold, it isn't urgent;
but if it's large and way over the threshold, it's quite urgent.
Similarly, if the table which needs dead tuples removed is rarely
updated, postponing vacuum is not a big deal, but if it's being
updated like crazy, postponing vacuum is a big problem. Categorically
putting for-wraparound tables ahead of everything else seems
simplistic, and assuming that more dead tuples is more urgent than
fewer dead tuples seems *extremely* simplistic.
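
To make the trajectory point concrete, here is a rough Python sketch of
ranking tables by how long each can safely wait, rather than by current
state alone. This is not a proposed patch and not how autovacuum works;
the TableState fields, the rate estimates, dead_tuple_limit, and the
slack formula are all hypothetical assumptions chosen for illustration.

```python
from dataclasses import dataclass

# Hypothetical illustration only: field names, rates, and formulas below are
# assumptions for this sketch, not PostgreSQL's autovacuum logic.

XID_WRAPAROUND_LIMIT = 2_000_000_000  # illustrative round number near 2^31


@dataclass
class TableState:
    name: str
    xid_age: int             # age(relfrozenxid)
    xid_rate: float          # XIDs consumed per second (assumed measurable)
    vacuum_seconds: float    # estimated vacuum duration; grows with table size
    dead_tuples: int         # current dead tuple count
    dead_tuple_rate: float   # dead tuples created per second (assumed measurable)
    dead_tuple_limit: int    # dead tuples tolerated before bloat hurts (assumed)


def seconds_of_slack(t: TableState) -> float:
    """Estimated time vacuum can be postponed; smaller means more urgent."""
    inf = float("inf")
    # Time until wraparound, minus the time the vacuum itself would take.
    xids_left = XID_WRAPAROUND_LIMIT - t.xid_age
    wrap_slack = xids_left / t.xid_rate - t.vacuum_seconds if t.xid_rate > 0 else inf
    # Time until dead tuples exceed what this table can tolerate.
    tuples_left = t.dead_tuple_limit - t.dead_tuples
    bloat_slack = tuples_left / t.dead_tuple_rate if t.dead_tuple_rate > 0 else inf
    return min(wrap_slack, bloat_slack)


def vacuum_order(tables):
    """Most urgent table first."""
    return sorted(tables, key=seconds_of_slack)


tables = [
    # Small table barely past the freeze threshold: not urgent.
    TableState("small_old", 200_000_100, 1000.0, 5.0, 0, 0.0, 10_000),
    # Large table far past the threshold: quite urgent.
    TableState("big_old", 1_900_000_000, 1000.0, 50_000.0, 0, 0.0, 10_000),
    # Small, heavily updated table: bloats within seconds.
    TableState("hot_queue", 1_000_000, 1000.0, 1.0, 9_000, 600.0, 10_000),
]

print([t.name for t in vacuum_order(tables)])
# ['hot_queue', 'big_old', 'small_old']
```

Under these made-up numbers the heavily updated table sorts ahead of both
for-wraparound tables, and the large, far-overdue table sorts ahead of
the small one that is barely over the threshold. Sorting by
age(relfrozenxid) or by dead tuple count alone would not produce that
ordering.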

I ran across a real-world case where a user had a small table that had
to be vacuumed every 15 seconds to prevent bloat. If we change the
algorithm in a way that gives other things priority over that table,
then that user could easily get hosed when they install a maintenance
release containing this change.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
