Re: autovacuum not prioritising for-wraparound tables

From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Christopher Browne <cbbrowne(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: autovacuum not prioritising for-wraparound tables
Date: 2013-03-21 22:15:47
Message-ID: 20130321221547.GD3685@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Here's another cut at this patch. This is mainly about the
infrastructure to pass the data around in autovacuum; the proposed
formulas probably need lot of work.

We still have two terms in autovacuum priority, the first one considers
dead tuples and the second one considers wraparound limit. I have kept
Chris' proposal for the second term, but refined the first one a bit per
Jim Nasby's suggestion of discounting dead space. So we now have

return
(d->p_deadtuples * (d->p_livetuples + d->p_deadtuples) * d->p_relpages) /
((d->p_livetuples + 1) * d->p_reltuples * nblocks) +
exp(d->p_xidage * logf(nblocks) / UINT_MAX);

Here, deadtuples and livetuples come from pgstat data, while relpages
and reltuples come from pg_class. nblocks, on the other hand, comes
from the actual number of blocks in the table.

I haven't considered the case where pg_class.reltuples = 0 (which
results in division-by-zero), but I think to be really robust here we'd
want to have some code copied from estimate_rel_size; or maybe simply
use some hardcoded magic value. I lean towards the latter, because I'm
not sure we want to expend a relation open at this point (incurring an
attempt to lock the table, which could be problematic); hence the new
RelFileNodeGetNumberOfBlocks() thingy, which is admittedly pretty ugly,
not to mention untested.

(I have considered livetuples=0 however, hence the +1 there).

I think we now need to have a more focused discussion on useful
formulas to use here. One thing I noticed that fails in the above
formula is that as nblocks grows, ceteris paribus, the score falls; but
that's wrong, because if you have a table that turns out to have much
larger nblocks because it bloated and pgstat lost the message, we need
to look harder at it. So somehow we need to consider the tuple density
as given by pg_class.reltuples/pg_class.relpages, and compare with the
one given by pgstat.(live+dead) / nblocks; and raise the score as the
ratio goes down (in normal conditions the ratio should be 1; a bloated
table that pgstat hasn't noticed will have a lower ratio).

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
autovacuum-toast-wraparound-3.patch text/x-diff 16.7 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2013-03-21 22:38:36 Re: Support for REINDEX CONCURRENTLY
Previous Message Michael Paquier 2013-03-21 22:12:45 Re: SIGHUP not received by custom bgworkers if postmaster is notified