Re: autovacuum not prioritising for-wraparound tables

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Christopher Browne <cbbrowne(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: autovacuum not prioritising for-wraparound tables
Date: 2013-02-01 21:59:52
Message-ID: CA+TgmoZt8HmUjOA2xEzfx5KA5Muh5POUjYfk7=4pgOKLp6xFyw@mail.gmail.com
Lists: pgsql-hackers

On Thu, Jan 31, 2013 at 3:18 PM, Alvaro Herrera
<alvherre(at)2ndquadrant(dot)com> wrote:
> My intention was to apply a Nasby correction to Browne Strength and call
> the resulting function Browne' (Browne prime). Does that sound better?

/me rests head in hands. I'm not halfway clever enough to hang with
this crowd; I'm not even going to touch the puns in Chris' reply.

> Now seriously, I did experiment a bit with this and it seems to behave
> reasonably. Of course, there might be problems with it, and I don't
> oppose to changing the name. "Vacuum strength" didn't sound so great,
> so I picked the first term that came to mind. It's not like picking
> people's last names to name stuff is a completely new idea; that said,
> it was sort of a joke.

I don't think I really understand the origin of the formula, so
perhaps if someone would try to characterize why it seems to behave
reasonably that would be helpful (at least to me).

> f(deadtuples, relpages, age) =
> deadtuples/relpages + e ^ (age*ln(relpages)/2^32)

To maybe make that discussion go more quickly let me kvetch about a
few things to kick things off:

- Using deadtuples/relpages as part of the formula means that tables
with smaller tuples (thus more tuples per page) will tend to get
vacuumed before tables with larger tuples (thus fewer tuples per page).
I can't immediately see why that's a good thing.

- It's probably important to have a formula where we can be sure that
the wrap-around term will eventually dominate the dead-tuple term,
with enough time to spare to make sure nothing really bad happens; on
the other hand, it's also desirable to avoid the case where a table
that has just crossed the threshold for wraparound vacuuming
immediately shoots to the top of the list even if it isn't truly
urgent. It's unclear to me just from looking at this formula how well
the second term meets those goals.

- More generally, it seems to me that we ought to be trying to think
about the units in which these various quantities are measured. Each
term ought to be unit-less. So perhaps the first term ought to divide
dead tuples by total tuples, which has the nice property that the
result is a dimensionless quantity that never exceeds 1.0. Then the
second term can be scaled somehow based on that value.
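
To make the points above concrete, here is a rough Python sketch of the
quoted formula. Everything in it beyond the formula itself is an
assumption for illustration: the function names, the two made-up tables,
and the sample ages (200 million is, as far as I know, the default
autovacuum_freeze_max_age; 2^31 is roughly where wraparound actually
bites). dead_fraction() assumes the total tuple count includes the dead
ones.

```python
import math

XID_SPACE = 2 ** 32  # denominator from the quoted formula


def proposed_priority(deadtuples, relpages, age):
    """The quoted formula; assumes relpages >= 1 so that ln() is defined."""
    return deadtuples / relpages + math.exp(age * math.log(relpages) / XID_SPACE)


def wraparound_term(relpages, age):
    """Second term alone. e^(age*ln(relpages)/2^32) == relpages^(age/2^32),
    so it grows from 1 at age 0 to relpages at age 2^32."""
    return relpages ** (age / XID_SPACE)


def dead_fraction(deadtuples, totaltuples):
    """Unitless alternative for the first term: dead / total, never above 1.0."""
    return deadtuples / totaltuples if totaltuples else 0.0


# Hypothetical tables: same size in pages, same 20% dead, different tuple width.
RELPAGES = 1000
narrow = {"total": 200_000, "dead": 40_000}   # 200 tuples per page
wide = {"total": 10_000, "dead": 2_000}       # 10 tuples per page

for name, t in (("narrow", narrow), ("wide", wide)):
    print(name,
          "dead/relpages =", t["dead"] / RELPAGES,
          "dead/total =", dead_fraction(t["dead"], t["total"]))

# Wraparound term at a few ages, for comparison with the dead-tuple terms above.
for age in (0, 200_000_000, 2 ** 31, 2 ** 32):
    print("age", age, "wraparound term =", round(wraparound_term(RELPAGES, age), 3))
```

With these invented numbers the narrow table scores 40 on the first term
and the wide one scores 2, although both are 20% dead. The wraparound
term for a 1000-page table at age 2^31 is sqrt(1000), about 31.6, which
is still below the narrow table's dead-tuple term of 40.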

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
