Re: autovacuum not prioritising for-wraparound tables

From: Christopher Browne <cbbrowne(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: autovacuum not prioritising for-wraparound tables
Date: 2013-02-01 22:43:04
Message-ID: CAFNqd5Vwf2TWUHNFwUy25z6HnSOTJq59b60=zR+KsV+qWXAJ_g@mail.gmail.com
Lists: pgsql-hackers

On Fri, Feb 1, 2013 at 4:59 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Thu, Jan 31, 2013 at 3:18 PM, Alvaro Herrera
> <alvherre(at)2ndquadrant(dot)com> wrote:
>> My intention was to apply a Nasby correction to Browne Strength and call
>> the resulting function Browne' (Browne prime). Does that sound better?
>
> /me rests head in hands. I'm not halfway clever enough to hang with
> this crowd; I'm not even going to touch the puns in Chris' reply.

It's Friday... Fun needs to be had :-).

>> Now seriously, I did experiment a bit with this and it seems to behave
>> reasonably. Of course, there might be problems with it, and I don't
>> oppose to changing the name. "Vacuum strength" didn't sound so great,
>> so I picked the first term that came to mind. It's not like picking
>> people's last names to name stuff is a completely new idea; that said,
>> it was sort of a joke.
>
> I don't think I really understand the origin of the formula, so
> perhaps if someone would try to characterize why it seems to behave
> reasonably that would be helpful (at least to me).
>
>> f(deadtuples, relpages, age) =
>> deadtuples/relpages + e ^ (age*ln(relpages)/2^32)
>
> To maybe make that discussion go more quickly let me kvetch about a
> few things to kick things off:
>
> - Using deadtuples/relpages as part of the formula means that tables
> with smaller tuples (thus more tuples per page) will tend to get
> vacuumed before tables with larger tuples (thus less tuples per page).
> I can't immediately see why that's a good thing.

That wasn't intentional, and may be somewhat unfortunate.

I picked values that I knew could be easily grabbed, and we don't
have an immediate tuples-per-page estimate on pg_class. An
estimate should be available in pg_statistic; I'm not sure that the
bias from this hurts things badly.
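
(For illustration only, a rough sketch of how those inputs might be grabbed
today, assuming the stats collector's n_dead_tup is an acceptable stand-in
for the dead-tuple count:

select c.relname,
       s.n_dead_tup::numeric / greatest(c.relpages, 1) as dead_per_page,
       age(c.relfrozenxid) as xid_age,
       c.relpages
from pg_class c
     join pg_stat_all_tables s on s.relid = c.oid
where c.relkind = 'r';

The greatest(relpages, 1) is just there to dodge division by zero on an
empty table.)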

> - It's probably important to have a formula where we can be sure that
> the wrap-around term will eventually dominate the dead-tuple term,
> with enough time to spare to make sure nothing really bad happens; on
> the other hand, it's also desirable to avoid the case where a table
> that has just crossed the threshold for wraparound vacuuming doesn't
> immediately shoot to the top of the list even if it isn't truly
> urgent. It's unclear to me just from looking at this formula how well
> the second term meets those goals.

I think the second term *does* provide a way for wraparound to dominate;
splitting it apart a bit...

Consider...

      age * ln(relpages)
e ^  --------------------
             2^32

The wraparound portion of this involves age/2^32... In the beginning, the
numerator is near zero and the denominator is a constant 2^32 (about 4
billion), so the fraction is roughly 0 and the exponential term is roughly 1.
As age trends towards 2^32, the fraction (ignoring ln(relpages)) trends
towards 1, so the longer we go without vacuuming, the closer it gets. That
*tends* to give you something looking like e^1, or 2.71828+, ignoring the
relpages part.

I threw in multiplying by ln(relpages) as a way to step Well Back from
rollover; that means that this term will start growing considerably before
rollover, and the larger the table, the sooner that growth takes place.

There is a problem with the ln(relpages) term: if the table has just 1 page,
then ln(relpages) = 0, so the value of the exponential term is *always* 1.
We should probably use ln(relpages + CONSTANT) to guarantee the numerator is
never 0.
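
(Taking CONSTANT = 1 purely for illustration, for a 1-page table at 100% of
the way to wraparound:

select power(2.71828, 1.0 * ln(1)) as without_constant,
       power(2.71828, 1.0 * ln(1 + 1)) as with_constant;

gives 1 for the current form versus about 2 with the constant added, so even
a single-page table would see its term grow as age advances.)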

I'm a bit worried that the exponential term might dominate *too* quickly.

For a table I have handy with 163K tuples spread across 3357 pages,
ln(relpages) = 8.1188, and the "exponential bit" travels as follows:

dotpro0620(at)localhost-> select generate_series(1,20)*100/20 as percent_wraparound,
       power(2.71828,
             (65536.0*32768.0*generate_series(1,20)/20.0 * ln(3357))
             / (65536.0*32768)) as wraparound_term;
 percent_wraparound |  wraparound_term
--------------------+------------------
                  5 | 1.50071232210687
                 10 |  2.2521374737234
                 15 | 3.37981045789535
                 20 | 5.07212320054923
                 25 | 7.61179778630838
                 30 | 11.4231187312988
                 35 | 17.1428150369499
                 40 | 25.7264337615498
                 45 |  38.607976149824
                 50 | 57.9394655396491
                 55 |  86.950469871638
                 60 |  130.48764154935
                 65 | 195.824411555774
                 70 | 293.876107391077
                 75 | 441.023495534592
                 80 | 661.849394087408
                 85 |  993.24554108594
                 90 | 1490.57582238538
                 95 | 2236.92550368832
                100 | 3356.98166702019
(20 rows)

At the beginning, the "wraparound" portion is just 1.5, so it is easily
dominated by a table with a lot of dead tuples. As wraparound approaches,
the term becomes steadily more urgent. There may be constant factors to
fiddle with at the edges, but this term definitely heads towards dominance.

That's definitely doing what I intended, and after constructing that table,
I think I'm *more* confident. Consider two tables of different sizes: each
heads towards "maxing out" at a "wraparound_term" value directly correlated
with its size, which seems mighty right. A bigger table needs to get drawn
into play (i.e. needs to get vacuumed) earlier than a smaller one.
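
(To make that concrete -- the same illustrative calculation, run for a
hypothetical small table of 100 pages next to a big one of 100000 pages:

select 100*i/4 as percent_wraparound,
       power(2.71828, i/4.0 * ln(100)) as small_table_100pg,
       power(2.71828, i/4.0 * ln(100000)) as big_table_100000pg
from generate_series(1,4) as i;

At 25% of the way to wraparound the big table's term is already around 17.8
while the small one sits near 3.2, so the big table gets pulled into the
queue much earlier.)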

> - More generally, it seems to me that we ought to be trying to think
> about the units in which these various quantities are measured. Each
> term ought to be unit-less. So perhaps the first term ought to divide
> dead tuples by total tuples, which has the nice property that the
> result is a dimensionless quantity that never exceeds 1.0. Then the
> second term can be scaled somehow based on that value.

Absolutely a good idea. I'm not sure I agree it ought to wind up unitless;
I'd instead expect a common unit, perhaps number of pages, as a surrogate
for the quantity of I/O involved. That we're both thinking about "what's
the unit?" means we're on a compatible trail.
--
When confronted by a difficult problem, solve it by reducing it to the
question, "How would the Lone Ranger handle this?"
