Re: [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Florian Helmberger <fh(at)25th-floor(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Greg Stark <gsstark(at)mit(dot)edu>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum
Date: 2011-05-27 15:07:05
Message-ID: BANLkTinREie2cX2zqZPSEfUsZsdM+GjymQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin pgsql-hackers

On Thu, May 26, 2011 at 5:50 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov> writes:
>> When we prune or vacuum a page, I don't suppose we have enough
>> information about that page's previous state to calculate a tuple
>> count delta, do we?  That would allow a far more accurate number to
>> be maintained than anything suggested so far, as long as we tweak
>> autovacuum to count inserts toward the need to vacuum.
>
> Well, that was the other direction that was suggested upthread: stop
> relying on reltuples at all, but use the stats collector's counts.
> That might be a good solution in the long run, but there are some
> issues:
>
> 1. It's not clear how using a current count, as opposed to
> time-of-last-vacuum count, would affect the behavior of the autovacuum
> control logic.  At first glance I think it would break it, since the
> basic logic there is "how much of the table changed since it was last
> vacuumed?".  Even if the equations could be modified to still work,
> I remember enough feedback control theory from undergrad EE to think that
> this is something to be seriously scared of tweaking without extensive
> testing.  IMO it is far more risky than what Robert is worried about.

Yeah, I think that would be broken.

> 2. You still have the problem that we're exposing inaccurate (or at
> least less accurate than they could be) counts to the planner and to
> onlooker clients.  We could change the planner to also depend on the
> stats collector instead of reltuples, but at that point you just removed
> the option for people to turn off the stats collector.  The implications
> for plan stability might be unpleasant, too.
>
> So that's not a direction I want to go without a significant amount
> of work and testing.

FWIW, I agree. Your proposed solution is certainly better than trying
to do this; but it still seems a bit shaky to me.

Still, maybe we don't have a better option. If it were me, I'd add an
additional safety valve: use your formula if the percentage of the
relation scanned is above some threshold where there's unlikely to be
too much skew. But if the percentage scanned is too small, then don't
use that formula. Instead, only update relpages/reltuples if the
relation is now larger; set relpages to the new actual value, and
scale up reltuples proportionately.

However, I just work here. It's possible that I'm worrying about a
problem that won't materialize in practice.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-admin by date

  From Date Subject
Next Message Tom Lane 2011-05-27 18:34:10 Re: [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum
Previous Message Tom Lane 2011-05-26 21:50:00 Re: [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavan Deolasee 2011-05-27 15:07:09 Re: Vacuum, visibility maps and SKIP_PAGES_THRESHOLD
Previous Message Cédric Villemain 2011-05-27 14:38:22 Re: Vacuum, visibility maps and SKIP_PAGES_THRESHOLD