Re: [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum

From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Florian Helmberger <fh(at)25th-floor(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum
Date: 2011-05-25 16:00:38
Message-ID: 1306339028-sup-6948@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin pgsql-hackers

Excerpts from Tom Lane's message of mié may 25 11:47:52 -0400 2011:

> I think I see what must be going on here: that toast table must contain
> a long run of all-visible-according-to-the-VM pages (which is hardly a
> surprising situation). This results in VACUUM choosing not to update
> the pg_class entry:
>
> /*
> * Update statistics in pg_class. But only if we didn't skip any pages;
> * the tuple count only includes tuples from the pages we've visited, and
> * we haven't frozen tuples in unvisited pages either. The page count is
> * accurate in any case, but because we use the reltuples / relpages ratio
> * in the planner, it's better to not update relpages either if we can't
> * update reltuples.
> */
> if (vacrelstats->scanned_all)
> vac_update_relstats(onerel,
> vacrelstats->rel_pages, vacrelstats->rel_tuples,
> vacrelstats->hasindex,
> FreezeLimit);
>
> For an ordinary table this wouldn't be fatal because we'll still do an
> ANALYZE from time to time, and that will update the entries with new
> (approximate) values. But we never run ANALYZE on toast tables.

Ouch.

> I can see two basic approaches we might take here:
>
> 1. Modify autovacuum to use something from the stats collector, rather
> than reltuples, to make its decisions. I'm not too clear on why
> reltuples is being used there anyway; is there some good algorithmic or
> statistical reason why AV should be looking at a number from the last
> vacuum?

It uses reltuples simply because it was what the original contrib code
was using. Since pgstat was considerably weaker at the time, reltuples
might have been the only thing available. It's certainly the case that
pgstat has improved a lot since autovacuum got in, and some things have
been revised but not this one.

> 2. Revise the vacuum code so that it doesn't skip updating the pg_class
> entries. We could have it count the number of pages it skipped, rather
> than just keeping a bool, and then scale up the rel_tuples count to be
> approximately right by assuming the skipped pages have tuple density
> similar to the scanned ones.

Hmm, interesting idea. This would be done only for toast tables, or all
tables?

At this point I only wonder why we store tuples & pages rather than just
live tuple density.

--
Álvaro Herrera <alvherre(at)commandprompt(dot)com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

In response to

Responses

Browse pgsql-admin by date

  From Date Subject
Next Message Robert Haas 2011-05-25 16:17:11 Re: [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum
Previous Message Tom Lane 2011-05-25 15:49:57 Re: pg_class reltuples/relpages not updated by autovacuum/vacuum

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2011-05-25 16:12:50 Re: tackling full page writes
Previous Message Robert Haas 2011-05-25 15:51:51 Re: Volunteering as Commitfest Manager