Why not represent "never vacuumed" accurately wrt pg_class.relpages?

From: Andres Freund <andres(at)anarazel(dot)de>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Why not represent "never vacuumed" accurately wrt pg_class.relpages?
Date: 2018-12-11 07:20:05
Message-ID: 20181211072005.6f4frwg72oqe6uqi@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

estimate_rel_size() explains:

/*
* HACK: if the relation has never yet been vacuumed, use a
* minimum size estimate of 10 pages. The idea here is to avoid
* assuming a newly-created table is really small, even if it
* currently is, because that may not be true once some data gets
* loaded into it. Once a vacuum or analyze cycle has been done
* on it, it's more reasonable to believe the size is somewhat
* stable.
*
* (Note that this is only an issue if the plan gets cached and
* used again after the table has been filled. What we're trying
* to avoid is using a nestloop-type plan on a table that has
* grown substantially since the plan was made. Normally,
* autovacuum/autoanalyze will occur once enough inserts have
* happened and cause cached-plan invalidation; but that doesn't
* happen instantaneously, and it won't happen at all for cases
* such as temporary tables.)
*
* We approximate "never vacuumed" by "has relpages = 0", which
* means this will also fire on genuinely empty relations. Not
* great, but fortunately that's a seldom-seen case in the real
* world, and it shouldn't degrade the quality of the plan too
* much anyway to err in this direction.
*
* There are two exceptions wherein we don't apply this heuristic.
* One is if the table has inheritance children. Totally empty
* parent tables are quite common, so we should be willing to
* believe that they are empty. Also, we don't apply the 10-page
* minimum to indexes.
*/

I don't quite get why we don't instead just represent "never vacuumed"
by storing a more meaningful value in relpages? We could go for
InvalidBlockNumber, or even NULL (although the latter would be a bit
annoying due to not being mappable to a struct anymore).

I've seen numerous cases where relpages = 0 -> never vacuumed has caused
worse plans, and it just doesn't seem necessary?

Greetings,

Andres Freund

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2018-12-11 08:22:48 Re: [HACKERS] Can ICU be used for a database's default sort order?
Previous Message Andrey Borodin 2018-12-11 07:14:50 Re: Connections hang indefinitely while taking a gin index's LWLock buffer_content lock