Re: Why is INSERT-driven autovacuuming based on pg_class.reltuples?

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Why is INSERT-driven autovacuuming based on pg_class.reltuples?
Date: 2022-02-01 20:02:38
Message-ID: CAApHDvqaEpwzchs5=k4a38WTySfqdqF2WZCqW8Kz4R00ScSFwg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, 31 Jan 2022 at 17:28, David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
> If ANALYZE runs and sets pg_class.reltuples to 1 million, then we
> insert 500k tuples, assuming a 0 vacuum_ins_threshold and a
> vacuum_ins_scale_factor of 0.2, then we'll want to perform a vacuum as
> "vac_ins_base_thresh + vac_ins_scale_factor * reltuples" will come out
> at 200k. auto-vacuum will then trigger and update reltuples hopefully
> to some value around 1.5 million, then next time it'll take 300k
> tuples to trigger an insert vacuum.

If we wanted a more current estimate for the number of tuples in a
relation then we could use reltuples / relpages *
RelationGetNumberOfBlocks(r). However, I still don't see why an
INSERT driven auto-vacuums are a particularly special case. ANALYZE
updating the reltuples estimate had an effect on when auto-vacuum
would trigger for tables that generally grow in the number of live
tuples but previously only (i.e before insert vacuums existed)
received auto-vacuum attention due to UPDATEs/DELETEs.

I suppose the question is, what is autovacuum_vacuum_scale_factor
meant to represent? Our documents claim:

> Specifies a fraction of the table size to add to autovacuum_vacuum_threshold when deciding whether to trigger a VACUUM. The default is 0.2 (20% of table size). This parameter can only be set in the postgresql.conf file or on the server command line; but the setting can be overridden for individual tables by changing table storage parameters.

Nothing there seems to indicate the scale is based on the historical
table size when the table was last vacuumed/analyzed, so you could
claim that the 3 usages of relpages when deciding if the table should
be vacuumed and/or analyzed are all wrong and should take into account
RelationGetNumberOfBlocks too.

I'm not planning on doing anything to change any of this unless I see
some compelling argument that what's there is wrong.

David

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2022-02-01 20:02:44 Re: Latest LLVM breaks our code again
Previous Message Pavel Stehule 2022-02-01 19:35:00 Re: Deparsing rewritten query