From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Rick <richard(dot)branton(at)ca(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: autovacuum strategy / parameters
Date: 2010-04-30 22:50:54
Message-ID: 4BDB5ECE.4010607@agliodbs.com
Lists: pgsql-performance
> My guess is that the reason we run ANALYZE more frequently than vacuum
> (with the default settings) is that ANALYZE is pretty cheap. In many
> cases, if the statistical distribution of the data hasn't changed
> much, then it's not really necessary, but it doesn't cost much either.
> And for certain types of usage patterns, like time series (where the
> maximum value keeps increasing) it's REALLY important to analyze
> frequently.
>
> But having said that, on the systems I've worked with, I've only
> rarely seen a problem caused by not analyzing frequently enough. On
> the other hand, I've seen MANY problems caused by not vacuuming
> enough.
Which is the opposite of my experience; currently we have several
clients with issues that required more frequent ANALYZE runs on
specific tables. Before 8.4, vacuuming more frequently, especially on
large tables, was very costly: VACUUM takes a lot of I/O and CPU. Even
with 8.4, it's not something you want to increase without thinking about
the tradeoffs.
Since I'm responsible for the current defaults, I thought I'd explain the
reasoning behind them. I developed and tested them while at Greenplum,
so they are *not* designed for small databases.
#autovacuum_vacuum_threshold = 50
#autovacuum_analyze_threshold = 50
These two are set to the minimum threshold to avoid having small tables
vacuumed and analyzed continuously, while still making sure that small
tables do get vacuumed and analyzed sometimes.
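To see why the fixed threshold only matters for small tables: autovacuum
triggers roughly when dead tuples exceed threshold + scale_factor *
reltuples. A sketch of that arithmetic (table sizes here are purely
hypothetical):

```sql
-- Illustrative arithmetic only; the row counts are made up.
-- Autovacuum triggers a VACUUM roughly when:
--   n_dead_tup > autovacuum_vacuum_threshold
--               + autovacuum_vacuum_scale_factor * reltuples
SELECT 50 + 0.2 * 1000      AS trigger_1k_rows,   -- 250 dead rows
       50 + 0.2 * 100000000 AS trigger_100m_rows; -- 20,000,050 dead rows
```

So on a 1,000-row table the fixed threshold is a fifth of the trigger,
while on a 100M-row table it is noise.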
#autovacuum_vacuum_scale_factor = 0.2
This is set because in my experience, 20% bloat is about the level at
which bloat starts affecting performance; thus, we want to vacuum at
that level but not sooner. This does mean that very large tables which
never have more than 10% updates/deletes don't get vacuumed at all until
freeze_age; this is a *good thing*. VACUUM on large tables is expensive;
you don't *want* to vacuum a billion-row table which has only 100,000
updates.
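For the rare large table that *does* need earlier vacuuming, the
per-table storage parameters let you override the global default
without touching postgresql.conf. A sketch (the table name is
hypothetical; the parameter names are the standard autovacuum storage
parameters):

```sql
-- Hypothetical hot table: vacuum this one relation more aggressively
-- by lowering its scale factor, leaving the global default alone.
ALTER TABLE orders SET (
    autovacuum_vacuum_scale_factor = 0.05,  -- vacuum at ~5% dead rows
    autovacuum_vacuum_threshold    = 1000
);
```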
#autovacuum_analyze_scale_factor = 0.1
The 10% threshold for analyze is there because (a) analyze is cheap, and
(b) 10% changes to a table can result in very bad plans if the changes
are highly skewed towards a specific range, such as additions onto the
end of a time-based table.
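When a specific time-based table needs fresher statistics than the 10%
default provides, the same per-table mechanism applies (again, the
table name is hypothetical):

```sql
-- Hypothetical time-series table: analyze after ~1% churn so the
-- planner sees new maximum values in the timestamp column quickly.
ALTER TABLE events SET (
    autovacuum_analyze_scale_factor = 0.01,
    autovacuum_analyze_threshold    = 500
);
```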
The current postgres defaults were tested on DBT2 as well as pgbench,
and in my last 2 years of consulting I've seldom found reason to touch
them except on *specific* tables. So I still feel that they are good
defaults.
It would be worth doing a DBT2/DBT5 test run with different autovac
settings post-8.4 to see if we should specifically change the vacuum
threshold. Pending that, though, I think the current defaults are good
enough.
--
Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com