Re: autovacuum suggestions for 500,000,000+ row tables?

From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Alex Stapleton <alexs(at)advfn(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: autovacuum suggestions for 500,000,000+ row tables?
Date: 2005-06-20 17:46:41
Message-ID: 200506201046.42062.josh@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Alex,

> Hi, i'm trying to optimise our autovacuum configuration so that it
> vacuums / analyzes some of our larger tables better. It has been set
> to the default settings for quite some time. We never delete
> anything (well not often, and not much) from the tables, so I am not
> so worried about the VACUUM status, but I am wary of XID wraparound
> nuking us at some point if we don't sort vacuuming out so we VACUUM
> at least once every year ;)

I personally don't use autovaccuum on very large databases. For DW,
vacuuming is far better tied to ETL operations or a clock schedule of
downtime.

XID wraparound may be further away than you think. Try checking
pg_controldata, which will give you the current XID, and you can calculate
how long you are away from wraparound. I just tested a 200G data warehouse
and figured out that we are 800 months away from wraparound, despite hourly
ETL.

> However not running ANALYZE for such huge
> periods of time is probably impacting the statistics accuracy
> somewhat, and I have seen some unusually slow queries at times.
> Anyway, does anyone think we might benefit from a more aggressive
> autovacuum configuration?

Hmmm, good point, you could use autovacuum for ANALYZE only. Just set the
VACUUM settings preposterously high (like 10x) so it never runs. Then it'll
run ANALYZE only. I generally threshold 200, multiple 0.1x for analyze;
that is, re-analyze after 200+10% of rows have changed.

--
Josh Berkus
Aglio Database Solutions
San Francisco

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Dan Harris 2005-06-20 17:55:59 investigating slow queries through pg_stat_activity
Previous Message John Arbash Meinel 2005-06-20 15:20:37 Re: autovacuum suggestions for 500,000,000+ row tables?