Skip site navigation (1) Skip section navigation (2)

Re: 12 hour table vacuums

From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: "Ron St-Pierre" <ron(dot)pgsql(at)shaw(dot)ca>
Cc: <pgsql-performance(at)postgresql(dot)org>
Subject: Re: 12 hour table vacuums
Date: 2007-10-23 16:38:29
Message-ID: 87y7dtokfu.fsf@oxford.xeocode.com (view raw or flat)
Thread:
Lists: pgsql-performance
"Ron St-Pierre" <ron(dot)pgsql(at)shaw(dot)ca> writes:

> We vacuum only a few of our tables nightly, this one is the last one because it
> takes longer to run. I'll probably re-index it soon, but I would appreciate any
> advice on how to speed up the vacuum process (and the db in general).
...
> vacuum_cost_delay = 200

Well speeding up vacuum isn't really useful in itself. In fact you have vacuum
configured to run quite slowly by having vacuum_cost_delay set so high. You
have it set to sleep 200ms every few pages. If you lower that it'll run faster
but take more bandwidth away from the foreground tasks.

> Here's the table information:
> The table has 140,000 rows, 130 columns (mostly NUMERIC), 60 indexes. 

For what it's worth NUMERIC columns take more space than you might expect.
Figure a minimum of 12 bytes your rows are at about 1.5k each even if the
non-numeric columns aren't large themselves. What are the other columns?

> We were running autovacuum but it interfered with the updates to we shut it
> off. 

Was it just the I/O bandwidth? I'm surprised as your vacuum_cost_delay is
quite high. Manual vacuum doesn't do anything differently from autovacuum,
neither should interfere directly with updates except by taking away
I/O bandwidth.

> We vacuum this table nightly, and it currently takes about 12 hours to
> vacuum it. Not much else is running during this period, nothing that should
> affect the table.

Is this time increasing over time? If once a day isn't enough then you may be
accumulating more and more dead space over time. In which case you may be
better off running it during prime time with a large vacuum_cost_delay (like
the 200 you have configured) rather than trying to get to run fast enough to
fit in the off-peak period.

> deadlock_timeout = 10000

I would not suggest having this quite this high. Raising it from the default
is fine but having a value larger than your patience is likely to give you the
false impression that something is hung if you should ever get a deadlock.

-- 
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com

In response to

Responses

pgsql-performance by date

Next:From: Bill MoranDate: 2007-10-23 16:41:48
Subject: Re: 12 hour table vacuums
Previous:From: Ron St-PierreDate: 2007-10-23 16:33:16
Subject: Re: 12 hour table vacuums

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group