Re: 12 hour table vacuums

From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: "Ron St-Pierre" <ron(dot)pgsql(at)shaw(dot)ca>
Cc: <pgsql-performance(at)postgresql(dot)org>
Subject: Re: 12 hour table vacuums
Date: 2007-10-23 16:38:29
Message-ID: 87y7dtokfu.fsf@oxford.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

"Ron St-Pierre" <ron(dot)pgsql(at)shaw(dot)ca> writes:

> We vacuum only a few of our tables nightly, this one is the last one because it
> takes longer to run. I'll probably re-index it soon, but I would appreciate any
> advice on how to speed up the vacuum process (and the db in general).
...
> vacuum_cost_delay = 200

Well speeding up vacuum isn't really useful in itself. In fact you have vacuum
configured to run quite slowly by having vacuum_cost_delay set so high. You
have it set to sleep 200ms every few pages. If you lower that it'll run faster
but take more bandwidth away from the foreground tasks.

> Here's the table information:
> The table has 140,000 rows, 130 columns (mostly NUMERIC), 60 indexes.

For what it's worth NUMERIC columns take more space than you might expect.
Figure a minimum of 12 bytes your rows are at about 1.5k each even if the
non-numeric columns aren't large themselves. What are the other columns?

> We were running autovacuum but it interfered with the updates to we shut it
> off.

Was it just the I/O bandwidth? I'm surprised as your vacuum_cost_delay is
quite high. Manual vacuum doesn't do anything differently from autovacuum,
neither should interfere directly with updates except by taking away
I/O bandwidth.

> We vacuum this table nightly, and it currently takes about 12 hours to
> vacuum it. Not much else is running during this period, nothing that should
> affect the table.

Is this time increasing over time? If once a day isn't enough then you may be
accumulating more and more dead space over time. In which case you may be
better off running it during prime time with a large vacuum_cost_delay (like
the 200 you have configured) rather than trying to get to run fast enough to
fit in the off-peak period.

> deadlock_timeout = 10000

I would not suggest having this quite this high. Raising it from the default
is fine but having a value larger than your patience is likely to give you the
false impression that something is hung if you should ever get a deadlock.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Bill Moran 2007-10-23 16:41:48 Re: 12 hour table vacuums
Previous Message Ron St-Pierre 2007-10-23 16:33:16 Re: 12 hour table vacuums