Re: Table bloat and vacuum

From: Jack Orenstein <jack(dot)orenstein(at)hds(dot)com>
To: Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Table bloat and vacuum
Date: 2008-11-12 18:42:34
Message-ID: 491B239A.3030903@hds.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Scott Marlowe wrote:
> On Wed, Nov 12, 2008 at 8:13 AM, Jack Orenstein <jack(dot)orenstein(at)hds(dot)com> wrote:
>> My application is running on 7.4. We have one huge table that drives
> SNIP
>> We're in the process of upgrading to 8.3.4, so I'd appreciate any
>> throughs on whether and how this behavior will change with the newer
>> release.
>
> You will not believe how much faster 8.3 is, and how much easier
> maintenance is. You'll be like a kid in a candy store for months
> looking at and using all the new features in it. The improvements are
> enormous. Biggest difference for you is that 8.3 can do vacuums in a
> background method (it sleeps x milliseconds between pages), can run 3
> or more threads, and autovacuum daemon is on by default. For the most
> part, your vacuuming issues will no longer exist.

Our 7.4 vacuuming strategy has gotten pretty complex:

- Entire database once a week.

- (Eventually) biggest table in database: Vacuumed/analyzed every 1000 updates
until there are 10,000 rows, to ensure that optimizer does the right thing,
(discussed recently on this mailing list).

- Medium-sized table containing single-row concurrency hotspots. Usually less
than 1M rows: vacuumed every 2000 updates.

- Single-row tables - these are permanent hotspots, updated in every
transaction: vacuumed every 2000 updates.

Can you comment on how I'll be able to simplify this vacuum schedule by relying
on autovacuum? Can you point me at a document describing how autovacuum decides
when to vacuum a table?

I've also had some trouble figuring out which VACUUMs should ANALYZE.
Originally, I had every vacuum also run analyze (except for the tiny-table
vacuums). But I ran into the "tuple concurrently updated" problem (see
http://archives.postgresql.org/pgsql-sql/2005-05/msg00148.php), so I've had to
back off from that. Are concurrent analyzes OK in 8.3?

Jack

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Adrian Klaver 2008-11-12 18:46:22 Re: Upgrading side by side in Gentoo
Previous Message Sam Mason 2008-11-12 18:36:39 Re: how to "group" several records with same timestamp into one line?