Re: Performance query about large tables, lots of concurrent access

From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: "Karl Wright" <kwright(at)metacarta(dot)com>
Cc: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Performance query about large tables, lots of concurrent access
Date: 2007-06-19 13:46:18
Message-ID: 87bqfc83bp.fsf@oxford.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance


"Karl Wright" <kwright(at)metacarta(dot)com> writes:

> This particular run lasted four days before a VACUUM became essential. The
> symptom that indicates that VACUUM is needed seems to be that the CPU usage of
> any given postgresql query skyrockets. Is this essentially correct?

Postgres is designed on the assumption that VACUUM is run regularly. By
"regularly" we're talking of an interval usually on the order of hours, or
even less. On some workloads some tables need to be vacuumed every 5 minutes,
for example.

VACUUM doesn't require shutting down the system, it doesn't lock any tables or
otherwise prevent other jobs from making progress. It does add extra i/o but
there are knobs to throttle its i/o needs. The intention is that VACUUM run in
the background more or less continually using spare i/o bandwidth.

The symptom of not having run vacuum regularly is that tables and indexes
bloat to larger sizes than necessary. If you run "VACUUM VERBOSE" it'll tell
you how much bloat your tables and indexes are suffering from (though the
output is a bit hard to interpret).

Table and index bloat slow things down but not generally by increasing cpu
usage. Usually they slow things down by causing queries to require more i/o.

It's only UPDATES and DELETES that create garbage tuples that need to be
vacuumed though. If some of your tables are mostly insert-only they might need
to be vacuumed as frequently or at all.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Alvaro Herrera 2007-06-19 13:46:37 Re: Performance query about large tables, lots of concurrent access
Previous Message Chris Browne 2007-06-19 13:39:02 Re: [pgsql-advocacy] [PERFORM] Postgres VS Oracle