Re: Performance query about large tables, lots of concurrent access

From: Karl Wright <kwright(at)metacarta(dot)com>
To: Gregory Stark <stark(at)enterprisedb(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Performance query about large tables, lots of concurrent access
Date: 2007-06-19 14:02:12
Message-ID: 4677E1E4.903@metacarta.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Gregory Stark wrote:
> "Karl Wright" <kwright(at)metacarta(dot)com> writes:
>
>> This particular run lasted four days before a VACUUM became essential. The
>> symptom that indicates that VACUUM is needed seems to be that the CPU usage of
>> any given postgresql query skyrockets. Is this essentially correct?
>
> Postgres is designed on the assumption that VACUUM is run regularly. By
> "regularly" we're talking of an interval usually on the order of hours, or
> even less. On some workloads some tables need to be vacuumed every 5 minutes,
> for example.

Fine - but what if the previous vacuum is still in progress, and does
not finish in 5 minutes?

>
> VACUUM doesn't require shutting down the system, it doesn't lock any tables or
> otherwise prevent other jobs from making progress. It does add extra i/o but
> there are knobs to throttle its i/o needs. The intention is that VACUUM run in
> the background more or less continually using spare i/o bandwidth.
>

This spare bandwidth is apparently hard to come by in my particular
application. That's the only way I can reconcile your information with
it taking 4 days to complete.

> The symptom of not having run vacuum regularly is that tables and indexes
> bloat to larger sizes than necessary. If you run "VACUUM VERBOSE" it'll tell
> you how much bloat your tables and indexes are suffering from (though the
> output is a bit hard to interpret).
>
> Table and index bloat slow things down but not generally by increasing cpu
> usage. Usually they slow things down by causing queries to require more i/o.
>

Yes, that's what I understood, which is why I was puzzled by the effects
I was seeing.

> It's only UPDATES and DELETES that create garbage tuples that need to be
> vacuumed though. If some of your tables are mostly insert-only they might need
> to be vacuumed as frequently or at all.
>

Well, the smaller tables don't change much, but the bigger tables have a
lively mix of inserts and updates, so I would expect these would need
vacuuming often.

I'll post again when I can find a vacuum schedule that seems to work.

Karl

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Karl Wright 2007-06-19 14:06:25 Re: Performance query about large tables, lots of concurrent access
Previous Message Kurt Overberg 2007-06-19 13:57:23 Maintenance question / DB size anomaly...