Re: 7.4RC1 planned for Monday

From: Christopher Browne <cbbrowne(at)libertyrms(dot)info>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: 7.4RC1 planned for Monday
Date: 2003-10-31 21:12:53
Message-ID: 60ad7h5cx6.fsf@dev6.int.libertyrms.info
Lists: pgsql-hackers

scott(dot)marlowe(at)ihs(dot)com ("scott.marlowe") writes:
> On Thu, 30 Oct 2003, Joshua D. Drake wrote:
>> >If I understood correctly, Josh was complaining about VACUUM sucking too
>> >much of his disk bandwidth. autovacuum wouldn't help that --- in fact
>> >would likely make it worse, since a cron-driven vacuum script can at
>> >least be scheduled for low-load times of day. autovacuum is likely to
>> >kick in at the least convenient times.

>> Exactly!

> Wait a minute, I thought the problem was that vacuums were happening
> too far apart, therefore taking too long, and may have been full,
> no?

No, that is a different issue.

> If the autovacuum daemon causes a lazy vacuum to run on only the
> busiest (i.e. most updated) tables, then it is likely to only take a
> few minutes to run, instead of hours, plus you can try to keep
> things steady state. I.e. no more than x% or so dead tuples in a
> table at any given time, and they all get reused by fsm / lazy
> vacuum.

That is fine, for a system that isn't already "pretty much pegged"
with transaction load.

The "disk bandwidth" problem occurs when the system is already so busy
that doing a VACUUM on a big table adds a huge I/O load, killing
cache, and slowing the other activity.

> So, have you TESTED the autovacuum daemon with your load, and set it
> to run every 5 minutes? Keep in mind, it won't actually vacuum
> every table every 5 minutes, it'll just check the stats to see which
> ones have updated a fair bit and vacuum those, and they're lazy
> vacuums. I've found it to be a win. If you haven't tested it and
> dismissed it out of hand, then you should really give it a try to
> see if it can be configured to provide good performance and
> behavior.
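
The periodic check described in the quoted paragraph can be sketched
roughly as follows. This is only an illustration, not pg_autovacuum's
actual code: the function name, the shape of the stats dictionary, and
the threshold values (base_threshold, scale_factor) are all assumptions
made up for the example.

```python
def tables_to_vacuum(stats, base_threshold=1000, scale_factor=0.4):
    """Pick the tables that have changed enough to be worth a lazy vacuum.

    stats maps a table name to (live_tuples, changes_since_last_vacuum).
    Both the structure and the default thresholds are hypothetical.
    """
    selected = []
    for table, (live_tuples, changes) in sorted(stats.items()):
        # A table qualifies once its changes exceed a fixed floor plus
        # a fraction of its size, so big tables need more churn.
        threshold = base_threshold + scale_factor * live_tuples
        if changes > threshold:
            selected.append(table)
    return selected


# Hypothetical snapshot taken on one pass of the daemon.
snapshot = {
    "orders": (100_000, 60_000),    # heavily updated
    "customers": (50_000, 500),     # barely touched
    "sessions": (2_000, 5_000),     # small but very busy
}
busy_tables = tables_to_vacuum(snapshot)
```

Only the tables over their threshold are vacuumed on a given pass; the
rest are skipped until they accumulate more changes.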

If the I/O bus is saturated, and you are doing a lot of updates to big
tables, then the vacuums _are_ "performance killers."

The result of running pg_autovacuum on those tables would be that
there would be a near-continuous system slowdown. Not a win. Two
things are prime causes for this:

1. VACUUM rips through the page cache, loading the pages of tables
being vacuumed, and throwing away other data being frequently
accessed.

2. VACUUM has to compete with other processing for I/O.

Neither of those factors can be alleviated by vacuuming more often.
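
The first factor can be illustrated with a toy simulation. This is a
minimal sketch that assumes a plain LRU cache; real OS page caches and
PostgreSQL's buffer manager use more elaborate policies, and every name
and size here is hypothetical.

```python
from collections import OrderedDict


class LRUCache:
    """Toy LRU page cache holding at most `capacity` pages."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()

    def access(self, page):
        if page in self.pages:
            # Hit: mark the page as most recently used.
            self.pages.move_to_end(page)
        else:
            # Miss: load the page, evicting the least recently used one
            # if the cache is now over capacity.
            self.pages[page] = True
            if len(self.pages) > self.capacity:
                self.pages.popitem(last=False)


def simulate(capacity=100, hot_count=50, big_table_pages=1000):
    """Return how many hot pages are cached before and after a big scan."""
    cache = LRUCache(capacity)
    hot = [("hot", i) for i in range(hot_count)]
    for page in hot:
        cache.access(page)
    before = sum(1 for page in hot if page in cache.pages)

    # A vacuum-like pass reads every page of a large table exactly once.
    for i in range(big_table_pages):
        cache.access(("big", i))
    after = sum(1 for page in hot if page in cache.pages)
    return before, after
```

With the defaults, simulate() returns (50, 0): the single pass over the
big table pushes every hot page out of the cache. Running the same pass
more often does not change that outcome for a table larger than the
cache.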

Jan has seen this phenomenon; I have seen this phenomenon; I have no
reason to think that Joshua is not describing the very same phenomenon.

pg_autovacuum is good and useful, and I hesitate to try to count how
many systems I have installed it on. Probably a dozen. I have added
about as many patches to it as has Matthew O'Connor; I have a fair
idea of what it does. It is a godsend in test systems or low traffic
environments, by virtue of cutting down on the need to manually do
vacuums or to script up cron jobs.

It's exactly what is needed to make PostgreSQL usable in the long term
for hosting small web apps, or to make PostgreSQL work well as a host
for desktop applications. I'd like to see GnuCash use PostgreSQL by
default, instead of its custom XML data format, and pg_autovacuum
would be part of what would make that mix work.

But it isn't a magical solution to all ills, and the scenarios that
Jan Wieck and Joshua Drake have been describing represent the
pathological cases where pg_autovacuum can cause performance problems
of its own.
--
output = reverse("ofni.smrytrebil" "@" "enworbbc")
<http://dev6.int.libertyrms.com/>
Christopher Browne
(416) 646 3304 x124 (land)
