Re: [HACKERS] Autovacuum Improvements

From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: "Bruce Momjian" <bruce(at)momjian(dot)us>
Cc: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>, "Russell Smith" <mr-russ(at)pws(dot)com(dot)au>, "Darcy Buskermolen" <darcyb(at)commandprompt(dot)com>, "Simon Riggs" <simon(at)2ndquadrant(dot)com>, "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, "Matthew T(dot) O'Connor" <matthew(at)zeut(dot)net>, "Pavan Deolasee" <pavan(at)enterprisedb(dot)com>, "Christopher Browne" <cbbrowne(at)acm(dot)org>, <pgsql-general(at)postgresql(dot)org>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Autovacuum Improvements
Date: 2007-01-22 18:41:22
Message-ID: 873b62gazx.fsf@stark.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers


"Bruce Momjian" <bruce(at)momjian(dot)us> writes:

> Yep, agreed on the random I/O issue. The larger question is if you have
> a huge table, do you care to reclaim 3% of the table size, rather than
> just vacuum it when it gets to 10% dirty? I realize the vacuum is going
> to take a lot of time, but vacuuming to relaim 3% three times seems like
> it is going to be more expensive than just vacuuming the 10% once. And
> vacuuming to reclaim 1% ten times seems even more expensive. The
> partial vacuum idea is starting to look like a loser to me again.

Well the answer is of course "that depends".

If you maintain the dead space at a steady state averaging 1.5% instead of 5%
your table is 3.33% smaller on average. If this is a DSS system that will
translate into running your queries 3.33% faster. It will take a lot of
vacuums before they hurt more than a 3%+ performance drop.

If it's an OLTP system the it's harder to figure. a 3.33% increase in data
density will translate to a higher cache hit rate but how much higher depends
on a lot of factors. In our experiments we actually got bigger boost in these
kinds of situations than the I expected (I expected comparable to the 3.33%
improvement). So it could be even more than 3.33%. But like said it depends.
If you already have the whole database cache you won't see any improvement. If
you are right on the cusp you could see a huge benefit.

It sounds like you're underestimating the performance drain 10% wasted space
has. If we found out that one routine was unnecessarily taking 10% of the cpu
time it would be an obvious focus of attention. 10% wasted space is going to
work out to about 10% of the i/o time.

It also sounds like we're still focused on the performance impact in absolute
terms. I'm much more interested in changing the performance characteristics so
they're predictable and scalable. It doesn't matter much if your 1kb table is
100% slower than necessary but it does matter if your 1TB table needs 1,000x
as much vacuuming as your 1GB table even if it's getting the same update
traffic.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Simon Riggs 2007-01-22 18:42:09 Re: [HACKERS] Autovacuum Improvements
Previous Message Bruce Momjian 2007-01-22 18:27:19 Re: [HACKERS] Autovacuum Improvements

Browse pgsql-hackers by date

  From Date Subject
Next Message Simon Riggs 2007-01-22 18:42:09 Re: [HACKERS] Autovacuum Improvements
Previous Message Simon Riggs 2007-01-22 18:38:59 Re: Piggybacking vacuum I/O