Re: We need to log aborted autovacuums

From: Greg Smith <greg(at)2ndquadrant(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: We need to log aborted autovacuums
Date: 2011-01-07 04:33:23
Message-ID: 4D269793.6030406@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Josh Berkus wrote:
>> Or should it perhaps be a per-table counter in pg_stat_user_tables,
>> given your statement above?
>>
>
> Or even a timestamp: last_autovacuum_attempt, which would record the
> last time autovacuum was tried. If that's fairly recent and you have a
> large number of dead rows, you know what kind of problem you have and
> can turn on debug.
>

These are both reasonable ideas. But there was just some kickback on
Tomas's "keeping timestamp of the lasts stats reset" patch recently,
from the perspective of trying to limit per-table stats bloat. I think
it's relatively easy to make a case that this situation is difficult
enough to diagnose that a little bit of extra low-level logging is
worthwhile. That Josh and I have both been bit by it enough to be
thinking about patches to make it easier to diagnost suggests it's
obviously too hard to nail down. But is this so common and difficult to
recognize that it's worth making all the table stats bigger? That's a
harder call.

It's already possible to detect the main symptom--dead row percentage is
much higher than the autovacuum threshold, but there's been no recent
autovacuum. That makes me less enthusiastic that there's such a genuine
need to justify the overhead of storing more table stats just to detect
the same thing a little more easily. I've been playing with the Munin
PG plug-in more recently, and I was just thinking of adding a dead row
trend graph/threshold to it to address this general area instead.

We could argue both sides of the trade-off of tracking this directly in
stats for some time, and I'd never expect there to be a clear victory
for either perspective. I've run into this vacuum problem a few times,
but certainly less than I've run into "why is the stats table so huge?"

--
Greg Smith 2ndQuadrant US greg(at)2ndQuadrant(dot)com Baltimore, MD
PostgreSQL Training, Services and Support www.2ndQuadrant.us
"PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ramanujam 2011-01-07 05:10:54 [TEST REPORT] 9.1Alpha3 Feature E.1.4.7.2 in release notes.
Previous Message Greg Smith 2011-01-07 04:11:36 Re: We need to log aborted autovacuums