Skip site navigation (1) Skip section navigation (2)

Re: We need to log aborted autovacuums

From: Greg Smith <greg(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: We need to log aborted autovacuums
Date: 2011-01-16 18:22:38
Message-ID: 4D33376E.2060909@2ndquadrant.com (view raw or flat)
Thread:
Lists: pgsql-hackers
Tom Lane wrote:
> No, I don't believe we should be messing with the semantics of
> try_relation_open.  It is what it is.
>   

With only four pretty simple callers to the thing, and two of them 
needing the alternate behavior, it seemed a reasonable place to modify 
to me.  I thought the "nowait" boolean idea was in enough places that it 
was reasonable to attach to try_relation_open.

Attached patch solves the "wait for lock forever" problem, and 
introduces a new log message when AV or auto-analyze fail to obtain a 
lock on something that needs to be cleaned up:

DEBUG:  autovacuum: processing database "gsmith"
INFO:  autovacuum skipping relation 65563 --- cannot open or obtain lock
INFO:  autoanalyze skipping relation 65563 --- cannot open or obtain lock

My main concern is that this may cause AV to constantly fail to get 
access to a busy table, where in the current code it would queue up and 
eventually get the lock needed.  A secondary issue is that while the 
autovacuum messages only show up if you have log_autovacuum_min_duration 
set to not -1, the autoanalyze ones can't be stopped.

If you don't like the way I structured the code, you can certainly do it 
some other way instead.  I thought this approach was really simple and 
not unlike similar code elsewhere.

Here's the test case that worked for me here again:

psql
SHOW log_autovacuum_min_duration;
DROP TABLE t;
CREATE TABLE t(s serial,i integer);
INSERT INTO t(i) SELECT generate_series(1,100000);
SELECT relname,last_autovacuum,autovacuum_count FROM pg_stat_user_tables 
WHERE relname='t';
DELETE FROM t WHERE s<50000;
\q
psql
BEGIN;
LOCK t;

Leave that open, then go to anther session with old "tail -f" on the 
logs to wait for the errors to show up.

-- 
Greg Smith   2ndQuadrant US    greg(at)2ndQuadrant(dot)com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
"PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books


Attachment: av-lock-failure-v2.diff
Description: text/x-patch (7.1 KB)

In response to

Responses

pgsql-hackers by date

Next:From: Tom LaneDate: 2011-01-16 18:26:38
Subject: Re: We need to log aborted autovacuums
Previous:From: Dimitri FontaineDate: 2011-01-16 18:21:29
Subject: Re: pg_basebackup for streaming base backups

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group