Re: We need to log aborted autovacuums

From: Greg Smith <greg(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: We need to log aborted autovacuums
Date: 2011-01-16 18:22:38
Message-ID: 4D33376E.2060909@2ndquadrant.com
Lists: pgsql-hackers

Tom Lane wrote:
> No, I don't believe we should be messing with the semantics of
> try_relation_open. It is what it is.
>

With only four pretty simple callers to the thing, and two of them
needing the alternate behavior, it seemed a reasonable place to modify
to me. I thought the "nowait" boolean idea was in enough places that it
was reasonable to attach to try_relation_open.
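As a rough illustration of the "nowait" idea, here is a minimal sketch outside of PostgreSQL, using a plain pthread mutex as a stand-in for a relation lock. Everything in it (DemoRelation, demo_try_open, demo_close, demo_autovacuum_one) is a hypothetical name made up for this example; it is not the attached patch and not the real try_relation_open signature. It only shows the shape of the behavior: block when nowait is false, give up at once and log when nowait is true.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for a lockable relation; not a PostgreSQL struct. */
typedef struct
{
    unsigned int    oid;
    pthread_mutex_t lock;
} DemoRelation;

/*
 * Acquire the relation's lock.  With nowait = true, return false
 * immediately if the lock is already held; with nowait = false, block
 * until it becomes available.
 */
static bool
demo_try_open(DemoRelation *rel, bool nowait)
{
    if (nowait)
        return pthread_mutex_trylock(&rel->lock) == 0;
    return pthread_mutex_lock(&rel->lock) == 0;
}

/* Release the lock taken by demo_try_open. */
static void
demo_close(DemoRelation *rel)
{
    pthread_mutex_unlock(&rel->lock);
}

/*
 * One pass over a single relation, the way a background worker might do
 * it: never wait, and log a message when the relation has to be skipped.
 * Returns true if the relation was processed, false if it was skipped.
 */
static bool
demo_autovacuum_one(DemoRelation *rel)
{
    if (!demo_try_open(rel, true))
    {
        fprintf(stderr,
                "INFO: autovacuum skipping relation %u --- cannot open or obtain lock\n",
                rel->oid);
        return false;
    }

    /* ... the actual cleanup work would go here ... */

    demo_close(rel);
    return true;
}
```

A caller would initialize the mutex with PTHREAD_MUTEX_INITIALIZER and call demo_autovacuum_one(&rel) for each candidate; while some other holder has the lock, the call returns false and logs rather than queueing behind it.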

The attached patch solves the "wait for lock forever" problem, and
introduces a new log message when AV or auto-analyze fails to obtain a
lock on something that needs to be cleaned up:

DEBUG: autovacuum: processing database "gsmith"
INFO: autovacuum skipping relation 65563 --- cannot open or obtain lock
INFO: autoanalyze skipping relation 65563 --- cannot open or obtain lock

My main concern is that this may cause AV to constantly fail to get
access to a busy table, where in the current code it would queue up and
eventually get the lock needed. A secondary issue is that while the
autovacuum messages only show up if you have log_autovacuum_min_duration
set to something other than -1, the autoanalyze ones can't be stopped.

If you don't like the way I structured the code, you can certainly do it
some other way instead. I thought this approach was really simple and
not unlike similar code elsewhere.

Here's the test case that worked for me here again:

psql
SHOW log_autovacuum_min_duration;
DROP TABLE t;
CREATE TABLE t(s serial,i integer);
INSERT INTO t(i) SELECT generate_series(1,100000);
SELECT relname,last_autovacuum,autovacuum_count FROM pg_stat_user_tables
WHERE relname='t';
DELETE FROM t WHERE s<50000;
\q
psql
BEGIN;
LOCK t;

Leave that open, then go to another session and run "tail -f" on the
logs to wait for the errors to show up.

--
Greg Smith 2ndQuadrant US greg(at)2ndQuadrant(dot)com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us
"PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books

Attachment: av-lock-failure-v2.diff (text/x-patch, 7.1 KB)
