Re: Autovacuum vs statement_timeout

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Autovacuum vs statement_timeout
Date: 2007-03-30 17:14:35
Message-ID: 21307.1175274875@sss.pgh.pa.us
Lists: pgsql-hackers

I wrote:
> Heikki Linnakangas <heikki(at)enterprisedb(dot)com> writes:
>> statement_timeout interrupts seem to go through the PG_CATCH-block and
>> clean up the entry from the vacuum cycle array as they should. But a
>> SIGINT leading to a "terminating connection due to administrator
>> command" error does not.

> Hm, that's an interesting thought, but there are no "terminating
> connection" messages in Shuttleworth's logs either. So we still lack
> the right idea there. (BTW it would be SIGTERM not SIGINT.)

Hold it ... stop the presses ... the reason we saw no "terminating
connection" messages was that he was grepping his logs for lines containing
ERROR. Once we look for FATAL too, there are a pile of 'em. I'm not
100% convinced that any are from autovacuum processes, but clearly
*something* is throwing SIGTERM around with abandon in his test
environment. So at this point your theory above looks like a plausible
mechanism for the vacuum cycle array to slowly fill up and eventually
make _bt_start_vacuum fail (or, perhaps, fail sooner than that due to
a repeat vacuum attempt).
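
A minimal sketch of the leak mechanism described above, not the actual
nbtree code. It assumes a small fixed-size array with one slot per active
vacuum; MAX_SLOTS, CycleSlot, start_vacuum, end_vacuum and reset_slots are
hypothetical names chosen for illustration. A vacuum that dies without
calling end_vacuum leaves its slot behind, so a later vacuum of the same
relation fails at once, and enough leaks make every start fail.

```c
#include <stdbool.h>

#define MAX_SLOTS 4             /* assumption: tiny, for illustration only */

typedef struct
{
    unsigned rel_id;            /* relation being vacuumed */
    bool     used;              /* slot currently claimed? */
} CycleSlot;

typedef enum { START_OK, START_DUPLICATE, START_FULL } StartResult;

/* Stand-in for the shared-memory vacuum cycle array. */
static CycleSlot slots[MAX_SLOTS];

/* Forget all entries (test helper). */
void reset_slots(void)
{
    for (int i = 0; i < MAX_SLOTS; i++)
        slots[i].used = false;
}

/* Claim a slot; fail on a stale or live entry for rel_id, or if full. */
StartResult start_vacuum(unsigned rel_id)
{
    int free_idx = -1;

    for (int i = 0; i < MAX_SLOTS; i++)
    {
        if (slots[i].used && slots[i].rel_id == rel_id)
            return START_DUPLICATE;     /* the "repeat vacuum attempt" case */
        if (!slots[i].used && free_idx < 0)
            free_idx = i;
    }
    if (free_idx < 0)
        return START_FULL;              /* array slowly filled up by leaks */

    slots[free_idx].rel_id = rel_id;
    slots[free_idx].used = true;
    return START_OK;
}

/* Release the slot for rel_id; this is the step a killed vacuum skips. */
void end_vacuum(unsigned rel_id)
{
    for (int i = 0; i < MAX_SLOTS; i++)
        if (slots[i].used && slots[i].rel_id == rel_id)
            slots[i].used = false;
}
```

In this model, a start/end pair can be repeated indefinitely, whereas a
start with no matching end makes the next start on the same relation return
START_DUPLICATE, and MAX_SLOTS such leaks on distinct relations make every
further start return START_FULL.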

>> I think we need to add the xid of the vacuum transaction in the vacuum
>> cycle array, and clean up orphaned entries in _bt_start_vacuum. We're
>> going to have a hard time plugging every leak one-by-one otherwise.

> You're thinking too small --- what this thought actually suggests is
> that PG_CATCH can't be used to clean up shared memory at all, and I
> don't think we want to accept that. (I see several other places already
> where we assume we can do that. We could convert each one into an
> on_proc_exit cleanup operation, maybe, but that seems messy and not very
> scalable.) I'm thinking we may want to redesign elog(FATAL) processing
> so that we escape out to the outer level before calling proc_exit,
> thereby allowing CATCH blocks to run first.

I was hoping we could do that just as an 8.3 change, but it's now
starting to look like we might have to back-patch it, depending on how
much we care about surviving random SIGTERM attempts. I'd like to wait
for some report from Mark about what's causing all the SIGTERMs before
we evaluate that.
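
The FATAL-versus-CATCH problem quoted above can be modeled in plain C with
setjmp/longjmp. This is only a toy under stated assumptions, not PostgreSQL's
elog machinery: report, vacuum_step, slot_leaks and the fatal_unwinds switch
are hypothetical, and a longjmp to exit_point stands in for proc_exit so that
the result can be inspected afterwards. With fatal_unwinds false, a FATAL
report leaves directly and the cleanup branch never runs; with it true, the
report first unwinds to the catch point and only then exits, which is the
redesign being proposed.

```c
#include <setjmp.h>
#include <stdbool.h>

typedef enum { LVL_ERROR, LVL_FATAL } Level;

static bool    slot_in_use;     /* stand-in for a shared-memory entry */
static bool    fatal_pending;   /* a FATAL is being unwound */
static jmp_buf catch_point;     /* stand-in for the PG_CATCH target */
static jmp_buf exit_point;      /* stand-in for proc_exit() */

/* Report an error; where control goes depends on level and design. */
static void report(Level lvl, bool fatal_unwinds)
{
    if (lvl == LVL_FATAL && !fatal_unwinds)
        longjmp(exit_point, 1);         /* exit at once, skipping cleanup */

    fatal_pending = (lvl == LVL_FATAL);
    longjmp(catch_point, 1);            /* unwind to the cleanup branch */
}

/* Claim the slot, fail, and clean up in the "catch" branch if reached. */
static void vacuum_step(Level lvl, bool fatal_unwinds)
{
    slot_in_use = true;

    if (setjmp(catch_point) == 0)
        report(lvl, fatal_unwinds);
    else
    {
        slot_in_use = false;            /* cleanup done by the catch block */
        if (fatal_pending)
            longjmp(exit_point, 1);     /* now continue to process exit */
    }
}

/* Run one failing vacuum step; return true if the slot was left behind. */
bool slot_leaks(Level lvl, bool fatal_unwinds)
{
    slot_in_use = false;
    fatal_pending = false;

    if (setjmp(exit_point) == 0)
        vacuum_step(lvl, fatal_unwinds);

    return slot_in_use;
}
```

In this model an ordinary ERROR never leaks the slot, a FATAL leaks it when
it bypasses the catch point, and a FATAL that is routed through the catch
point before exiting does not.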

regards, tom lane
