
Re: clog_redo causing very long recovery time

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Joseph Conway <mail(at)joeconway(dot)com>
Cc:	PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: clog_redo causing very long recovery time
Date:	2011-05-06 03:22:43
Message-ID:	1756.1304652163@sss.pgh.pa.us
Lists:	pgsql-hackers

Joseph Conway <mail(at)joeconway(dot)com> writes:
> I'm working with a client that uses Postgres on what amounts to an
> appliance.

> The database is therefore subject to occasional torture such as, in this
> particular case, running out of disk space while performing a million
> plus queries (of mixed varieties, many using plpgsql with exception
> handling -- more on that later), and eventually being power-cycled. Upon
> restart, clog_redo was called approx 885000 times (CLOG_ZEROPAGE) during
> recovery, which took almost 2 hours on their hardware. I should note
> that this is on Postgres 8.3.x.

> After studying the source, I can only see one possible way that this
> could have occurred:

> In varsup.c:GetNewTransactionId(), ExtendCLOG() needs to succeed on a
> freshly zeroed clog page, and ExtendSUBTRANS() has to fail. Both of
> these calls can lead to a page being pushed out of shared buffers and to
> disk, so given a lack of disk space, sufficient clog buffers, but lack
> of subtrans buffers, this could happen. At that point the transaction id
> does not get advanced, so clog zeros the same page, ExtendSUBTRANS()
> fails again, rinse and repeat.

> I believe in the case above, subtrans buffers were exhausted due to the
> extensive use of plpgsql with exception handling.

Hmm, interesting. I believe that it's not really a question of buffer
space or lack of it, but whether the OS will give us disk space when we
try to add a page to the current pg_subtrans file. In any case, the
point is that a failure between ExtendCLOG and incrementing nextXid
is bad news.

> The attached fix-clogredo diff is my proposal for a fix for this.

That seems pretty grotty :-(

I think a more elegant fix might be to just swap the order of the
ExtendCLOG and ExtendSUBTRANS calls in GetNewTransactionId. The
reason that would help is that pg_subtrans isn't WAL-logged, so if
we succeed doing ExtendSUBTRANS and then fail in ExtendCLOG, we
won't have written any XLOG entry, and thus repeated failures will not
result in repeated XLOG entries. I seem to recall having considered
exactly that point when the clog WAL support was first done, but the
scenario evidently wasn't considered when subtransactions were stuck
in :-(.
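
To make the ordering argument concrete, here is a toy model in C. It is only a sketch and not PostgreSQL source: the struct and every sim_* name are hypothetical. It assumes nextXid sits on a page boundary (so each retry re-zeroes the same clog page), that ExtendCLOG's WAL record is written only when the clog extension succeeds, and that ExtendSUBTRANS writes no WAL at all.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy state; all names here are illustrative, not PostgreSQL's. */
typedef struct
{
    int  wal_records;     /* CLOG_ZEROPAGE records emitted so far */
    int  next_xid;        /* advanced only when both extensions succeed */
    bool clog_fails;      /* simulate failure while extending pg_clog */
    bool subtrans_fails;  /* simulate failure while extending pg_subtrans */
} SimState;

/* Models ExtendCLOG: on success it zeroes a page and logs that to WAL. */
static bool
sim_extend_clog(SimState *s)
{
    if (s->clog_fails)
        return false;     /* assumed to fail before any WAL is written */
    s->wal_records++;
    return true;
}

/* Models ExtendSUBTRANS: pg_subtrans is not WAL-logged. */
static bool
sim_extend_subtrans(SimState *s)
{
    return !s->subtrans_fails;
}

/* One attempt to assign an XID, using either call order. */
static bool
sim_get_new_xid(SimState *s, bool subtrans_first)
{
    if (subtrans_first)
    {
        if (!sim_extend_subtrans(s) || !sim_extend_clog(s))
            return false;
    }
    else
    {
        if (!sim_extend_clog(s) || !sim_extend_subtrans(s))
            return false;
    }
    s->next_xid++;        /* reached only when nothing failed */
    return true;
}

/* Retry 'attempts' times while pg_subtrans cannot be extended;
 * return how many WAL records those failed attempts left behind. */
int
sim_wal_records_after_retries(bool subtrans_first, int attempts)
{
    SimState s = {0, 0, false, true};

    assert(attempts >= 0);
    for (int i = 0; i < attempts; i++)
        (void) sim_get_new_xid(&s, subtrans_first);
    return s.wal_records;
}
```

In this model, sim_wal_records_after_retries(false, 885000) returns 885000, matching the number of CLOG_ZEROPAGE replays reported above, while sim_wal_records_after_retries(true, 885000) returns 0, because the failing step now comes before anything that writes WAL.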

It would probably also help to put in a comment admonishing people
to not add stuff right there. I see the SSI guys have fallen into
the same trap.

regards, tom lane

In response to

clog_redo causing very long recovery time at 2011-05-02 06:26:05 from Joseph Conway

Responses

Re: clog_redo causing very long recovery time at 2011-05-06 03:29:13 from Alvaro Herrera
Re: clog_redo causing very long recovery time at 2011-05-06 03:41:10 from Joe Conway
Re: clog_redo causing very long recovery time at 2011-05-09 07:22:40 from Simon Riggs
