Re: BUG #4879: bgwriter fails to fsync the file in recovery mode

From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Simon Riggs <simon(at)2ndQuadrant(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #4879: bgwriter fails to fsync the file in recovery mode
Date: 2009-06-25 19:29:24
Message-ID: 4A43D014.90309@enterprisedb.com
Lists: pgsql-bugs

Tom Lane wrote:
> While nosing around the problem areas, I think I've found yet another
> issue here. The global bool InRecovery is only maintained correctly
> in the startup process, which wasn't a problem before 8.4. However,
> if we are making the bgwriter execute the end-of-recovery checkpoint,
> there are multiple places where it is tested that are going to be
> executed by bgwriter. I think (but am not 100% sure) that these
> are all the at-risk references:
> XLogFlush
> CheckPointMultiXact
> CreateCheckPoint (2 places)
> Heikki's latest patch deals with the tests in CreateCheckPoint (rather
> klugily IMO) but not the others. I think it might be better to fix
> things so that InRecovery is maintained correctly in the bgwriter too.

We could set InRecovery=true in CreateCheckPoint if it's a startup
checkpoint, and reset it afterwards. I'm not 100% sure it's safe to have
bgwriter running with InRecovery=true at other times. Grepping for
InRecovery doesn't show anything that bgwriter calls, but setting it only
for the duration of the checkpoint feels safer.
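
A minimal sketch of that set-and-reset idea, not the actual patch. InRecovery
here is a local stand-in for the real global, and every sketch_/SKETCH_ name
is hypothetical and exists only for illustration. The sketch assumes bgwriter
otherwise runs with the flag clear.

```c
#include <assert.h>
#include <stdbool.h>

/* Local stand-in for PostgreSQL's process-global InRecovery flag. */
bool InRecovery = false;

/* Hypothetical flag bit for this sketch; not PostgreSQL's real value. */
#define SKETCH_CHECKPOINT_END_OF_RECOVERY 0x0001

/* Records what the checkpoint body observed, for illustration only. */
bool sketch_saw_in_recovery = false;

/* Hypothetical stand-in for the checkpoint code that tests InRecovery. */
void sketch_do_checkpoint_work(void)
{
    sketch_saw_in_recovery = InRecovery;
}

void sketch_create_checkpoint(int flags)
{
    bool end_of_recovery = (flags & SKETCH_CHECKPOINT_END_OF_RECOVERY) != 0;

    /* Assumption of this sketch: the flag is clear outside this call. */
    assert(!InRecovery);

    /* Set the flag only while the end-of-recovery checkpoint runs. */
    if (end_of_recovery)
        InRecovery = true;

    sketch_do_checkpoint_work();

    /* Reset afterwards so later bgwriter work sees the flag clear. */
    InRecovery = false;
}
```

One caveat with this shape: if the checkpoint body exits through an error
path instead of returning normally, the reset at the end is skipped, so a
real version would also need to clear the flag during error cleanup.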

Hmm, I see another small issue. We now keep track of the "minimum
recovery point". Whenever a data page is flushed, we set minimum
recovery point to the LSN of the page in XLogFlush(), instead of
fsyncing WAL like we do in normal operation. During the end-of-recovery
checkpoint, however, RecoveryInProgress() returns false, so we don't
update minimum recovery point in XLogFlush(). You're unlikely to be
bitten by that in practice; you would need to crash during the
end-of-recovery checkpoint, and then set the recovery target to an
earlier point. It should be fixed nevertheless.
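
To make the gap concrete, here is a simplified model of the branch in
XLogFlush(), not the real code. SketchState, SketchLsn and sketch_xlog_flush
are hypothetical names, the LSN is reduced to a plain integer, and the
"only ever advance" rule is an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified LSN; not PostgreSQL's XLogRecPtr. */
typedef uint64_t SketchLsn;

typedef struct
{
    bool      recovery_in_progress; /* what RecoveryInProgress() would report */
    SketchLsn min_recovery_point;   /* recovery must replay at least this far */
    SketchLsn wal_flushed_upto;     /* WAL known to be on disk */
} SketchState;

/* Called before a data page with the given LSN is written out. */
void sketch_xlog_flush(SketchState *st, SketchLsn page_lsn)
{
    assert(st != NULL);

    if (st->recovery_in_progress)
    {
        /* In recovery: advance the minimum recovery point, no WAL fsync. */
        if (page_lsn > st->min_recovery_point)
            st->min_recovery_point = page_lsn;
        return;
    }

    /* Normal operation: make sure WAL is flushed up to the page's LSN. */
    if (page_lsn > st->wal_flushed_upto)
        st->wal_flushed_upto = page_lsn;
}
```

If recovery_in_progress has already turned false when the end-of-recovery
checkpoint flushes pages, those flushes take the second branch and
min_recovery_point stays where it was, which is the situation described
above.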

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
