Re: [PATCHES] Infrastructure changes for recovery

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: List pgsql-patches <pgsql-patches(at)postgresql(dot)org>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCHES] Infrastructure changes for recovery
Date: 2008-09-28 18:02:25
Message-ID: 9794.1222624945@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-patches

Simon Riggs <simon(at)2ndQuadrant(dot)com> writes:
> On Thu, 2008-09-25 at 18:28 -0400, Tom Lane wrote:
>> After reading this for awhile, I realized that there is a rather
>> fundamental problem with it: it switches into "consistent recovery"
>> mode as soon as it's read WAL beyond ControlFile->minRecoveryPoint.
>> In a crash recovery situation that typically is before the last
>> checkpoint (if indeed it's not still zero), and what that means is
>> that this patch will activate the bgwriter and start letting in
>> backends instantaneously after a crash, long before we can have any
>> certainty that the DB state really is consistent.
>>
>> In a normal crash recovery situation this would be easily fixed by
>> simply not letting it go to "consistent recovery" state at all, but
>> what about recovery from a restartpoint? We don't want a slave that's
>> crashed once to never let backends in again. But I don't see how to
>> determine that we're far enough past the restartpoint to be consistent
>> again. In crash recovery we assume (without proof ;-)) that we're
>> consistent once we reach the end of valid-looking WAL, but that rule
>> doesn't help for a slave that's following a continuing WAL sequence.
>>
>> Perhaps something could be done based on noting when we have to pull in
>> a WAL segment from the recovery_command, but it sounds like a pretty
>> fragile assumption.

> Seems like we just say we only signal the postmaster if
> InArchiveRecovery. Archive recovery from a restartpoint is still archive
> recovery, so this shouldn't be a problem in the way you mention. The
> presence of recovery.conf overrides all other cases.

What that implements is my comment that we don't have to let anyone in
at all during a plain crash recovery. It does nothing AFAICS for the
problem that when restarting archive recovery from a restartpoint,
it's not clear when it is safe to start letting in backends. You need
to get past the highest LSN that has made it out to disk, and there is
no good way to know what that is.

Unless we can get past this problem the whole thing seems a bit dead in
the water :-(

>> * I'm a bit uncomfortable with the fact that the
>> IsRecoveryProcessingMode flag is read and written with no lock.

> It's not a dynamic state, so I can fix that inside
> IsRecoveryProcessingMode() with a local state to make check faster.

Erm, this code doesn't look like it can allow IsRecoveryProcessingMode
to become locally true in the first place? I guess you could fix it
by initializing IsRecoveryProcessingMode to true, but that seems likely
to break other places. Maybe better is to have an additional local
state variable showing whether the flag has ever been fetched from
shared memory.

The other issues don't seem worth arguing about ...

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Stephen Frost 2008-09-28 18:51:41 Re: Proposal: move column defaults into pg_attribute along with attacl
Previous Message Tom Lane 2008-09-28 17:15:49 Re: FSM rewrite: doc changes

Browse pgsql-patches by date

  From Date Subject
Next Message Tom Lane 2008-09-28 19:57:49 Re: [HACKERS] get_relation_stats_hook()
Previous Message Simon Riggs 2008-09-26 14:52:20 Re: [HACKERS] get_relation_stats_hook()