
Re: [PATCHES] Infrastructure changes for recovery

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc:	List pgsql-patches <pgsql-patches(at)postgresql(dot)org>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: [PATCHES] Infrastructure changes for recovery
Date:	2008-09-29 01:16:01
Message-ID:	22856.1222650961@sss.pgh.pa.us
Lists:	pgsql-hackers pgsql-patches

Simon Riggs <simon(at)2ndQuadrant(dot)com> writes:
>> It does nothing AFAICS for the
>> problem that when restarting archive recovery from a restartpoint,
>> it's not clear when it is safe to start letting in backends. You need
>> to get past the highest LSN that has made it out to disk, and there is
>> no good way to know what that is.

> AFAICS when we set minRecoveryLoc we *never* unset it. It's recorded in
> the controlfile, so whenever we restart we can see that it has been set
> previously and now we are beyond it.

Right ...

> So if we crash during recovery and
> then restart *after* we reached minRecoveryLoc then we resume in safe
> mode almost immediately.

Wrong.

minRecoveryLoc is an upper bound for the LSNs that might be
on-disk in the filesystem backup that an archive recovery starts from.
(Defined as such, it never changes during a restartpoint crash/restart.)
Once you pass that, the on-disk state as modified by any dirty buffers
inside the recovery process represents a consistent database state.
However, the on-disk state alone is not guaranteed consistent. As you
flush some (not all) of your shared buffers you enter other
not-certainly-consistent on-disk states. If we crash in such a state,
we know how to use the last restartpoint plus WAL replay to recover to
another state in which disk + dirty buffers are consistent. However,
we reach such a state only when we have read WAL to beyond the highest
LSN that has reached disk --- and in recovery mode there is no clean
way to determine what that was.
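
The failure case can be made concrete with a toy model. This is only a
sketch, not PostgreSQL code: the ToyRecovery class, the page names, and
the LSN values are all invented for illustration, and each "page" holds
nothing but the LSN of the last WAL record applied to it.

```python
from dataclasses import dataclass, field


@dataclass
class ToyRecovery:
    """Toy model: a page is represented only by the LSN last applied to it."""
    wal: list                                     # (lsn, page_id), ascending LSN
    disk: dict = field(default_factory=dict)      # page_id -> LSN on disk
    buffers: dict = field(default_factory=dict)   # page_id -> LSN in dirty buffers

    def replay(self, start_lsn, upto_lsn):
        """Apply WAL records with start_lsn < lsn <= upto_lsn into buffers."""
        for lsn, page in self.wal:
            if start_lsn < lsn <= upto_lsn:
                current = self.buffers.get(page, self.disk.get(page, 0))
                if lsn > current:                 # page already reflects newer WAL
                    self.buffers[page] = lsn

    def flush(self, page):
        """Write one dirty buffer to disk (some, not all, get flushed)."""
        self.disk[page] = self.buffers.pop(page)

    def crash(self):
        """Lose all dirty buffers; only the on-disk state survives."""
        self.buffers.clear()

    def highest_lsn_on_disk(self):
        return max(self.disk.values(), default=0)

    def visible_state(self):
        """Disk as modified by dirty buffers: what a backend would see."""
        state = dict(self.disk)
        state.update(self.buffers)
        return state


def consistent_at(wal, state, lsn):
    """True if state equals the result of replaying all WAL up to lsn."""
    expected = {}
    for rec_lsn, page in wal:
        if rec_lsn <= lsn:
            expected[page] = rec_lsn
    pages = set(state) | set(expected)
    return all(state.get(p, 0) == expected.get(p, 0) for p in pages)


# Scenario: restartpoint at LSN 0, minRecoveryLoc assumed to be 10.
wal = [(10, "A"), (20, "B"), (30, "A"), (40, "B")]
rec = ToyRecovery(wal=wal)
rec.replay(0, 40)      # first recovery run gets all the way to LSN 40
rec.flush("B")         # page B (LSN 40) reaches disk, page A does not
rec.crash()

rec.replay(0, 20)      # restart: already past minRecoveryLoc = 10 ...
ok_at_20 = consistent_at(wal, rec.visible_state(), 20)   # ... but B is "from the future"

rec.replay(20, 40)     # only after passing the highest LSN on disk
ok_at_40 = consistent_at(wal, rec.visible_state(), 40)
highest = rec.highest_lsn_on_disk()
```

In this scenario the restarted recovery is beyond minRecoveryLoc at LSN 20
yet still inconsistent, because page B on disk already carries LSN 40.
The state becomes consistent only once replay reaches LSN 40, a value the
toy can compute by scanning its disk but a real recovery process has no
cheap way to know.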

Perhaps a solution is to make XLogFlush not be a no-op in recovery mode,
but have it scribble a highest-LSN somewhere on stable storage (maybe
scribble on pg_control itself, or maybe better someplace else). I'm
not totally sure about that. But I am sure that doing nothing will
be unreliable.
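
A minimal sketch of that idea follows, assuming the highest LSN is kept
in a small side file. This is not PostgreSQL code: the HighestLsnTracker
class, the file name, and the method names are hypothetical, and the
durability handling is simplified (a real implementation would also have
to fsync the containing directory and decide where the value lives).

```python
import os
import struct
import tempfile


class HighestLsnTracker:
    """Hypothetical: persist the highest page LSN that may have hit disk."""

    def __init__(self, path):
        self.path = path
        self.recorded = self._load()

    def _load(self):
        """Read the recorded LSN left by a previous run, or 0 if none."""
        try:
            with open(self.path, "rb") as f:
                data = f.read(8)
        except FileNotFoundError:
            return 0
        return struct.unpack("<Q", data)[0] if len(data) == 8 else 0

    def xlog_flush_in_recovery(self, page_lsn):
        """Call BEFORE writing out a dirty page whose LSN is page_lsn."""
        if page_lsn <= self.recorded:
            return                      # already covered, nothing to write
        tmp = self.path + ".tmp"
        with open(tmp, "wb") as f:
            f.write(struct.pack("<Q", page_lsn))
            f.flush()
            os.fsync(f.fileno())        # value must be durable before the page is
        os.replace(tmp, self.path)      # atomic rename; directory fsync omitted
        self.recorded = page_lsn

    def safe_to_admit_backends(self, replayed_lsn, min_recovery_loc):
        """Safe only once replay has covered both bounds."""
        return replayed_lsn >= max(min_recovery_loc, self.recorded)


# Usage: a page with LSN 40 is flushed, then recovery crashes and restarts.
workdir = tempfile.mkdtemp()
lsn_file = os.path.join(workdir, "recovery_highest_lsn")   # hypothetical name

before_crash = HighestLsnTracker(lsn_file)
before_crash.xlog_flush_in_recovery(40)

after_restart = HighestLsnTracker(lsn_file)                # reloads 40 from disk
too_early = after_restart.safe_to_admit_backends(20, min_recovery_loc=10)
late_enough = after_restart.safe_to_admit_backends(40, min_recovery_loc=10)
```

The point of the ordering is that the recorded value is written before
the page itself, so after any crash it is an upper bound on what could be
on disk, even if the page write never completed.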

regards, tom lane

In response to

Re: [PATCHES] Infrastructure changes for recovery at 2008-09-29 00:54:19 from Simon Riggs

Responses

Re: [PATCHES] Infrastructure changes for recovery at 2008-09-29 12:33:16 from Simon Riggs
