Re: 9.2.3 crashes during archive recovery

From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: 9.2.3 crashes during archive recovery
Date: 2013-02-13 09:04:13
Message-ID: 511B570D.4000609@vmware.com
Lists: pgsql-hackers

On 13.02.2013 09:46, Kyotaro HORIGUCHI wrote:
> In this case, the FINAL consistency point is at the
> XLOG_SMGR_TRUNCATE record, but the current implementation does not
> record the consistency point (checkpoint, commit, or smgr_truncate)
> itself, so we cannot predict the final consistency point at the
> start of recovery.

Hmm, what you did was basically:

1. Run server normally.
2. Kill it with "pg_ctl stop -m immediate".
3. Create a recovery.conf file, turning the server into a hot standby.

Without step 3, the server would perform crash recovery, and it would
work. But because of the recovery.conf file, the server goes into
archive recovery, and because minRecoveryPoint is not set, it assumes
that the system is consistent from the start.

Aside from the immediate issue with truncation, the system really isn't
consistent until the WAL has been replayed far enough, so it shouldn't
open for hot standby queries. There might be other, later, changes
already flushed to data files. The system has no way of knowing how far
it needs to replay the WAL to become consistent.
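
To illustrate why an unset minRecoveryPoint makes the server believe it is
consistent right away, here is a toy model in Python. It is not the actual
server code; the function name and the use of 0 as the "not set" value are
assumptions made for the sketch.

```python
# Toy model only: names and the integer LSN representation are hypothetical.
INVALID_LSN = 0  # stand-in for "minRecoveryPoint is not set"


def looks_consistent(replayed_lsn: int, min_recovery_point: int) -> bool:
    """Return True once replay has reached the recorded minimum recovery point.

    When min_recovery_point is INVALID_LSN, the comparison holds for every
    replay position, so the system is treated as consistent from the start.
    """
    return replayed_lsn >= min_recovery_point


# No recorded point: "consistent" before any WAL has been replayed.
print(looks_consistent(0, INVALID_LSN))
# A recorded point that replay has not reached yet: not consistent.
print(looks_consistent(1000, 5000))
```

In the scenario above, the first call models what happens: nothing tells the
server that it still has to replay further before opening for queries.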

At least in back-branches, I'd call this a pilot error. You can't turn a
master into a standby just by creating a recovery.conf file. At least
not if the master was not shut down cleanly first.

If there's a use case for doing that, maybe we can do something better
in HEAD. If the control file says that the system was running
(DB_IN_PRODUCTION), but there is a recovery.conf file, we could do crash
recovery first, until we reach the end of WAL, and go into archive
recovery mode after that. We'd recover all the WAL files in pg_xlog as
far as we can, same as in crash recovery, and only start restoring files
from the archive once we reach the end of WAL in pg_xlog. At that point,
we'd also consider the system as consistent, and start up for hot standby.
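
A rough sketch of that idea, in Python rather than the server's C, just to
make the ordering explicit. The function and enum names are made up for
illustration, the state values are plain strings here, and the clean-shutdown
cases are my simplification.

```python
from enum import Enum
from typing import List


class Phase(Enum):
    CRASH_RECOVERY = "replay WAL from pg_xlog to its end"
    ARCHIVE_RECOVERY = "restore and replay WAL from the archive"


def plan_recovery(db_state: str, has_recovery_conf: bool) -> List[Phase]:
    """Hypothetical startup plan for the proposed HEAD behaviour."""
    was_running = db_state == "DB_IN_PRODUCTION"
    phases: List[Phase] = []
    if was_running:
        # Not shut down cleanly: replay everything in pg_xlog first.
        phases.append(Phase.CRASH_RECOVERY)
    if has_recovery_conf:
        # Only after the local WAL is exhausted, switch to the archive.
        # In the proposal, this switch is also the point where the system
        # is considered consistent and hot standby can start.
        phases.append(Phase.ARCHIVE_RECOVERY)
    return phases


print(plan_recovery("DB_IN_PRODUCTION", True))
```

With that ordering, the case from this thread (immediate shutdown followed by
creating recovery.conf) would go through both phases instead of jumping
straight into archive recovery.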

I'm not sure that's worth the trouble, though. Perhaps it would be
better to just throw an error if the control file state is
DB_IN_PRODUCTION and a recovery.conf file exists. The admin can always
start the server normally first, shut it down cleanly, and then create
the recovery.conf file.
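
The simpler alternative amounts to a single check at startup. Again as a
Python sketch with invented names (StartupRefused and check_startup are not
real server symbols, and the message text is only an example):

```python
class StartupRefused(Exception):
    """Hypothetical error for an unsupported startup combination."""


def check_startup(db_state: str, has_recovery_conf: bool) -> None:
    """Refuse archive recovery on a cluster that was not shut down cleanly."""
    if db_state == "DB_IN_PRODUCTION" and has_recovery_conf:
        raise StartupRefused(
            "control file says the system was in production, but "
            "recovery.conf exists; start the server normally, shut it "
            "down cleanly, and then create recovery.conf"
        )


# Clean shutdown followed by recovery.conf passes the check.
check_startup("DB_SHUTDOWNED", True)
```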

> On the other hand, updating the control file on every commit or
> smgr_truncate would slow down transactions.

To be precise, we'd need to update the control file on every
XLogFlush(), like we do during archive recovery. That would indeed be
unacceptable from a performance point of view. Updating the control file
that often would also be bad for robustness.

- Heikki
