Proposal for Recover mode in pg_ctl (in 8.0)

From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Proposal for Recover mode in pg_ctl (in 8.0)
Date: 2004-11-06 22:05:20
Message-ID: 1099778720.6942.444.camel@localhost.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


Joachim Wieland has diligently and sensibly pointed out a potential for
user error with the current PITR implementation. This is not a bug *per
se*, but is a design flaw that more than one person could fall into. It
is a minor issue and not that likely, since the manual describes what to
do...but I'm proposing a possible solution nonetheless, since this error
is likely to occur whilst learning the PITR functionality and might
potentially dissuade users from using the approach.

Following restoration of a base backup, archive recovery is entered by
placing a recovery.conf file in the data directory and then restarting
the server using pg_ctl start. If the recovery.conf file is misnamed,
e.g. Recovery.conf, or if it is misplaced, or simply absent, then the
server start will not enter archive recovery, yet will start normally.
The precise difference might well not be apparent. Subsequent server
activity could potentially overwrite archived log files in a poorly
managed archive. Both of those situations are likely to occur
simultaneously with inexperienced users.

The first fix to this is clearly to document the possibility and warn
the user of this possibility - i.e. describe what NOT to do when
invoking archive recovery. This will be submitted shortly.

A further fix is to implement a fail safe mode for invoking recovery.
Rather than making the recovery a normal server start, which then
searches for recovery.conf, it would be better to indicate that the
server start is expected to be a recovery and so the absence of a
recovery.conf should be regarded as an error.

If a further pg_ctl mode, recover, were implemented, this would allow a
fail safe mode for recovery.

e.g. pg_ctl -D datadir recover

pg_ctl could then check for the existence of a recovery.conf file and
return an error if none were found. This mode would not then need to be
passed through to the postmaster as an option, which could accidentally
be re-invoked later should a crash recovery occur (the main reason to
avoid adding recovery.conf options to postgresql.conf in the first
place).

This mode would do nothing more than:
- check for recovery.conf, if not found, return error
- invoke a start normally, as if mode=start had been requested

The doc for invoking recovery could then be changed to include this new
mode, and the potential for error would be removed.

This is a change requested for 8.0, to ensure that PITR doesn't fall
into disrepute by novice users shooting themselves in the foot. It is a
minor change, effecting only PITR, and only to the extent of a
documentation change and a file existence check in pg_ctl. No server
code would be changed.

Alternative suggestions are welcome. Thoughts?

--
Best Regards, Simon Riggs

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2004-11-06 23:12:22 Re: relative_path() seems overly complicated and buggy
Previous Message Bruce Momjian 2004-11-06 21:39:21 Re: relative_path() seems overly complicated and buggy