Re: Proposal for Recover mode in pg_ctl (in 8.0)

From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Mark Kirkwood <markir(at)coretech(dot)co(dot)nz>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Proposal for Recover mode in pg_ctl (in 8.0)
Date: 2004-11-07 09:01:51
Message-ID: 1099818111.6942.1796.camel@localhost.localdomain
Lists: pgsql-hackers

On Sat, 2004-11-06 at 23:29, Tom Lane wrote:
> Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> > If a further pg_ctl mode, recover, were implemented, this would allow a
> > fail safe mode for recovery.
>
> > e.g. pg_ctl -D datadir recover
>
> > pg_ctl could then check for the existence of a recovery.conf file and
> > return an error if none were found.
>

...

> A possibly more reliable interlock would involve having the postmaster
> probe during normal startup to see if there is already an archived WAL
> segment for what it thinks is the current segment. However there are
> several issues here: one is that if you're doing partial-log-file
> shipping, that isn't necessarily an error condition; another is that
> we don't know how to do such a probe unless more information is added
> to postgresql.conf. We could imagine adding another shell command
> string (something like "test -f ..." perhaps) but if the user gets it
> wrong he may still be left with no protection.
>

Yes, checking the archive is the safe way, but we don't know how to do
that unless restore_command has been successfully read in (currently
from recovery.conf). postgresql.conf is the wrong place for it, because
that file will likely hold the wrong setting when we restore, and I'm
against editing that file as part of a recovery...once edited, you could
lose all context and thus completely screw the recovery.

The suggested change is only about finding a safe way to fail if
restore_command has not been set because recovery.conf is missing for
whatever reason.
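
To make the intended check concrete, here is a minimal sketch in Python
rather than C. It is only an illustration, not the actual pg_ctl or
postmaster code: the function names find_restore_command and
check_recover_mode are hypothetical, and the parsing assumes a simple
single-quoted "restore_command = '...'" line in recovery.conf.

```python
import os
import re

# Hypothetical illustration only: not the real pg_ctl/postmaster logic.
# Assumes a simple line of the form:  restore_command = '...'
_SETTING = re.compile(r"^\s*restore_command\s*=\s*'(.*)'\s*(#.*)?$")


def find_restore_command(datadir):
    """Return restore_command from datadir/recovery.conf, or None."""
    path = os.path.join(datadir, "recovery.conf")
    if not os.path.isfile(path):
        return None  # recovery.conf is missing entirely
    with open(path) as conf:
        for line in conf:
            match = _SETTING.match(line)
            # Commented-out or empty settings do not count as "set".
            if match and match.group(1).strip():
                return match.group(1)
    return None


def check_recover_mode(datadir):
    """Fail safely: refuse to start recovery without a restore_command."""
    command = find_restore_command(datadir)
    if command is None:
        raise SystemExit(
            "recover: no restore_command found in %s"
            % os.path.join(datadir, "recovery.conf")
        )
    return command
```

The point of the sketch is the failure path: if the file or the setting
is absent, the request stops with an error instead of proceeding as a
normal startup.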

> I can't get very excited about this approach, because it only protects
> those people who (a) use pg_ctl to start the postmaster (not everyone)
> and (b) carefully follow the recovery directions (which the people you
> are worried about are very bad at, by hypothesis).
>

Well, I was trying to find the least-risk approach: touching pg_ctl code
this late in beta seemed safer than touching postmaster code. But
agreed, pg_ctl doesn't catch everyone.

The postmaster is the natural place for the test, yes.

This additional keyword should never be placed in any startup script, so
the commonly used interface would not change, just the one used at
recovery time.

Documentation changes are one thing - and as you point out, if they miss
one thing in the manual, they'll miss three. I'm not expecting many
people to actually use an archive created using copy or cp.

I'm with Mark on this: you need all the help you can get at 2am when you
just got called out of bed with a phone call from your boss *requesting*
you to recover your incredibly important production system.

I'll cut the doc changes first, then produce a slim patch on postmaster.

Best Regards, Simon Riggs
