Re: "Stand-in" server recovery techniques

From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Scott Marlowe" <scott(dot)marlowe(at)gmail(dot)com>
Cc: <pgsql-admin(at)postgresql(dot)org>
Subject: Re: "Stand-in" server recovery techniques
Date: 2007-08-23 13:43:54
Message-ID: 46CD48C8.EE98.0025.0@wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

Thanks for the reply, Scott.

>>> On Wed, Aug 22, 2007 at 5:04 PM, in message
<dcc563d10708221504k59e8961fyb830ec8eb83598fc(at)mail(dot)gmail(dot)com>, "Scott Marlowe"
<scott(dot)marlowe(at)gmail(dot)com> wrote:
> On 8/22/07, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov> wrote:
>
>> How do we create the PostgreSQL instance on the stand-in box?
>>
>> (1) Restore the latest base backup and apply all WAL files available.
>> This is likely to be the slowest option.
>
> I'm leaning towards option 1, because you mentioned that you will be
> doing this ONLY in the event the primary on site server can't come
> back up. So, I'm going to assume that you're going to spend a few
> minutes (30 or so) trying to resurrect the primary server.
>
> WHILE doing that, I would have the backup server running the WAL files
> to get ready to go.
>
> The nice thing about this setup is that it's the simplest to
> implement, hence the least error prone

Point taken. That would mean kicking off a process while still in the early
stages of analyzing the problem, and the results would often not be used,
but the cost is minimal.

>> (2) Kick the warm standby on "the farm" into production mode, shut down
>> the instance, and then copy the instance directory. This should be
>> relatively quick and safe, but has the down side of needing to restart
>> the warm standby from the latest base backup afterwards, if that is even
>> possible. It seems like we might need to make a fresh base backup from the
>> (stopped) instance on the warm standby farm.
>
> I'm not sure exactly what you're saying here. If you're saying what I
> think you're saying, then you're already constantly playing the WAL
> files as they arrive from off site. If that's the case then this
> option seems quite attractive in terms of getting you back up and
> running fast.
>
> Since the standby would now become production, making a snapshot of
> the standby before making it production would mean that you can then
> use the snapshot elsewhere for the standby.
>
> If I understand what you're saying correctly.

I'm not talking about running production on the warm standby farm. Given
the load of running all the other warm standby instances, I don't believe
that performance would be acceptable. What I'm talking about is creating
the semaphore file which would kick the PostgreSQL instance for that county
out of archive recovery mode. (Our script would see the semaphore file and
exit with "file not found" when it was unable to immediately provide the
requested file.) The database cluster state reported by pg_controldata
switches from "in archive recovery" to "in production" at that point. We
would then stop that PostgreSQL instance and scp it to the stand-in server.
We would start that up and be in business again within a few minutes.

My biggest concern with this approach is how we would get back into a
backed-up state. I don't want to burden the stand-in machine with a PITR
base backup when users are working remotely and trying to make up for lost
time. If I could somehow use the copy from the warm standby box to make a
base, that would be ideal. That seems like it should be possible, since it
is the starting point from which we would then be generating WAL files in
production, but I'm not confident that I'd know what I was doing.

I could wait for day-end and keep my fingers crossed that the stand-in
didn't also fail, but that's not an entirely comfortable position, even
with our transaction repository.

-Kevin

In response to

Browse pgsql-admin by date

  From Date Subject
Next Message Campbell, Lance 2007-08-23 16:22:02 Installing Postgres
Previous Message Shane Ambler 2007-08-23 12:51:51 Re: Possibilities of PgSQL