Thanks for your many comments and practical suggestions - I think I
should be able to bash most of them out once I've got my new dev
environment set up. I'll update the proposal into a design document with
some of my earlier blah taken out and all of your clarifications put in.
There are a few comments on specific points below:
>Tom Lane [mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us] writes
> > - Write application to archive WAL files to tape, disk, or network
> > Probably need to do first part, but I'm arguing not to do the copy to
> > tape.
> I'd like to somehow see this handled by a user-supplied program or
> script. What we mainly need is to define a good API that lets the
> archiver program understand which WAL segment files to archive when.
> > B - Backing up WAL log files
> > - Ordinarily, when old log segment files are no longer needed, they are
> > recycled (renamed to become the next segments in the numbered sequence).
> > This means that the data within them must be copied from there to
> > another location
> > AFTER postgres has closed that file and
> > BEFORE it is renamed and recycled.
> My inclination would be to change the backend code so that as soon as a
> WAL segment is completed, it is flagged as being ready to dump to tape
> (or wherever). Possibly the easiest way to do this is to rename the
> segment file somehow, perhaps "nnn" becomes "nnn.full". Then, after the
> archiver process has properly dumped the file, reflag it as being done
> (perhaps rename to "nnn.done"). Obviously there are any number of ways
> we could do this flagging, and depending on the OS a rename facility may
> not be the best.
> A segment then can be recycled when it is both (a) older than the
> checkpoint and (b) flagged as dumped. Note that this approach allows
> dumping of a file to start before the first time at which it could be
> recycled. In the event of a crash and restart, WAL replay has to be
> able to find the flagged segments, so the flagging mechanism can't be
> one that would make this impossible.
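As a rough illustration of the flagging scheme above, here is a minimal Python sketch. The `.full`/`.done` suffixes follow the quoted suggestion, but the function names and the directory layout are illustrative assumptions, not actual PostgreSQL code:

```python
import os

def mark_segment_full(wal_dir, seg):
    """Backend side: flag a completed segment as ready to archive."""
    os.rename(os.path.join(wal_dir, seg),
              os.path.join(wal_dir, seg + ".full"))

def mark_segment_done(wal_dir, seg):
    """Archiver side: after the segment is safely copied away,
    reflag it as archived."""
    os.rename(os.path.join(wal_dir, seg + ".full"),
              os.path.join(wal_dir, seg + ".done"))

def recyclable_segments(wal_dir, last_checkpoint_seg):
    """A segment may be recycled only when it is both (a) older than
    the last checkpoint and (b) flagged as dumped (.done)."""
    out = []
    for name in os.listdir(wal_dir):
        base = name[:-len(".done")]
        if name.endswith(".done") and base < last_checkpoint_seg:
            out.append(name)
    return sorted(out)
```

Because the flags live in the filesystem as renames, a crash-restart scan of the WAL directory can rediscover which segments are `.full` (still needing archiving) versus `.done`, which is what the replay-visibility requirement above asks for.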
That sort of API doesn't do much for my sense of truth-and-beauty, but
it will work and allow us to get to the testing stage, beyond which we
will, I'm sure, discover many things. When that knowledge is gained *we*
can refine it. I'm spawning a new post to think through the API in more
detail.
> > With full OS file backup, if the database is shutdown correctly, we
> > will need a way to tell the database "you think you're up to date, but
> > you're not - I've added some more WAL files into the directories, so
> > roll forward on those now please".
> I do not think this is an issue either, because my vision of this does
> not include tar backups of shutdown databases. What will be backed up
> is a live database, therefore the postmaster will definitely know that
> it needs to perform WAL replay. What we will need is hooks to make sure
> that the full set of required log files is available.
OK, again let's go for it on that assumption.
Longer term, I would feel more comfortable with a specific "backup
state". Relying on a side-effect of crash recovery for disaster recovery
doesn't give me a warm feeling. BUT, that feeling is for later, not now.
> It's entirely
> possible that that set of log files exceeds available disk space, so it
> needs to be possible to run WAL replay incrementally, loading and then
> replaying additional log segments after deleting old ones.
> Possibly we could do this with some postmaster command-line switches.
> J. R. Nield's patch embodied an "interactive recovery" backend mode,
> which I didn't like in detail, but the general idea is not necessarily bad.
Again, yes, though I will for now aim at the assumption that recovery
can be completed within available disk space, with this as an immediate
add-on when we have something that works.
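The incremental-replay idea could be sketched as follows. This is a hedged Python simulation; `fetch_segment`, `replay_segment`, `delete_segment`, and the disk-budget bookkeeping are hypothetical names for illustration, not a real PostgreSQL interface:

```python
def incremental_replay(segment_names, fetch_segment, replay_segment,
                       delete_segment, max_on_disk):
    """Replay an arbitrarily long WAL sequence while never holding more
    than max_on_disk segments locally: delete the oldest already-replayed
    segment to make room, then load and replay the next one."""
    on_disk = []
    replayed = []
    for seg in segment_names:
        if len(on_disk) >= max_on_disk:
            oldest = on_disk.pop(0)   # already replayed, safe to remove
            delete_segment(oldest)
        fetch_segment(seg)            # e.g. restore from the archive
        replay_segment(seg)
        on_disk.append(seg)
        replayed.append(seg)
    return replayed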
That is also the basis for a "warm standby" solution: Copy the tar to a
new system (similar as you say), then repeatedly move new WAL logs
across to it, then startup in recover-only mode.
"Recover-only" mode would be initiated by a command line switch, as you
say. This would recover all of the WAL logs, then immediately shutdown
The extension to that is what Oli Sennhauser has suggested, which is to
allow the second system to come up in read-only mode.
Best Regards, Simon Riggs
In response to
pgsql-hackers by date
|Next:||From: markw||Date: 2004-02-16 23:41:07|
|Subject: Re: Proposed Query Planner TODO items |
|Previous:||From: Simon Riggs||Date: 2004-02-16 23:07:24|
|Subject: Re: Slow DROP INDEX|
pgsql-hackers-pitr by date
|Next:||From: Bruce Momjian||Date: 2004-02-17 19:40:14|
|Subject: Re: |
|Previous:||From: Tom Lane||Date: 2004-02-15 18:13:04|
|Subject: Re: Proposals for PITR |