Re: Proposals for PITR

From: "Simon Riggs" <simon(at)2ndquadrant(dot)com>
To: "'Tom Lane'" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: <pgsql-hackers-pitr(at)postgresql(dot)org>
Subject: Re: Proposals for PITR
Date: 2004-02-16 23:07:25
Message-ID: 000501c3f4e1$a7b10100$c19d87d9@LaptopDellXP
Lists: pgsql-hackers pgsql-hackers-pitr

Tom,

Thanks for your many comments and practical suggestions. For most of
them I think I should be able to bash something out once I've got my new
dev env sorted.

I'll update the proposal into a design document with some of my earlier
blah taken out and all of your clarifications put in.

There are a few comments on specific points below:

Best Regards, Simon Riggs

>Tom Lane [mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us] writes
> > - Write application to archive WAL files to tape, disk, or network
> > Probably need to do first part, but I'm arguing not to do the copy to
> > tape.
>
> I'd like to somehow see this handled by a user-supplied program or
> script. What we mainly need is to define a good API that lets the
> archiver program understand which WAL segment files to archive when.
>
> > B - Backing up WAL log files
> > -Ordinarily, when old log segment files are no longer needed, they are
> > recycled (renamed to become the next segments in the numbered sequence).
> > This means that the data within them must be copied from there to
> > another location
> > AFTER postgres has closed that file
> > BEFORE it is renamed and recycled
>
> My inclination would be to change the backend code so that as soon as a
> WAL segment is completed, it is flagged as being ready to dump to tape
> (or wherever). Possibly the easiest way to do this is to rename the
> segment file somehow, perhaps "nnn" becomes "nnn.full". Then, after the
> archiver process has properly dumped the file, reflag it as being dumped
> (perhaps rename to "nnn.done"). Obviously there are any number of ways
> we could do this flagging, and depending on an OS rename facility might
> not be the best.
>
> A segment then can be recycled when it is both (a) older than the latest
> checkpoint and (b) flagged as dumped. Note that this approach allows
> dumping of a file to start before the first time at which it could be
> recycled. In the event of a crash and restart, WAL replay has to be
> able to find the flagged segments, so the flagging mechanism can't be
> one that would make this impossible.

That sort of API doesn't do much for my sense of truth-and-beauty, but
it will work and allow us to get to the testing stage and beyond, where
we will, I'm sure, discover many things. When that knowledge is gained
*we* can refactor.

Spawning new post to think through the API in more detail.
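
As a starting point for that discussion, here is a minimal sketch of the
rename-based flagging described above, written in Python rather than
backend C purely for brevity. It assumes all segments live in one
directory and that the ".full" and ".done" suffixes are used exactly as
floated in the quoted text. The function names, the archive directory
and the idea of comparing a decimal segment number against a checkpoint
segment number are illustrative assumptions, not an agreed API.

```python
import os
import shutil


def archive_full_segments(wal_dir, archive_dir):
    """Copy every segment flagged ".full" to archive_dir, then reflag it ".done".

    Hypothetical helper: the suffixes follow the rename idea quoted above;
    nothing here is an existing PostgreSQL interface.
    """
    os.makedirs(archive_dir, exist_ok=True)
    archived = []
    for name in sorted(os.listdir(wal_dir)):
        if not name.endswith(".full"):
            continue  # still being written, or already dumped
        base = name[: -len(".full")]
        src = os.path.join(wal_dir, name)
        # Copy first and reflag only after the copy has succeeded, so a
        # crash in between leaves the segment still marked ".full".
        shutil.copy2(src, os.path.join(archive_dir, base))
        os.rename(src, os.path.join(wal_dir, base + ".done"))
        archived.append(base)
    return archived


def can_recycle(name, checkpoint_segment):
    """True only if the segment is (a) older than the latest checkpoint
    and (b) flagged as dumped.

    Assumes segment names are decimal numbers, which is a simplification.
    """
    if not name.endswith(".done"):
        return False
    return int(name[: -len(".done")]) < checkpoint_segment
```

Because the archiver only looks for ".full" files, it can start dumping a
segment well before that segment becomes eligible for recycling, which is
the property noted in the quoted text.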

> > With full OS file backup, if the database is shutdown correctly, then we
> > will need a way to tell the database "you think you're up to date, but
> > you're not - I've added some more WAL files into the directories, so
> > roll forward on those now please".
>
> I do not think this is an issue either, because my vision of this does
> not include tar backups of shutdown databases. What will be backed up
> is a live database, therefore the postmaster will definitely know that
> it needs to perform WAL replay. What we will need is hooks to make sure
> that the full set of required log files is available.

OK, again let's go for it on that assumption.

Longer term, I would feel more comfortable with a specific "backup
state". Relying on a side-effect of crash recovery for disaster recovery
doesn't give me a warm feeling. BUT, that feeling is for later, not now.
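
The "full set of required log files" check could look roughly like the
sketch below. It assumes, purely for illustration, that segments are
identified by consecutive integers and that recovery knows the first and
last segment it needs; missing_segments is a hypothetical name, not an
existing hook.

```python
def missing_segments(available, first_needed, last_needed):
    """Return the segment numbers in [first_needed, last_needed] that are absent.

    An empty list means the full set of required log files is present.
    """
    have = set(available)
    return [n for n in range(first_needed, last_needed + 1) if n not in have]
```

Recovery would refuse to start, or stop and ask for more files, whenever
the returned list is not empty.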

> It's entirely
> possible that that set of log files exceeds available disk space, so it
> needs to be possible to run WAL replay incrementally, loading and then
> replaying additional log segments after deleting old ones.
> Possibly we could do this with some postmaster command-line switches.
> J. R. Nield's patch embodied an "interactive recovery" backend mode,
> which I didn't like in detail but the general idea is not necessarily
> wrong.

Again, yes, though I will for now aim at the assumption that recovery
can be completed within available disk space, with this as an immediate
add-on when we have something that works.
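
For when we get to that add-on, the incremental replay described in the
quoted text might be structured along these lines. This is a sketch
under stated assumptions: segment sizes are known up front, "replay" is
a caller-supplied function standing in for whatever the backend really
does, and deleting a replayed batch is modelled simply by resetting the
space counter. None of the names are real interfaces.

```python
def replay_incrementally(segment_sizes, disk_budget, replay):
    """Replay segments in order without ever loading more than disk_budget.

    segment_sizes: ordered list of (segment_id, size) pairs.
    replay: callable invoked with the list of segment ids currently loaded.
    Returns the batches in the order they were replayed.
    """
    batches = []
    batch, used = [], 0
    for seg_id, size in segment_sizes:
        if size > disk_budget:
            raise ValueError(f"segment {seg_id} does not fit in the disk budget")
        if used + size > disk_budget:
            # Disk is full: replay what is loaded, then delete it to free space.
            replay(batch)
            batches.append(batch)
            batch, used = [], 0
        batch.append(seg_id)
        used += size
    if batch:
        # Replay whatever is left after the last load.
        replay(batch)
        batches.append(batch)
    return batches
```

With five 16MB segments and room for 48MB, this loads and replays
segments 1 to 3, frees the space, then loads and replays segments 4 and 5.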

That is also the basis for a "warm standby" solution: copy the tar to a
new system (a similar one, as you say), then repeatedly move new WAL
logs across to it, then start up in recover-only mode.

"Recover-only" mode would be initiated by a command line switch, as you
say. This would recover all of the WAL logs, then immediately shut down
again.

The extension to that is what Oli Sennhauser has suggested, which is to
allow the second system to come up in read-only mode.
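
The "repeatedly move new WAL logs across" step of the warm standby idea
could be sketched as below. It assumes the archived segments sit in one
directory whose file names sort in replay order, and that the standby's
WAL directory is reachable as an ordinary path (an NFS mount, say). The
function name is hypothetical, and the recover-only start-up itself is
left out because the command line switch for it does not exist yet.

```python
import os
import shutil


def ship_new_segments(archive_dir, standby_dir):
    """Copy archived segments the standby does not yet have, in name order."""
    os.makedirs(standby_dir, exist_ok=True)
    already = set(os.listdir(standby_dir))
    shipped = []
    for name in sorted(os.listdir(archive_dir)):
        if name in already:
            continue
        # Copy under a temporary name and rename afterwards, so the
        # standby never sees a half-written segment under its final name.
        tmp = os.path.join(standby_dir, name + ".tmp")
        shutil.copy2(os.path.join(archive_dir, name), tmp)
        os.rename(tmp, os.path.join(standby_dir, name))
        shipped.append(name)
    return shipped
```

Run from cron or a loop, each call ships only what is new; the standby
would then be started in recover-only mode to apply those segments.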

Best Regards, Simon Riggs
