Re: Point in Time Recovery

From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Zeugswetter Andreas SB SD <ZeugswetterA(at)spardat(dot)at>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Point in Time Recovery
Date: 2004-07-28 16:06:32
Message-ID: 200407281606.i6SG6Wq25696@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


We need someone to code two backend functions to complete PITR. The
function would be called at start/stop of backup of the data directory.
The functions would be checked during restore to make sure the requested
xid is not between the start/stop xids of the backup. They would also
contain timestamps so the admin can easily review the archive directory.

The start needs to call checkpoint and create file in the data directory
that contains a few server parameters. At backup stop the function
needs to move the file to pg_xlog and set the *.ready archive flag so it
is archived.

As for checking during recover, the file needs to be retrieved and
checked to see the xid recovery is valid. Tom and I can help you with
that detail.

DOn't worry about all the details of the email below. It is just a
general summary. We can give you details once you volunteer.

---------------------------------------------------------------------------

Bruce Momjian wrote:
>
> OK, I think I have some solid ideas and reasons for them.
>
> First, I think we need server-side functions to call when we start/stop
> the backup. The advantage of these server-side functions is that they
> will do the required work of recording the pg_control values and
> creating needed files with little chance for user error. It also allows
> us to change the internal operations in later releases without requiring
> admins to change their procedures. We are even able to adjust the
> internal operation in minor releases without forcing a new procedure on
> users.
>
> Second, I think once we start a restore, we should rename recovery.conf
> to recovery.in_progress, and when complete rename that to
> recovery.done. If the postmaster starts and sees recovery.in_progress,
> it will fail to start knowing its recovery was interrupted. This allows
> the admin to take appropriate action. (I am not sure what that action
> would be. Does he bring back the backup files or just keep going?)
>
> Third, I think we need to put a file in the archive location once we
> complete a backup, recording the start/stop xid and wal/offsets. This
> gives the admin documentation on what archive logs to keep and what xids
> are available for recovery. Ideally the recover program would read that
> file and check the recover xid to make sure it is after the stop xid
> recorded in the file.
>
> How would the recover program know the name of that file? We need to
> create it in /data with start contents before the backup, then complete
> it with end contents and archive it.
>
> What should we name it? Ideally it would be named by the WAL
> name/offset of the start so it orders in the proper spot in the archive
> file listing, e.g.:
>
> 000000000000093a
> 000000000000093b
> 000000000000093b.032b9.start
> 000000000000093c
>
> Are people going to know they need 000000000000093b for
> 000000000000093b.032b9.start? I hope so. Another idea is to do:
>
>
> 000000000000093a.xlog
> 000000000000093b.032b9.start
> 000000000000093b.xlog
> 000000000000093c.xlog
>
> This would order properly. It might be a very good idea to add
> extensions to these log files now that we are archiving them in strange
> places. In fact, maybe we should use *.pg_xlog to document the
> directory they came from.
>
> ---------------------------------------------------------------------------
>
>
> Simon Riggs wrote:
> > On Fri, 2004-07-16 at 16:47, Tom Lane wrote:
> > > As far as the business about copying pg_control first goes: there is
> > > another way to think about it, which is to copy pg_control to another
> > > place that will be included in your backup. For example the standard
> > > backup procedure could be
> > >
> > > 1. [somewhat optional] Issue CHECKPOINT and wait till it finishes.
> > >
> > > 2. cp $PGDATA/global/pg_control $PGDATA/pg_control.dump
> > >
> > > 3. tar cf /dev/mt $PGDATA
> > >
> > > 4. do something to record ending WAL position
> > >
> > > If we standardized on this way, then the tar archive would automatically
> > > contain the pre-backup checkpoint position in ./pg_control.dump, and
> > > there is no need for any special assumptions about the order in which
> > > tar processes things.
> > >
> >
> > Sounds good. That would be familiar to Oracle DBAs doing BACKUP
> > CONTROLFILE. We can document that and off it as a suggested procedure.
> >
> > > However, once you decide to do things like that, there is no reason why
> > > the copied file has to be an exact image of pg_control. I claim it
> > > would be more useful if the copied file were plain text so that you
> > > could just "cat" it to find out the starting WAL position; that would
> > > let you determine without any special tools what range of WAL archive
> > > files you are going to need to bring back from your archives.
> >
> > I wouldn't be in favour of a manual mechanism. If you want an automated
> > mechanism, whats wrong with using the one thats already there? You can
> > use pg_controldata to read the controlfile, again whats wrong with that?
> >
> > We agreed some time back that an off-line xlog file inspector would be
> > required to allow us to inspect the logs and make a decision about where
> > to end recovery. You'd still need that.
> >
> > It's scary enough having to specify the end point, let alone having to
> > specify the starting point as well.
> >
> > At your request, and with Bruce's idea, I designed and built the
> > recovery system so that you don't need to know what range of xlogs to
> > bring back. You just run it, it brings back the right files from archive
> > and does recovery with them, then cleans up - and it works without
> > running out of disk space on long recoveries.
> >
> > I've built it now and it works...
> >
> > > This is pretty much the same chain of reasoning that Bruce and I went
> > > through yesterday to come up with the idea of putting a label file
> > > inside the tar backups. We concluded that it'd be worth putting
> > > both the backup starting time and the checkpoint WAL position into
> > > the label file --- the starting time isn't needed for restore but
> > > might be really helpful as documentation, if you needed to verify
> > > which dump file was which.
> >
> > ...if you are doing tar backups...what will you do if you're not using
> > that mechanism?
> >
> > If you are: It's common practice to make up a backup filename from
> > elements such as systemname, databasename, date and time etc. That gives
> > you the start time, the file last mod date gives you the end time.
> >
> > I think its perfectly fine for everybody to do backups any way they
> > please. There are many licenced variants of PostgreSQL and it might be
> > appropriate in those to specify particular ways of doing things.
> >
> > I'll be trusting the management of backup metadata and storage media to
> > a solution designed for the purpose (open or closed source), just as
> > I'll be trusting my data to a database solution designed for that
> > purpose. That for me is one of the good things about PostgreSQL - we use
> > the filesystem, we don't write our own, we provide language interfaces
> > not invent our own proprietary server language etc..
> >
> > Best Regards, Simon Riggs
> >
>
> --
> Bruce Momjian | http://candle.pha.pa.us
> pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
> + If your life is a hard drive, | 13 Roberts Road
> + Christ can be your backup. | Newtown Square, Pennsylvania 19073
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/docs/faqs/FAQ.html
>

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2004-07-28 16:19:50 Re: Point in Time Recovery
Previous Message Tom Lane 2004-07-28 14:29:31 Re: [HACKERS] Cannot initdb in cvs tip