Re: PITR Phase 2 - Design Planning

From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: PITR Phase 2 - Design Planning
Date: 2004-04-26 22:16:01
Message-ID: 1083017760.3018.240.camel@stromboli
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, 2004-04-26 at 22:05, Bruce Momjian wrote:
> Simon Riggs wrote:
> > Transaction log files currently have timestamps, so that is
> > straightforward, but probably not the best we can do. We would
> > rollforward until the xlog file time > desired point in time.
> >
> > To make (2) work we would have to have a timestamp associated with each
> > transaction. This could be in one of two places:
> > 1. the transaction record in the clog
> > 2. the log record in the xlog
> > We would then recover the xlog record by record, until we found a record
> > that had a timestamp > desired point-in-time.
> >
> > Currently, neither of these places have a timestamp. Hmmmm. We can't use
> > pg_control because we are assuming that it needs recovery...
> >
> > I can't see any general way of adding a timestamp in any less than 2
> > bytes. We don't need a timezone. The timestamp could refer to a number
> > of seconds since last checkpoint; since this is limited already by a GUC
> > to force checkpoints every so often. Although code avoids a checkpoint
> > if no updates have taken place, we wouldn't be too remiss to use a
> > forced checkpoint every 32,000 seconds (9 hours).
> > Assuming that accuracy of the point-in-time was of the order of
> > seconds?? If we went to 0.1 second accuracy, we could checkpoint (force)
> > every 40 minutes or so. All of that seems too restrictive.
> > If we went to milliseconds, then we could use a 4 byte value and use a
> > checkpoint (force) every 284 hours or 1.5 weeks.
> > Thoughts?
>
> I was thinking --- how would someone know the time to use for restore?
> Certainly they will not know subsecond accuracy? Probably second-level
> accuracty is enough, _except_ when they want everything restored up to a
> DROP TABLE transaction or some major problem. Is there a way to give
> users a list of transactions on a log backup? Can we show them the
> username, database, or commands or something? Would they be able to
> restore up to a specific transaction in that case?
>
> Basically, we could give them sub-second recovery, but what value would
> it be?

Yes, you remind me of a whole train of thought...

There should be a switch to allow you to specify the txnid you wish to
recover up until as well.

You raise the point of how you know what time to recover to. That is in
fact the very hardest part of recovery for a DBA. That's a good reason
for being able to list xlog contents, as you can with Oracle. Sounds
like we need an XlogMiner utility...

> Can we show them the username, database, or commands or something?

Yes, that sounds fairly straightforward possible using a modification of
the ReadRecord functions at the bottom of xlog.c - which is why security
of the xlogs is important.

It's also a good reason for being able to pause and restart recovery, so
you can see what it's like before continuing further.

Usually you are trying to sync up the contents of the database with all
of the other things that were being updated too. Often these will define
either the required transaction contents, or give a time to use.

Whatever level of time accuracy you choose, we would always need to
handle the case where multiple transactions have been committed with
exactly the same time (after rounding) and yet we may wish to split
them. Rolling forward to a txnid would help there.

Best regards, Simon

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Simon Riggs 2004-04-26 22:20:01 Re: PITR Phase 2 - Design Planning
Previous Message Alvaro Herrera 2004-04-26 22:01:28 Re: PITR Phase 2 - Design Planning