Re: how is pitr replay interruption time determined?

From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>, pgsql-admin(at)postgresql(dot)org
Subject: Re: how is pitr replay interruption time determined?
Date: 2007-08-28 22:44:29
Message-ID: 1188341069.4218.93.camel@ebony.site
Lists: pgsql-admin
On Tue, 2007-08-28 at 17:59 -0400, Tom Lane wrote:
> Robert Treat <xzilla(at)users(dot)sourceforge(dot)net> writes:
> > Is there some way to force checkpoints on a db doing wal replay? 
> 
> No, it's hardwired to do it when it sees a checkpoint record in the WAL stream.
> 
> > pg_control last modified:             Mon Aug 27 12:12:55 2007
> > Time of latest checkpoint:            Mon Jul 30 19:17:37 2007
> 
> After looking again at the code, the "last modified" time is the time
> that a recovery checkpoint was last done, and the "latest checkpoint"
> is the timestamp of the WAL-stream checkpoint record that triggered it.
> In a situation where you're catching up on historical WAL they could be
> far apart, but when a slave is just following the master there shouldn't
> be a huge difference --- not more than the maximum time to fill a WAL
> record and ship it over to the slave, for sure.
> 
> (BTW, I misread it before --- it looks like the "at log time" value
> printed at startup *is* taken from the checkpoint record that it's
> trying to roll forward from.)

That's correct. Sorry for not replying earlier; just back from hols.

Jumping back to the original thought: Robert, you should be using the last
checkpoint location, not the last checkpoint time, to decide which xlogs
to remove.
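
As a rough illustration of deciding by location rather than by time, the
sketch below maps a location such as "2/A1000020" to the name of the WAL
segment file that contains it, then treats any segment that sorts before
that name as removable. The helper names are hypothetical, the 16 MB
segment size is the build-time default and is assumed here, and using the
"Latest checkpoint's REDO location" value printed by pg_controldata as the
input is an assumption about which location is the safe one to keep from.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumption: default 16 MB WAL segment size. */
#define WAL_SEG_SIZE (16u * 1024u * 1024u)

/*
 * Hypothetical helper: convert a timeline ID and a location string of the
 * form "xlogid/xrecoff" (both hex) into the 24-character name of the WAL
 * segment file holding that location.
 * Returns 0 on success, -1 if the location cannot be parsed.
 */
int wal_segment_name(unsigned int tli, const char *location, char out[25])
{
    unsigned int xlogid, xrecoff;
    char trailing;

    assert(location != NULL && out != NULL);

    /* Exactly two hex fields and nothing after them. */
    if (sscanf(location, "%X/%X%c", &xlogid, &xrecoff, &trailing) != 2)
        return -1;

    snprintf(out, 25, "%08X%08X%08X", tli, xlogid, xrecoff / WAL_SEG_SIZE);
    return 0;
}

/*
 * Hypothetical helper: segment names are fixed-width uppercase hex, so
 * within one timeline plain string comparison orders them. A segment is
 * removable only if it sorts strictly before the one to keep from.
 */
int wal_segment_is_removable(const char *candidate, const char *keep_from)
{
    return strcmp(candidate, keep_from) < 0;
}
```

The segment that contains the location itself is never reported as
removable, since replay would need to restart inside it. Comparing names
across different timelines is not handled by this sketch.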

> Assuming that you're absorbing data from the master at a steady rate,
> the only reason I can see for the timestamps to be so old is if the
> "rm_safe_restartpoint" checks are always failing.  I seem to remember
> that we found and fixed a bug that could cause something like that,
> but I can't find any trace of it in the CVS logs.  Simon, do you
> recall such a problem post-8.2.0?

Yes, we traced a problem with GIN indexes to this cause in early June;
Teodor fixed it quickly in REL8_2_STABLE, but the fix won't be available
in a release until 8.2.5.

I'd be happier with a log message along the lines of

  ereport(DEBUG2,
          (errmsg("RM %d not safe to record restart point at %X/%X",
                  rmid,
                  checkPoint->redo.xlogid,
                  checkPoint->redo.xrecoff)));

to help trace such things in future.

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com

