Re: warm standby server stops doing checkpoints after a while

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Frank Wittig <fw(at)weisshuhn(dot)de>
Cc: Postgres General <pgsql-general(at)postgresql(dot)org>
Subject: Re: warm standby server stops doing checkpoints after a while
Date: 2007-05-31 14:23:40
Message-ID: 29215.1180621420@sss.pgh.pa.us
Lists: pgsql-general

Frank Wittig <fw(at)weisshuhn(dot)de> writes:
> The problem is that the slave server stops checkpointing after some
> hours of working (about 24 to 48 hours of continued log replay).

Hm ... look at RecoveryRestartPoint() in xlog.c. Could there be
something wrong with this logic?

    /*
     * Do nothing if the elapsed time since the last restartpoint is less than
     * half of checkpoint_timeout.  (We use a value less than
     * checkpoint_timeout so that variations in the timing of checkpoints on
     * the master, or speed of transmission of WAL segments to a slave, won't
     * make the slave skip a restartpoint once it's synced with the master.)
     * Checking true elapsed time keeps us from doing restartpoints too often
     * while rapidly scanning large amounts of WAL.
     */
    elapsed_secs = time(NULL) - ControlFile->time;
    if (elapsed_secs < CheckPointTimeout / 2)
        return;

The idea is that the slave (once in sync with the master) ought to
checkpoint every time it sees a checkpoint record in the master's
output. I'm not seeing a flaw but maybe there is one here, or somewhere
nearby. Are you sure the master is checkpointing?
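
To make the test above easier to reason about, here is a minimal standalone sketch of the same comparison. It is not the xlog.c code: the helper name restartpoint_too_soon and its parameters are made up for illustration, and it takes both timestamps as arguments instead of reading ControlFile->time and calling time(NULL).

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/*
 * Hypothetical helper, not part of xlog.c: returns true when a restartpoint
 * would be skipped because less than half of the checkpoint timeout has
 * elapsed since the last recorded one.
 */
static bool
restartpoint_too_soon(time_t now, time_t last_restartpoint,
                      int checkpoint_timeout_secs)
{
    long elapsed_secs;

    /* Assumption of this sketch: the timeout is a positive number of seconds. */
    assert(checkpoint_timeout_secs > 0);

    /* Same arithmetic as the quoted snippet: wall-clock seconds elapsed. */
    elapsed_secs = (long) (now - last_restartpoint);

    /* Integer division, as in the original comparison. */
    return elapsed_secs < checkpoint_timeout_secs / 2;
}
```

With a timeout of 300 seconds, a checkpoint record replayed 100 seconds after the last restartpoint would be skipped, while one replayed 150 seconds or more afterwards would trigger a new restartpoint.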

regards, tom lane
