Re: Streaming Replication: Checkpoint_segment and wal_keep_segments on standby

From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, "Sander, Ingo (NSN - DE/Munich)" <ingo(dot)sander(at)nsn(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Streaming Replication: Checkpoint_segment and wal_keep_segments on standby
Date: 2010-06-10 10:59:06
Message-ID: AANLkTil4pJO8LnC4YMCfMAs3zs0GYMQMORtlqckzHzdA@mail.gmail.com
Lists: pgsql-hackers

On Thu, Jun 10, 2010 at 7:19 PM, Heikki Linnakangas
<heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
>> --- 1902,1908 ----
>>          for standby purposes, and the number of old WAL segments
>> available
>>          for standbys is determined based only on the location of the
>> previous
>>          checkpoint and status of WAL archiving.
>> +         This parameter has no effect on a restartpoint.
>>          This parameter can only be set in the
>> <filename>postgresql.conf</>
>>          file or on the server command line.
>>         </para>
>
> Hmm, I wonder if wal_keep_segments should take effect during recovery too?
> We don't support cascading slaves, but if you have two slaves connected to
> one master (without an archive), and you perform failover to one of them,
> without wal_keep_segments the 2nd slave might not find all the files it
> needs in the new master. Then again, that won't work without an archive
> anyway, because we error out at a TLI mismatch in replication. Seems like
> this is 9.1 material..

Yep. Since SR currently cannot get across a timeline ID (TLI) gap, there is
no point in making wal_keep_segments take effect during recovery.

>> *** a/doc/src/sgml/wal.sgml
>> --- b/doc/src/sgml/wal.sgml
>> ***************
>> *** 424,429 ****
>> --- 424,430 ----
>>    <para>
>>     There will always be at least one WAL segment file, and will normally
>>     not be more than (2 + <varname>checkpoint_completion_target</varname>)
>> * <varname>checkpoint_segments</varname> + 1
>> +    or <varname>checkpoint_segments</> + <xref
>> linkend="guc-wal-keep-segments"> + 1
>>     files.  Each segment file is normally 16 MB (though this size can be
>>     altered when building the server).  You can use this to estimate space
>>     requirements for <acronym>WAL</acronym>.
>
> That's not true, wal_keep_segments is the minimum number of files retained,
> independently of checkpoint_segments. The correct formula is (2 +
> checkpoint_completion_target * checkpoint_segments, wal_keep_segments)

Do you mean that the maximum number of WAL files is the following?

max {
(2 + checkpoint_completion_target) * checkpoint_segments,
wal_keep_segments
}

Just after a checkpoint removes old WAL files, up to wal_keep_segments WAL
files might remain. A further checkpoint_segments WAL files might then be
generated before the subsequent checkpoint removes old WAL files. So I think
that the maximum number is

max {
(2 + checkpoint_completion_target) * checkpoint_segments,
wal_keep_segments + checkpoint_segments
}

Am I missing something?
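
To make the arithmetic concrete, here is a minimal sketch of the bound I am
proposing. The function name and the sample settings are only illustrative
assumptions, not anything in the server, and it follows the max {...}
expression above as written (without the "+ 1" from the existing doc text).

```python
def max_wal_files(checkpoint_segments, checkpoint_completion_target,
                  wal_keep_segments):
    """Estimate the upper bound on WAL segment files (hypothetical helper).

    Mirrors the proposed formula:
      max { (2 + checkpoint_completion_target) * checkpoint_segments,
            wal_keep_segments + checkpoint_segments }
    """
    # Bound that comes from checkpoint spacing alone.
    checkpoint_bound = (2 + checkpoint_completion_target) * checkpoint_segments
    # Segments retained for standbys, plus those written before the
    # next checkpoint gets a chance to remove old files.
    keep_bound = wal_keep_segments + checkpoint_segments
    return max(checkpoint_bound, keep_bound)


# Illustrative settings only: a small checkpoint_segments with a larger
# wal_keep_segments makes the second term dominate.
print(max_wal_files(3, 0.5, 32))
print(max_wal_files(3, 0.5, 0))
```

With 16 MB segments, multiplying the result by 16 gives a rough disk space
estimate in MB.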

>>    <para>
>> +    In archive recovery or standby mode, the server periodically performs
>> +    <firstterm>restartpoints</><indexterm><primary>restartpoint</></>
>> +    which are similar to checkpoints in normal operation: the server
>> forces
>> +    all its state to disk, updates the <filename>pg_control</> file to
>> +    indicate that the already-processed WAL data need not be scanned
>> again,
>> +    and then recycles old log segment files if they are in the
>> +    <filename>pg_xlog</> directory. Note that this recycling is not
>> affected
>> +    by <varname>wal_keep_segments</> at all. A restartpoint is triggered,
>> +    if at least one checkpoint record has been replayed since the last
>> +    restartpoint, every <varname>checkpoint_timeout</> seconds, or every
>> +    <varname>checkpoint_segments</> log segments only in standby mode,
>> +    whichever comes first....
>
> That last sentence is a bit unclear. How about:
>
> A restartpoint is triggered if at least one checkpoint record has been
> replayed and <varname>checkpoint_timeout</> seconds have passed since last
> restartpoint. In standby mode, a restartpoint is also triggered if
> <varname>checkpoint_segments</> log segments have been replayed since last
> restartpoint and at least one checkpoint record has been replayed since.

Thanks! Seems good.
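
Just to confirm that I read the reworded rule correctly, here is a rough
model of it as a predicate. This is only a sketch of the documentation
wording, not the actual server code, and all of the names are hypothetical.

```python
def restartpoint_due(checkpoint_record_replayed, seconds_since_last,
                     segments_since_last, checkpoint_timeout,
                     checkpoint_segments, standby_mode):
    """Return True if the reworded rule says a restartpoint is triggered.

    Models the documentation wording only; every parameter name here is
    an assumption made for illustration.
    """
    # No restartpoint is possible until at least one checkpoint record
    # has been replayed since the last restartpoint.
    if not checkpoint_record_replayed:
        return False
    # Time-based trigger applies in archive recovery and standby mode.
    if seconds_since_last >= checkpoint_timeout:
        return True
    # Segment-based trigger applies in standby mode only.
    return standby_mode and segments_since_last >= checkpoint_segments


# Standby mode: enough segments replayed, timeout not yet reached.
print(restartpoint_due(True, 60, 3, 300, 3, standby_mode=True))
# Archive recovery: the same state does not trigger a restartpoint.
print(restartpoint_due(True, 60, 3, 300, 3, standby_mode=False))
```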

>> ... In log shipping case, the checkpoint interval
>> +    on the standby is normally smaller than that on the master.
>> +   </para>
>
> What does that mean? Restartpoints can't be performed more frequently than
> checkpoints in the master because restartpoints can only be performed at
> checkpoint records.

Yes, that's what I meant.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
