[BUG] Checkpointer on hot standby runs without looking checkpoint_segments

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: pgsql-hackers(at)postgresql(dot)org
Subject: [BUG] Checkpointer on hot standby runs without looking checkpoint_segments
Date: 2012-04-16 12:05:48
Message-ID: 20120416.210548.254416486.horiguchi.kyotaro@lab.ntt.co.jp
Lists: pgsql-hackers

Hello, this is a bug report and a patch for it.

The first patch in the attachments is for 9.2dev and the next one
is for 9.1.3.

On the current 9.2dev, IsCheckpointOnSchedule()@checkpointer.c
does not check against the WAL segments written when running on a
hot standby. This makes the checkpointer always run at the pace
set by checkpoint_timeout, regardless of how fast WAL advances.
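
To make the missing check concrete, here is a minimal sketch of a
pacing test that looks at both WAL volume and elapsed time. It is
not the code from the attached patches: the struct, the function
name checkpoint_on_schedule() and the flat 64-bit WAL position are
assumptions made for illustration, and the real function in
checkpointer.c differs in detail.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed segment size for this sketch (the 16 MB default). */
#define WAL_SEGMENT_SIZE (16u * 1024 * 1024)

/* Hypothetical container for the settings the pacing check needs. */
typedef struct CheckpointPace
{
    uint64_t start_lsn;           /* WAL position when the checkpoint began */
    double   checkpoint_segments; /* stands in for checkpoint_segments */
    double   checkpoint_timeout;  /* stands in for checkpoint_timeout, seconds */
    double   completion_target;   /* stands in for checkpoint_completion_target */
} CheckpointPace;

/*
 * Return true if the checkpoint has written enough to stay on schedule.
 * progress is the fraction of the checkpoint already done (0.0 to 1.0).
 */
static bool
checkpoint_on_schedule(const CheckpointPace *pace, double progress,
                       uint64_t current_lsn, double elapsed_secs)
{
    double elapsed_xlogs;
    double elapsed_time;

    /* Aim to finish within the completion-target share of the interval. */
    progress *= pace->completion_target;

    /* Fraction of the checkpoint_segments budget consumed so far. */
    elapsed_xlogs = ((double) (current_lsn - pace->start_lsn) /
                     WAL_SEGMENT_SIZE) / pace->checkpoint_segments;
    if (progress < elapsed_xlogs)
        return false;           /* behind schedule by WAL volume */

    /* Fraction of the checkpoint_timeout budget consumed so far. */
    elapsed_time = elapsed_secs / pace->checkpoint_timeout;
    if (progress < elapsed_time)
        return false;           /* behind schedule by time */

    return true;
}
```

In this sketch the bug corresponds to skipping the elapsed_xlogs
test on a standby, so only the time-based test can ever report
"behind schedule". On a standby, current_lsn would have to come
from the received or replayed WAL position rather than from the
insert position.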

This leads to an unexpected imbalance in the number of WAL segment
files between the master and the standby(s) when WAL advances
quickly. And what is worse, the master has a much higher chance of
removing some WAL segments before the standby receives them.

XLogPageRead()@xlog.c triggers a checkpoint based on WAL segment
advance, so I think this is a bug in bgwriter in 9.1. The
attached patches fix that on 9.2dev and 9.1.3 respectively.

In the version backported to 9.1.3, bgwriter.c is modified
instead of checkpointer.c in 9.2, and GetWalRcvWriteRecPtr() is
used as the equivalent of GetStandbyFlushRecPtr() in 9.2.

By the way, GetStandbyFlushRecPtr() acquires a spinlock
internally. It might be enough to read
XLogCtl->recoveryLastRecPtr without the lock to make a rough
estimate, but I can't tell whether that is safe or not. The same
question applies to GetWalRcvWriteRecPtr() on 9.1.3.

However, it seems to work fine in a simple test.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

== My e-mail address has been changed since Apr. 1, 2012.

Attachment                                              Content-Type  Size
standby_checkpoint_segments_9.2dev_fix_20120416.patch   text/x-patch  2.4 KB
standby_checkpoint_segments_9.1.3_fix_20120416.patch    text/x-patch  2.7 KB
