Re: Remove secondary checkpoint

From: "Tsunakawa, Takayuki" <tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com>
To: 'Simon Riggs' <simon(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Remove secondary checkpoint
Date: 2017-10-25 02:24:11
Message-ID: 0A3221C70F24FB45833433255569204D1F80BA4C@G01JPEXMBYT05
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

From: pgsql-hackers-owner(at)postgresql(dot)org
> [mailto:pgsql-hackers-owner(at)postgresql(dot)org] On Behalf Of Simon Riggs
> This
> * reduces disk space requirements on master
> * removes a minor bug in fast failover
> * simplifies code

I welcome this patch. I was wondering why PostgreSQL retains the previous checkpoint. (Although unrelated to this, I've also been wondering why PostgreSQL flushes WAL to disk when writing a page in the shared buffer, because PostgreSQL doesn't use WAL for undo.)

Here are my review comments.

(1)
- * a) we keep WAL for two checkpoint cycles, back to the "prev" checkpoint.
+ * a) we keep WAL for only one checkpoint cycle (prior to PG11 we kept
+ * WAL for two checkpoint cycles to allow us to recover from the
+ * secondary checkpoint if the first checkpoint failed, though we
+ * only did this on the master anyway, not on standby. Keeping just
+ * one checkpoint simplifies processing and reduces disk space in
+ * many smaller databases.)

/*
- * The last valid checkpoint record required for a streaming
- * recovery exists in neither standby nor the primary.
+ * We used to attempt to go back to a secondary checkpoint
+ * record here, but only when not in standby_mode. We now
+ * just fail if we can't read the last checkpoint because
+ * this allows us to simplify processing around checkpoints.
*/

if (fast_promote)
{
- checkPointLoc = ControlFile->prevCheckPoint;
+ /*
+ * It appears to be a bug that we used to use prevCheckpoint here
+ */
+ checkPointLoc = ControlFile->checkPoint;

- XLByteToSeg(PriorRedoPtr, _logSegNo, wal_segment_size);
+ /* Trim from the last checkpoint, not the last - 1 */
+ XLByteToSeg(RedoRecPtr, _logSegNo, wal_segment_size);

I suggest to remove references to the secondary checkpoint (the old behavior) from the comments. I'm afraid those could confuse new developers.

(2)
@@ -8110,10 +8100,6 @@ ReadCheckpointRecord(XLogReaderState *xlogreader, XLogRecPtr RecPtr,
ereport(LOG,
(errmsg("invalid primary checkpoint link in control file")));
break;

@@ -8135,10 +8121,6 @@ ReadCheckpointRecord(XLogReaderState *xlogreader, XLogRecPtr RecPtr,
ereport(LOG,
(errmsg("invalid primary checkpoint record")));
break;

@@ -8154,10 +8136,6 @@ ReadCheckpointRecord(XLogReaderState *xlogreader, XLogRecPtr RecPtr,
ereport(LOG,
(errmsg("invalid resource manager ID in primary checkpoint record")));
break;

etc, etc.

"primary checkpoint" should be just "checkpoint", now that the primary/secondary concept will disappear.

(3)
Should we change the default value of max_wal_size from 1 GB to a smaller size? I vote for "no" for performance.

Regards
Takayuki Tsunakawa

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2017-10-25 04:36:13 Re: Remove secondary checkpoint
Previous Message Tsunakawa, Takayuki 2017-10-25 01:48:43 Re: Remove secondary checkpoint