From: | Michael Paquier <michael(dot)paquier(at)gmail(dot)com> |
---|---|
To: | Simon Riggs <simon(at)2ndquadrant(dot)com> |
Cc: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Remove secondary checkpoint |
Date: | 2017-10-24 22:17:10 |
Message-ID: | CAB7nPqQszNXin-9=97V3=uS2Gt_bx1AFM4r_uEhDLHobs35aoQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Mon, Oct 23, 2017 at 11:20 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> Remove the code that maintained two checkpoint's WAL files and all
> associated stuff.
>
> Try to avoid breaking anything else
>
> This
> * reduces disk space requirements on master
> * removes a minor bug in fast failover
> * simplifies code
In short, yeah! I had a close look at the patch and noticed a couple of issues.
+ else
/*
- * The last valid checkpoint record required for a streaming
- * recovery exists in neither standby nor the primary.
+ * We used to attempt to go back to a secondary checkpoint
+ * record here, but only when not in standby_mode. We now
+ * just fail if we can't read the last checkpoint because
+ * this allows us to simplify processing around checkpoints.
*/
ereport(PANIC,
(errmsg("could not locate a valid checkpoint record")));
- }
Using brackets in this case is more elegant style IMO. OK, there is
one line, but the comment is long so the block becomes
confusingly-shaped.
/* Version identifier for this pg_control format */
-#define PG_CONTROL_VERSION 1003
+#define PG_CONTROL_VERSION 1100
Yes, this had to happen anyway in this release cycle.
backup.sgml describes the following:
to higher segment numbers. It's assumed that segment files whose
contents precede the checkpoint-before-last are no longer of
interest and can be recycled.
However this is not true anymore with this patch.
The documentation of pg_control_checkpoint() is not updated. func.sgml
lists the tuple fields returned for this function.
You forgot to update pg_control_checkpoint in pg_proc.h. prior_lsn is
still listed in the list of arguments returned. And you need to update
as well the output list of types.
* whichChkpt identifies the checkpoint (merely for reporting purposes).
- * 1 for "primary", 2 for "secondary", 0 for "other" (backup_label)
+ * 1 for "primary", 0 for "other" (backup_label)
+ * 2 for "secondary" is no longer supported
*/
I would suggest to just remove the reference to "secondary".
- * Delete old log files (those no longer needed even for previous
- * checkpoint or the standbys in XLOG streaming).
+ * Delete old log files and recycle them
*/
Here that's more "Delete or recycle old log files". Recycling of a WAL
segment is its renaming into a newer segment.
You forgot to update this formula in xlog.c:
distance = (2.0 + CheckPointCompletionTarget) * CheckPointDistanceEstimate;
/* add 10% for good measure. */
distance *= 1.10;
You can guess easily where the mistake is.
- checkPointLoc = ControlFile->prevCheckPoint;
+ /*
+ * It appears to be a bug that we used to use
prevCheckpoint here
+ */
+ checkPointLoc = ControlFile->checkPoint;
Er, no. This is correct because we expect the prior checkpoint to
still be present in the event of a failure detecting the latest
checkpoint.
--
Michael
From | Date | Subject | |
---|---|---|---|
Next Message | Justin Pryzby | 2017-10-24 22:27:52 | Re: unique index violation after pg_upgrade to PG10 |
Previous Message | Peter Geoghegan | 2017-10-24 22:05:11 | Re: pgsql: Fix traversal of half-frozen update chains |