*** a/doc/src/sgml/config.sgml
--- b/doc/src/sgml/config.sgml
***************
*** 1036,1042 **** include 'filename'
usually require a corresponding increase in
checkpoint_segments, in order to spread out the
process of writing large quantities of new or changed data over a
! longer period of time.
--- 1036,1042 ----
usually require a corresponding increase in
checkpoint_segments, in order to spread out the
process of writing large quantities of new or changed data over a
! longer period of time. FIXME: What should we suggest here now?
***************
*** 1958,1974 **** include 'filename'
Checkpoints
!
! checkpoint_segments (integer)
! checkpoint_segments> configuration parameter
! Maximum number of log file segments between automatic WAL
! checkpoints (each segment is normally 16 megabytes). The default
! is three segments. Increasing this parameter can increase the
! amount of time needed for crash recovery.
This parameter can only be set in the postgresql.conf>
file or on the server command line.
--- 1958,1977 ----
Checkpoints
!
! checkpoint_wal_size (integer)
! checkpoint_wal_size> configuration parameter
! Maximum size to let the WAL grow to between automatic WAL
! checkpoints. This is a soft limit; WAL size can exceed
! checkpoint_wal_size> under special circumstances, like
! under heavy load, a failing archive_command>, or a high
! wal_keep_segments> setting. The default is 256 MB.
! Increasing this parameter can increase the amount of time needed for
! crash recovery.
This parameter can only be set in the postgresql.conf>
file or on the server command line.
***************
*** 2028,2033 **** include 'filename'
--- 2031,2054 ----
+
+ min_recycle_wal_size (integer)
+
+ min_recycle_wal_size> configuration parameter
+
+
+
+ As long as WAL disk usage stays below this setting, old WAL files are
+ always recycled for future use at a checkpoint, rather than removed.
+ This can be used to ensure that enough WAL space is reserved to
+ handle spikes in WAL usage, for example when running large batch
+ jobs. The default is 80 MB.
+ This parameter can only be set in the postgresql.conf>
+ file or on the server command line.
+
+
+
+
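For reference, the interaction between min_recycle_wal_size and checkpoint_wal_size described above can be sketched in standalone C. This is only an illustration of the clamping that XLOGfileslop() performs later in this patch, not the server code: the function name sketch_recycle_limit, the SKETCH_SEG_SIZE macro, and the fixed 16 MB segment size are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 16 MB WAL segments, the usual build-time default. */
#define SKETCH_SEG_SIZE ((uint64_t) 16 * 1024 * 1024)

/*
 * Hypothetical stand-in for the patch's XLOGfileslop(): returns the highest
 * segment number worth keeping preallocated, given the prior checkpoint's
 * redo pointer (in bytes), the two size settings (in kB), and the current
 * estimate of WAL bytes consumed per checkpoint cycle.
 */
static uint64_t
sketch_recycle_limit(uint64_t prior_redo_ptr,
                     int min_recycle_kb, int checkpoint_wal_kb,
                     double completion_target, double distance_estimate)
{
    uint64_t seg_kb = SKETCH_SEG_SIZE / 1024;
    uint64_t base = prior_redo_ptr / SKETCH_SEG_SIZE;
    uint64_t min_seg;
    uint64_t max_seg;
    uint64_t recycle_seg;
    double   distance;
    double   end_seg;

    assert(min_recycle_kb >= 0 && checkpoint_wal_kb >= 0);
    assert(distance_estimate >= 0.0);

    /* Floor and ceiling implied by the two settings. */
    min_seg = base + (uint64_t) min_recycle_kb / seg_kb;
    max_seg = base + (uint64_t) checkpoint_wal_kb / seg_kb;

    /* Estimated WAL needed until the end of the next checkpoint, plus 10%. */
    distance = (2.0 + completion_target) * distance_estimate * 1.10;

    /* Round up to a whole segment without needing libm's ceil(). */
    end_seg = ((double) prior_redo_ptr + distance) / (double) SKETCH_SEG_SIZE;
    recycle_seg = (uint64_t) end_seg;
    if ((double) recycle_seg < end_seg)
        recycle_seg++;

    /* Always recycle up to the minimum, never beyond the maximum. */
    if (recycle_seg < min_seg)
        recycle_seg = min_seg;
    if (recycle_seg > max_seg)
        recycle_seg = max_seg;
    return recycle_seg;
}
```

With the defaults in this patch (80 MB and 256 MB), an idle system would keep five segments for reuse, while a very busy one would be capped at sixteen.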
*** a/doc/src/sgml/perform.sgml
--- b/doc/src/sgml/perform.sgml
***************
*** 1302,1320 **** SELECT * FROM x, y, a, b, c WHERE something AND somethingelse;
!
! Increase checkpoint_segments
! Temporarily increasing the configuration variable can also
make large data loads faster. This is because loading a large
amount of data into PostgreSQL will
cause checkpoints to occur more often than the normal checkpoint
frequency (specified by the checkpoint_timeout
configuration variable). Whenever a checkpoint occurs, all dirty
pages must be flushed to disk. By increasing
! checkpoint_segments temporarily during bulk
data loads, the number of checkpoints that are required can be
reduced.
--- 1302,1320 ----
!
! Increase checkpoint_wal_size
! Increasing the configuration variable can also
make large data loads faster. This is because loading a large
amount of data into PostgreSQL will
cause checkpoints to occur more often than the normal checkpoint
frequency (specified by the checkpoint_timeout
configuration variable). Whenever a checkpoint occurs, all dirty
pages must be flushed to disk. By increasing
! checkpoint_wal_size temporarily during bulk
data loads, the number of checkpoints that are required can be
reduced.
***************
*** 1419,1425 **** SELECT * FROM x, y, a, b, c WHERE something AND somethingelse;
Set appropriate (i.e., larger than normal) values for
maintenance_work_mem and
! checkpoint_segments.
--- 1419,1425 ----
Set appropriate (i.e., larger than normal) values for
maintenance_work_mem and
! checkpoint_wal_size.
***************
*** 1486,1492 **** SELECT * FROM x, y, a, b, c WHERE something AND somethingelse;
So when loading a data-only dump, it is up to you to drop and recreate
indexes and foreign keys if you wish to use those techniques.
! It's still useful to increase checkpoint_segments
while loading the data, but don't bother increasing
maintenance_work_mem; rather, you'd do that while
manually recreating indexes and foreign keys afterwards.
--- 1486,1492 ----
So when loading a data-only dump, it is up to you to drop and recreate
indexes and foreign keys if you wish to use those techniques.
! It's still useful to increase checkpoint_wal_size
while loading the data, but don't bother increasing
maintenance_work_mem; rather, you'd do that while
manually recreating indexes and foreign keys afterwards.
***************
*** 1542,1548 **** SELECT * FROM x, y, a, b, c WHERE something AND somethingelse;
! Increase and ; this reduces the frequency
of checkpoints, but increases the storage requirements of
/pg_xlog>.
--- 1542,1548 ----
! Increase and ; this reduces the frequency
of checkpoints, but increases the storage requirements of
/pg_xlog>.
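To make the effect of raising checkpoint_wal_size concrete, the following standalone sketch mirrors the arithmetic of CalculateCheckpointSegments() from the xlog.c part of this patch. The helper name sketch_checkpoint_segments and the fixed 16 MB segment size are assumptions for the example; the real code reads XLOG_SEG_SIZE and the GUC variables directly.

```c
#include <assert.h>

/* Assumption: 16 MB WAL segments, expressed in kB. */
#define SKETCH_SEG_SIZE_KB (16 * 1024)

/*
 * Hypothetical helper: convert a WAL size budget (in kB) into the number of
 * segments that may be written before an xlog-based checkpoint is triggered.
 * WAL is kept for two checkpoint cycles, plus the portion consumed while the
 * next checkpoint is in progress, hence the (2 + completion_target) divisor.
 */
static int
sketch_checkpoint_segments(int wal_size_kb, double completion_target)
{
    double target;

    assert(wal_size_kb > 0);
    assert(completion_target >= 0.0 && completion_target <= 1.0);

    target = (double) wal_size_kb / (double) SKETCH_SEG_SIZE_KB;
    target = target / (2.0 + completion_target);

    /* Round down, but never go below one segment. */
    return (target < 1.0) ? 1 : (int) target;
}
```

Under these assumptions the 256 MB default yields a checkpoint roughly every 6 segments, and raising the setting to 1 GB for a bulk load raises that to 25.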
*** a/doc/src/sgml/wal.sgml
--- b/doc/src/sgml/wal.sgml
***************
*** 471,479 ****
The server's checkpointer process automatically performs
a checkpoint every so often. A checkpoint is begun every log segments, or every seconds, whichever comes first.
! The default settings are 3 segments and 300 seconds (5 minutes), respectively.
If no WAL has been written since the previous checkpoint, new checkpoints
will be skipped even if checkpoint_timeout> has passed.
(If WAL archiving is being used and you want to put a lower limit on how
--- 471,480 ----
The server's checkpointer process automatically performs
a checkpoint every so often. A checkpoint is begun every seconds, or if
! is about to be exceeded, whichever
! comes first.
! The default settings are 5 minutes and 256 MB, respectively.
If no WAL has been written since the previous checkpoint, new checkpoints
will be skipped even if checkpoint_timeout> has passed.
(If WAL archiving is being used and you want to put a lower limit on how
***************
*** 485,492 ****
! Reducing checkpoint_segments and/or
! checkpoint_timeout causes checkpoints to occur
more often. This allows faster after-crash recovery, since less work
will need to be redone. However, one must balance this against the
increased cost of flushing dirty data pages more often. If
--- 486,493 ----
! Reducing checkpoint_timeout and/or
! checkpoint_wal_size causes checkpoints to occur
more often. This allows faster after-crash recovery, since less work
will need to be redone. However, one must balance this against the
increased cost of flushing dirty data pages more often. If
***************
*** 509,519 ****
parameter. If checkpoints happen closer together than
checkpoint_warning> seconds,
a message will be output to the server log recommending increasing
! checkpoint_segments. Occasional appearance of such
a message is not cause for alarm, but if it appears often then the
checkpoint control parameters should be increased. Bulk operations such
as large COPY> transfers might cause a number of such warnings
! to appear if you have not set checkpoint_segments> high
enough.
--- 510,520 ----
parameter. If checkpoints happen closer together than
checkpoint_warning> seconds,
a message will be output to the server log recommending increasing
! checkpoint_wal_size. Occasional appearance of such
a message is not cause for alarm, but if it appears often then the
checkpoint control parameters should be increased. Bulk operations such
as large COPY> transfers might cause a number of such warnings
! to appear if you have not set checkpoint_wal_size> high
enough.
***************
*** 524,533 ****
, which is
given as a fraction of the checkpoint interval.
The I/O rate is adjusted so that the checkpoint finishes when the
! given fraction of checkpoint_segments WAL segments
! have been consumed since checkpoint start, or the given fraction of
! checkpoint_timeout seconds have elapsed,
! whichever is sooner. With the default value of 0.5,
PostgreSQL> can be expected to complete each checkpoint
in about half the time before the next checkpoint starts. On a system
that's very close to maximum I/O throughput during normal operation,
--- 525,534 ----
, which is
given as a fraction of the checkpoint interval.
The I/O rate is adjusted so that the checkpoint finishes when the
! given fraction of
! checkpoint_timeout seconds have elapsed, or before
! checkpoint_wal_size is exceeded, whichever is sooner.
! With the default value of 0.5,
PostgreSQL> can be expected to complete each checkpoint
in about half the time before the next checkpoint starts. On a system
that's very close to maximum I/O throughput during normal operation,
***************
*** 544,561 ****
! There will always be at least one WAL segment file, and will normally
! not be more than (2 + checkpoint_completion_target) * checkpoint_segments + 1
! or checkpoint_segments> + + 1
! files. Each segment file is normally 16 MB (though this size can be
! altered when building the server). You can use this to estimate space
! requirements for WAL.
! Ordinarily, when old log segment files are no longer needed, they
! are recycled (that is, renamed to become future segments in the numbered
! sequence). If, due to a short-term peak of log output rate, there
! are more than 3 * checkpoint_segments + 1
! segment files, the unneeded segment files will be deleted instead
! of recycled until the system gets back under this limit.
--- 545,577 ----
! The number of WAL segment files in the pg_xlog> directory depends on
! checkpoint_wal_size>, min_recycle_wal_size> and the
! amount of WAL generated in previous checkpoint cycles. When old log
! segment files are no longer needed, they are removed or recycled (that is,
! renamed to become future segments in the numbered sequence). If, due to a
! short-term peak of log output rate, checkpoint_wal_size> is
! exceeded, the unneeded segment files will be removed until the system
! gets back under this limit. Below that limit, the system recycles enough
! WAL files to cover the estimated need until the next checkpoint, and
! removes the rest. The estimate is based on a moving average of the number
! of WAL files used in previous checkpoint cycles. The moving average
! is increased immediately if the actual usage exceeds the estimate, so it
! accommodates peak usage rather than average usage to some extent.
! min_recycle_wal_size> puts a minimum on the number of WAL files
! recycled for future usage; that much WAL is always recycled for future use,
! even if the system is idle and the WAL usage estimate suggests that little
! WAL is needed.
!
!
!
! Independently of checkpoint_wal_size,
! + 1 most recent WAL files are
! kept at all times. Also, if WAL archiving is used, old segments cannot be
! removed or recycled until they are archived. If WAL archiving cannot keep up
! with the pace at which WAL is generated, or if archive_command
! fails repeatedly, old WAL files will accumulate in pg_xlog>
! until the situation is resolved.
***************
*** 570,578 ****
master because restartpoints can only be performed at checkpoint records.
A restartpoint is triggered when a checkpoint record is reached if at
least checkpoint_timeout> seconds have passed since the last
! restartpoint. In standby mode, a restartpoint is also triggered if at
! least checkpoint_segments> log segments have been replayed
! since the last restartpoint.
--- 586,593 ----
master because restartpoints can only be performed at checkpoint records.
A restartpoint is triggered when a checkpoint record is reached if at
least checkpoint_timeout> seconds have passed since the last
! restartpoint, or if WAL size is about to exceed
! checkpoint_wal_size>.
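The "moving average that is increased immediately" mentioned in the wal.sgml text can be illustrated with a small standalone function. It follows the update rule of UpdateCheckPointDistanceEstimate() in the xlog.c part of this patch, but the name sketch_update_estimate and the pure-function shape (the patch updates static variables instead) are assumptions for the example.

```c
#include <assert.h>

/*
 * Hypothetical pure-function version of the estimate update: given the
 * current estimate and the WAL bytes consumed in the cycle that just ended,
 * return the new estimate.
 */
static double
sketch_update_estimate(double estimate, double nbytes)
{
    assert(estimate >= 0.0 && nbytes >= 0.0);

    /* A busier-than-expected cycle raises the estimate at once. */
    if (estimate < nbytes)
        return nbytes;

    /* A quieter cycle only pulls the estimate down by 10% of the gap. */
    return 0.90 * estimate + 0.10 * nbytes;
}
```

The asymmetry is what lets the estimate track peak rather than average usage: one burst lifts it immediately, and it takes many quiet cycles to decay back.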
*** a/src/backend/access/transam/xlog.c
--- b/src/backend/access/transam/xlog.c
***************
*** 71,77 **** extern uint32 bootstrap_data_checksum_version;
/* User-settable parameters */
! int CheckPointSegments = 3;
int wal_keep_segments = 0;
int XLOGbuffers = -1;
int XLogArchiveTimeout = 0;
--- 71,78 ----
/* User-settable parameters */
! int checkpoint_wal_size = 262144; /* 256 MB */
! int min_recycle_wal_size = 81920; /* 80 MB */
int wal_keep_segments = 0;
int XLOGbuffers = -1;
int XLogArchiveTimeout = 0;
***************
*** 86,108 **** int CommitDelay = 0; /* precommit delay in microseconds */
int CommitSiblings = 5; /* # concurrent xacts needed to sleep */
int num_xloginsert_slots = 8;
#ifdef WAL_DEBUG
bool XLOG_DEBUG = false;
#endif
! /*
! * XLOGfileslop is the maximum number of preallocated future XLOG segments.
! * When we are done with an old XLOG segment file, we will recycle it as a
! * future XLOG segment as long as there aren't already XLOGfileslop future
! * segments; else we'll delete it. This could be made a separate GUC
! * variable, but at present I think it's sufficient to hardwire it as
! * 2*CheckPointSegments+1. Under normal conditions, a checkpoint will free
! * no more than 2*CheckPointSegments log segments, and we want to recycle all
! * of them; the +1 allows boundary cases to happen without wasting a
! * delete/create-segment cycle.
! */
! #define XLOGfileslop (2*CheckPointSegments + 1)
!
/*
* GUC support
--- 87,105 ----
int CommitSiblings = 5; /* # concurrent xacts needed to sleep */
int num_xloginsert_slots = 8;
+ /*
+ * Max distance from last checkpoint, before triggering a new xlog-based
+ * checkpoint.
+ */
+ int CheckPointSegments;
+
#ifdef WAL_DEBUG
bool XLOG_DEBUG = false;
#endif
! /* Estimated distance between checkpoints, in bytes */
! static double CheckPointDistanceEstimate = 0;
! static double PrevCheckPointDistance = 0;
/*
* GUC support
***************
*** 740,746 **** static void AdvanceXLInsertBuffer(XLogRecPtr upto, bool opportunistic);
static bool XLogCheckpointNeeded(XLogSegNo new_segno);
static void XLogWrite(XLogwrtRqst WriteRqst, bool flexible);
static bool InstallXLogFileSegment(XLogSegNo *segno, char *tmppath,
! bool find_free, int *max_advance,
bool use_lock);
static int XLogFileRead(XLogSegNo segno, int emode, TimeLineID tli,
int source, bool notexistOk);
--- 737,743 ----
static bool XLogCheckpointNeeded(XLogSegNo new_segno);
static void XLogWrite(XLogwrtRqst WriteRqst, bool flexible);
static bool InstallXLogFileSegment(XLogSegNo *segno, char *tmppath,
! bool find_free, XLogSegNo max_segno,
bool use_lock);
static int XLogFileRead(XLogSegNo segno, int emode, TimeLineID tli,
int source, bool notexistOk);
***************
*** 753,759 **** static bool WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
static int emode_for_corrupt_record(int emode, XLogRecPtr RecPtr);
static void XLogFileClose(void);
static void PreallocXlogFiles(XLogRecPtr endptr);
! static void RemoveOldXlogFiles(XLogSegNo segno, XLogRecPtr endptr);
static void UpdateLastRemovedPtr(char *filename);
static void ValidateXLOGDirectoryStructure(void);
static void CleanupBackupHistory(void);
--- 750,756 ----
static int emode_for_corrupt_record(int emode, XLogRecPtr RecPtr);
static void XLogFileClose(void);
static void PreallocXlogFiles(XLogRecPtr endptr);
! static void RemoveOldXlogFiles(XLogSegNo segno, XLogRecPtr PriorRedoPtr, XLogRecPtr endptr);
static void UpdateLastRemovedPtr(char *filename);
static void ValidateXLOGDirectoryStructure(void);
static void CleanupBackupHistory(void);
***************
*** 2548,2553 **** AdvanceXLInsertBuffer(XLogRecPtr upto, bool opportunistic)
--- 2545,2653 ----
}
/*
+ * Calculate CheckPointSegments based on checkpoint_wal_size and
+ * checkpoint_completion_target.
+ */
+ static void
+ CalculateCheckpointSegments(void)
+ {
+ double target;
+
+ /*-------
+ * Calculate the distance at which to trigger a checkpoint, to avoid
+ * exceeding checkpoint_wal_size. This is based on two assumptions:
+ *
+ * a) we keep WAL for two checkpoint cycles, back to the "prev" checkpoint.
+ * b) during checkpoint, we consume checkpoint_completion_target *
+ * number of segments consumed between checkpoints.
+ *-------
+ */
+ target = (double) checkpoint_wal_size / (double) (XLOG_SEG_SIZE / 1024);
+ target = target / (2.0 + CheckPointCompletionTarget);
+
+ /* round down */
+ CheckPointSegments = (int) target;
+
+ if (CheckPointSegments < 1)
+ CheckPointSegments = 1;
+ }
+
+ void
+ assign_checkpoint_wal_size(int newval, void *extra)
+ {
+ checkpoint_wal_size = newval;
+ CalculateCheckpointSegments();
+ }
+
+ void
+ assign_checkpoint_completion_target(double newval, void *extra)
+ {
+ CheckPointCompletionTarget = newval;
+ CalculateCheckpointSegments();
+ }
+
+ /*
+ * At a checkpoint, how many WAL segments to recycle as preallocated future
+ * XLOG segments? Returns the highest segment that should be preallocated.
+ */
+ static XLogSegNo
+ XLOGfileslop(XLogRecPtr PriorRedoPtr)
+ {
+ double nsegments;
+ XLogSegNo minSegNo;
+ XLogSegNo maxSegNo;
+ double distance;
+ XLogSegNo recycleSegNo;
+
+ /*
+ * Calculate the segment numbers that min_recycle_wal_size and
+ * checkpoint_wal_size correspond to. Always recycle enough segments
+ * to meet the minimum, and remove enough segments to stay below the
+ * maximum.
+ */
+ nsegments = (double) min_recycle_wal_size / (double) (XLOG_SEG_SIZE / 1024);
+ minSegNo = PriorRedoPtr / XLOG_SEG_SIZE + (int) nsegments;
+ nsegments = (double) checkpoint_wal_size / (double) (XLOG_SEG_SIZE / 1024);
+ maxSegNo = PriorRedoPtr / XLOG_SEG_SIZE + (int) nsegments;
+
+ /*
+ * Between those limits, recycle enough segments to get us through to the
+ * estimated end of next checkpoint.
+ *
+ * To estimate where the next checkpoint will finish, assume that the
+ * system runs steadily consuming CheckPointDistanceEstimate
+ * bytes between every checkpoint.
+ *
+ * The reason this calculation is done from the prior checkpoint, not the
+ * one that just finished, is that this behaves better if some checkpoint
+ * cycles are abnormally short, like if you perform a manual checkpoint
+ * right after a timed one. The manual checkpoint will make almost a full
+ * cycle's worth of WAL segments available for recycling, because the
+ * segments from the prior's prior, fully-sized checkpoint cycle are no
+ * longer needed. However, the next checkpoint will make only a few segments
+ * available for recycling, the ones generated between the timed
+ * checkpoint and the manual one right after that. If at the manual
+ * checkpoint we only retained enough segments to get us to the next timed
+ * one, and removed the rest, then at the next checkpoint we would not
+ * have enough segments around for recycling, to get us to the checkpoint
+ * after that. Basing the calculations on the distance from the prior redo
+ * pointer largely fixes that problem.
+ */
+ distance = (2.0 + CheckPointCompletionTarget) * CheckPointDistanceEstimate;
+ /* add 10% for good measure. */
+ distance *= 1.10;
+
+ recycleSegNo = (XLogSegNo) ceil(((double) PriorRedoPtr + distance) / XLOG_SEG_SIZE);
+
+ if (recycleSegNo < minSegNo)
+ recycleSegNo = minSegNo;
+ if (recycleSegNo > maxSegNo)
+ recycleSegNo = maxSegNo;
+
+ return recycleSegNo;
+ }
+
+ /*
* Check whether we've consumed enough xlog space that a checkpoint is needed.
*
* new_segno indicates a log file that has just been filled up (or read
***************
*** 3345,3351 **** XLogFileInit(XLogSegNo logsegno, bool *use_existent, bool use_lock)
char path[MAXPGPATH];
char tmppath[MAXPGPATH];
XLogSegNo installed_segno;
! int max_advance;
int fd;
bool zero_fill = true;
--- 3445,3451 ----
char path[MAXPGPATH];
char tmppath[MAXPGPATH];
XLogSegNo installed_segno;
! XLogSegNo max_segno;
int fd;
bool zero_fill = true;
***************
*** 3472,3480 **** XLogFileInit(XLogSegNo logsegno, bool *use_existent, bool use_lock)
* pre-create a future log segment.
*/
installed_segno = logsegno;
! max_advance = XLOGfileslop;
if (!InstallXLogFileSegment(&installed_segno, tmppath,
! *use_existent, &max_advance,
use_lock))
{
/*
--- 3572,3590 ----
* pre-create a future log segment.
*/
installed_segno = logsegno;
!
! /*
! * XXX: What should we use as max_segno? We used to use XLOGfileslop when
! * that was a constant, but that was always a bit dubious: normally, at a
! * checkpoint, XLOGfileslop was the offset from the checkpoint record,
! * but here, it was the offset from the insert location. We can't do the
! * normal XLOGfileslop calculation here because we don't have access to
! * the prior checkpoint's redo location. So somewhat arbitrarily, just
! * use CheckPointSegments.
! */
! max_segno = logsegno + CheckPointSegments;
if (!InstallXLogFileSegment(&installed_segno, tmppath,
! *use_existent, max_segno,
use_lock))
{
/*
***************
*** 3597,3603 **** XLogFileCopy(XLogSegNo destsegno, TimeLineID srcTLI, XLogSegNo srcsegno)
/*
* Now move the segment into place with its final name.
*/
! if (!InstallXLogFileSegment(&destsegno, tmppath, false, NULL, false))
elog(ERROR, "InstallXLogFileSegment should not have failed");
}
--- 3707,3713 ----
/*
* Now move the segment into place with its final name.
*/
! if (!InstallXLogFileSegment(&destsegno, tmppath, false, 0, false))
elog(ERROR, "InstallXLogFileSegment should not have failed");
}
***************
*** 3617,3638 **** XLogFileCopy(XLogSegNo destsegno, TimeLineID srcTLI, XLogSegNo srcsegno)
* number at or after the passed numbers. If FALSE, install the new segment
* exactly where specified, deleting any existing segment file there.
*
! * *max_advance: maximum number of segno slots to advance past the starting
! * point. Fail if no free slot is found in this range. On return, reduced
! * by the number of slots skipped over. (Irrelevant, and may be NULL,
! * when find_free is FALSE.)
*
* use_lock: if TRUE, acquire ControlFileLock while moving file into
* place. This should be TRUE except during bootstrap log creation. The
* caller must *not* hold the lock at call.
*
* Returns TRUE if the file was installed successfully. FALSE indicates that
! * max_advance limit was exceeded, or an error occurred while renaming the
* file into place.
*/
static bool
InstallXLogFileSegment(XLogSegNo *segno, char *tmppath,
! bool find_free, int *max_advance,
bool use_lock)
{
char path[MAXPGPATH];
--- 3727,3747 ----
* number at or after the passed numbers. If FALSE, install the new segment
* exactly where specified, deleting any existing segment file there.
*
! * max_segno: maximum segment number to install the new file as. Fail if no
! * free slot is found between *segno and max_segno. (Ignored when find_free
! * is FALSE.)
*
* use_lock: if TRUE, acquire ControlFileLock while moving file into
* place. This should be TRUE except during bootstrap log creation. The
* caller must *not* hold the lock at call.
*
* Returns TRUE if the file was installed successfully. FALSE indicates that
! * max_segno limit was exceeded, or an error occurred while renaming the
* file into place.
*/
static bool
InstallXLogFileSegment(XLogSegNo *segno, char *tmppath,
! bool find_free, XLogSegNo max_segno,
bool use_lock)
{
char path[MAXPGPATH];
***************
*** 3656,3662 **** InstallXLogFileSegment(XLogSegNo *segno, char *tmppath,
/* Find a free slot to put it in */
while (stat(path, &stat_buf) == 0)
{
! if (*max_advance <= 0)
{
/* Failed to find a free slot within specified range */
if (use_lock)
--- 3765,3771 ----
/* Find a free slot to put it in */
while (stat(path, &stat_buf) == 0)
{
! if ((*segno) >= max_segno)
{
/* Failed to find a free slot within specified range */
if (use_lock)
***************
*** 3664,3670 **** InstallXLogFileSegment(XLogSegNo *segno, char *tmppath,
return false;
}
(*segno)++;
- (*max_advance)--;
XLogFilePath(path, ThisTimeLineID, *segno);
}
}
--- 3773,3778 ----
***************
*** 3997,4010 **** UpdateLastRemovedPtr(char *filename)
/*
* Recycle or remove all log files older or equal to passed segno
*
! * endptr is current (or recent) end of xlog; this is used to determine
* whether we want to recycle rather than delete no-longer-wanted log files.
*/
static void
! RemoveOldXlogFiles(XLogSegNo segno, XLogRecPtr endptr)
{
XLogSegNo endlogSegNo;
! int max_advance;
DIR *xldir;
struct dirent *xlde;
char lastoff[MAXFNAMELEN];
--- 4105,4119 ----
/*
* Recycle or remove all log files older or equal to passed segno
*
! * endptr is current (or recent) end of xlog, and PriorRedoPtr is the
! * redo pointer of the previous checkpoint. These are used to determine
* whether we want to recycle rather than delete no-longer-wanted log files.
*/
static void
! RemoveOldXlogFiles(XLogSegNo segno, XLogRecPtr PriorRedoPtr, XLogRecPtr endptr)
{
XLogSegNo endlogSegNo;
! XLogSegNo recycleSegNo;
DIR *xldir;
struct dirent *xlde;
char lastoff[MAXFNAMELEN];
***************
*** 4016,4026 **** RemoveOldXlogFiles(XLogSegNo segno, XLogRecPtr endptr)
struct stat statbuf;
/*
! * Initialize info about where to try to recycle to. We allow recycling
! * segments up to XLOGfileslop segments beyond the current XLOG location.
*/
XLByteToPrevSeg(endptr, endlogSegNo);
! max_advance = XLOGfileslop;
xldir = AllocateDir(XLOGDIR);
if (xldir == NULL)
--- 4125,4134 ----
struct stat statbuf;
/*
! * Initialize info about where to try to recycle to.
*/
XLByteToPrevSeg(endptr, endlogSegNo);
! recycleSegNo = XLOGfileslop(PriorRedoPtr);
xldir = AllocateDir(XLOGDIR);
if (xldir == NULL)
***************
*** 4069,4088 **** RemoveOldXlogFiles(XLogSegNo segno, XLogRecPtr endptr)
* for example can create symbolic links pointing to a
* separate archive directory.
*/
! if (lstat(path, &statbuf) == 0 && S_ISREG(statbuf.st_mode) &&
InstallXLogFileSegment(&endlogSegNo, path,
! true, &max_advance, true))
{
ereport(DEBUG2,
(errmsg("recycled transaction log file \"%s\"",
xlde->d_name)));
CheckpointStats.ckpt_segs_recycled++;
/* Needn't recheck that slot on future iterations */
! if (max_advance > 0)
! {
! endlogSegNo++;
! max_advance--;
! }
}
else
{
--- 4177,4193 ----
* for example can create symbolic links pointing to a
* separate archive directory.
*/
! if (endlogSegNo <= recycleSegNo &&
! lstat(path, &statbuf) == 0 && S_ISREG(statbuf.st_mode) &&
InstallXLogFileSegment(&endlogSegNo, path,
! true, recycleSegNo, true))
{
ereport(DEBUG2,
(errmsg("recycled transaction log file \"%s\"",
xlde->d_name)));
CheckpointStats.ckpt_segs_recycled++;
/* Needn't recheck that slot on future iterations */
! endlogSegNo++;
}
else
{
***************
*** 7863,7869 **** LogCheckpointEnd(bool restartpoint)
elog(LOG, "restartpoint complete: wrote %d buffers (%.1f%%); "
"%d transaction log file(s) added, %d removed, %d recycled; "
"write=%ld.%03d s, sync=%ld.%03d s, total=%ld.%03d s; "
! "sync files=%d, longest=%ld.%03d s, average=%ld.%03d s",
CheckpointStats.ckpt_bufs_written,
(double) CheckpointStats.ckpt_bufs_written * 100 / NBuffers,
CheckpointStats.ckpt_segs_added,
--- 7968,7975 ----
elog(LOG, "restartpoint complete: wrote %d buffers (%.1f%%); "
"%d transaction log file(s) added, %d removed, %d recycled; "
"write=%ld.%03d s, sync=%ld.%03d s, total=%ld.%03d s; "
! "sync files=%d, longest=%ld.%03d s, average=%ld.%03d s; "
! "distance=%d KB, estimate=%d KB",
CheckpointStats.ckpt_bufs_written,
(double) CheckpointStats.ckpt_bufs_written * 100 / NBuffers,
CheckpointStats.ckpt_segs_added,
***************
*** 7874,7885 **** LogCheckpointEnd(bool restartpoint)
total_secs, total_usecs / 1000,
CheckpointStats.ckpt_sync_rels,
longest_secs, longest_usecs / 1000,
! average_secs, average_usecs / 1000);
else
elog(LOG, "checkpoint complete: wrote %d buffers (%.1f%%); "
"%d transaction log file(s) added, %d removed, %d recycled; "
"write=%ld.%03d s, sync=%ld.%03d s, total=%ld.%03d s; "
! "sync files=%d, longest=%ld.%03d s, average=%ld.%03d s",
CheckpointStats.ckpt_bufs_written,
(double) CheckpointStats.ckpt_bufs_written * 100 / NBuffers,
CheckpointStats.ckpt_segs_added,
--- 7980,7994 ----
total_secs, total_usecs / 1000,
CheckpointStats.ckpt_sync_rels,
longest_secs, longest_usecs / 1000,
! average_secs, average_usecs / 1000,
! (int) (PrevCheckPointDistance / 1024.0),
! (int) (CheckPointDistanceEstimate / 1024.0));
else
elog(LOG, "checkpoint complete: wrote %d buffers (%.1f%%); "
"%d transaction log file(s) added, %d removed, %d recycled; "
"write=%ld.%03d s, sync=%ld.%03d s, total=%ld.%03d s; "
! "sync files=%d, longest=%ld.%03d s, average=%ld.%03d s; "
! "distance=%d KB, estimate=%d KB",
CheckpointStats.ckpt_bufs_written,
(double) CheckpointStats.ckpt_bufs_written * 100 / NBuffers,
CheckpointStats.ckpt_segs_added,
***************
*** 7890,7896 **** LogCheckpointEnd(bool restartpoint)
total_secs, total_usecs / 1000,
CheckpointStats.ckpt_sync_rels,
longest_secs, longest_usecs / 1000,
! average_secs, average_usecs / 1000);
}
/*
--- 7999,8046 ----
total_secs, total_usecs / 1000,
CheckpointStats.ckpt_sync_rels,
longest_secs, longest_usecs / 1000,
! average_secs, average_usecs / 1000,
! (int) (PrevCheckPointDistance / 1024.0),
! (int) (CheckPointDistanceEstimate / 1024.0));
! }
!
! /*
! * Update the estimate of distance between checkpoints.
! *
! * The estimate is used to calculate the number of WAL segments to keep
! * preallocated, see XLOGfileslop().
! */
! static void
! UpdateCheckPointDistanceEstimate(uint64 nbytes)
! {
! /*
! * To estimate the number of segments consumed between checkpoints, keep
! * a moving average of the actual number of segments consumed in previous
! * checkpoint cycles. However, if the load is bursty, with quiet periods
! * and busy periods, we want to cater for the peak load. So instead of a
! * plain moving average, let the average decline slowly if the previous
! * cycle used less WAL than estimated, but bump it up immediately if it
! * used more.
! *
! * When checkpoints are triggered by checkpoint_wal_size, this should
! * converge to CheckPointSegments * XLOG_SEG_SIZE.
! *
! * Note: This doesn't pay any attention to what caused the checkpoint.
! * Checkpoints triggered manually with the CHECKPOINT command, or by e.g.
! * starting a base backup, are counted the same as those created
! * automatically. The slow-decline will largely mask them out, if they are
! * not frequent. If they are frequent, it seems reasonable to count them
! * in as any others; if you issue a manual checkpoint every 5 minutes and
! * never let a timed checkpoint happen, it makes sense to base the
! * preallocation on that 5 minute interval rather than whatever
! * checkpoint_timeout is set to.
! */
! PrevCheckPointDistance = nbytes;
! if (CheckPointDistanceEstimate < nbytes)
! CheckPointDistanceEstimate = nbytes;
! else
! CheckPointDistanceEstimate =
! (0.90 * CheckPointDistanceEstimate + 0.10 * (double) nbytes);
}
/*
***************
*** 7932,7938 **** CreateCheckPoint(int flags)
XLogCtlInsert *Insert = &XLogCtl->Insert;
XLogRecData rdata;
uint32 freespace;
! XLogSegNo _logSegNo;
XLogRecPtr curInsert;
VirtualTransactionId *vxids;
int nvxids;
--- 8082,8088 ----
XLogCtlInsert *Insert = &XLogCtl->Insert;
XLogRecData rdata;
uint32 freespace;
! XLogRecPtr PriorRedoPtr;
XLogRecPtr curInsert;
VirtualTransactionId *vxids;
int nvxids;
***************
*** 8237,8246 **** CreateCheckPoint(int flags)
(errmsg("concurrent transaction log activity while database system is shutting down")));
/*
! * Select point at which we can truncate the log, which we base on the
! * prior checkpoint's earliest info.
*/
! XLByteToSeg(ControlFile->checkPointCopy.redo, _logSegNo);
/*
* Update the control file.
--- 8387,8396 ----
(errmsg("concurrent transaction log activity while database system is shutting down")));
/*
! * Remember the prior checkpoint's redo pointer, used later to determine
! * the point where the log can be truncated.
*/
! PriorRedoPtr = ControlFile->checkPointCopy.redo;
/*
* Update the control file.
***************
*** 8294,8304 **** CreateCheckPoint(int flags)
* Delete old log files (those no longer needed even for previous
* checkpoint or the standbys in XLOG streaming).
*/
! if (_logSegNo)
{
KeepLogSeg(recptr, &_logSegNo);
_logSegNo--;
! RemoveOldXlogFiles(_logSegNo, recptr);
}
/*
--- 8444,8460 ----
* Delete old log files (those no longer needed even for previous
* checkpoint or the standbys in XLOG streaming).
*/
! if (PriorRedoPtr != InvalidXLogRecPtr)
{
+ XLogSegNo _logSegNo;
+
+ /* Update the average distance between checkpoints. */
+ UpdateCheckPointDistanceEstimate(RedoRecPtr - PriorRedoPtr);
+
+ XLByteToSeg(PriorRedoPtr, _logSegNo);
KeepLogSeg(recptr, &_logSegNo);
_logSegNo--;
! RemoveOldXlogFiles(_logSegNo, PriorRedoPtr, recptr);
}
/*
***************
*** 8486,8492 **** CreateRestartPoint(int flags)
{
XLogRecPtr lastCheckPointRecPtr;
CheckPoint lastCheckPoint;
! XLogSegNo _logSegNo;
TimestampTz xtime;
/* use volatile pointer to prevent code rearrangement */
--- 8642,8648 ----
{
XLogRecPtr lastCheckPointRecPtr;
CheckPoint lastCheckPoint;
! XLogRecPtr PriorRedoPtr;
TimestampTz xtime;
/* use volatile pointer to prevent code rearrangement */
***************
*** 8554,8560 **** CreateRestartPoint(int flags)
/*
* Update the shared RedoRecPtr so that the startup process can calculate
* the number of segments replayed since last restartpoint, and request a
! * restartpoint if it exceeds checkpoint_segments.
*
* Like in CreateCheckPoint(), hold off insertions to update it, although
* during recovery this is just pro forma, because no WAL insertions are
--- 8710,8716 ----
/*
* Update the shared RedoRecPtr so that the startup process can calculate
* the number of segments replayed since last restartpoint, and request a
! * restartpoint if it exceeds CheckPointSegments.
*
* Like in CreateCheckPoint(), hold off insertions to update it, although
* during recovery this is just pro forma, because no WAL insertions are
***************
*** 8585,8594 **** CreateRestartPoint(int flags)
CheckPointGuts(lastCheckPoint.redo, flags);
/*
! * Select point at which we can truncate the xlog, which we base on the
! * prior checkpoint's earliest info.
*/
! XLByteToSeg(ControlFile->checkPointCopy.redo, _logSegNo);
/*
* Update pg_control, using current time. Check that it still shows
--- 8741,8750 ----
CheckPointGuts(lastCheckPoint.redo, flags);
/*
! * Remember the prior checkpoint's redo pointer, used later to determine
! * the point at which we can truncate the log.
*/
! PriorRedoPtr = ControlFile->checkPointCopy.redo;
/*
* Update pg_control, using current time. Check that it still shows
***************
*** 8615,8626 **** CreateRestartPoint(int flags)
* checkpoint/restartpoint) to prevent the disk holding the xlog from
* growing full.
*/
! if (_logSegNo)
{
XLogRecPtr receivePtr;
XLogRecPtr replayPtr;
TimeLineID replayTLI;
XLogRecPtr endptr;
/*
* Get the current end of xlog replayed or received, whichever is
--- 8771,8785 ----
* checkpoint/restartpoint) to prevent the disk holding the xlog from
* growing full.
*/
! if (PriorRedoPtr != InvalidXLogRecPtr)
{
XLogRecPtr receivePtr;
XLogRecPtr replayPtr;
TimeLineID replayTLI;
XLogRecPtr endptr;
+ XLogSegNo _logSegNo;
+
+ XLByteToSeg(PriorRedoPtr, _logSegNo);
/*
* Get the current end of xlog replayed or received, whichever is
***************
*** 8649,8655 **** CreateRestartPoint(int flags)
if (RecoveryInProgress())
ThisTimeLineID = replayTLI;
! RemoveOldXlogFiles(_logSegNo, endptr);
/*
* Make more log segments if needed. (Do this after recycling old log
--- 8808,8814 ----
if (RecoveryInProgress())
ThisTimeLineID = replayTLI;
! RemoveOldXlogFiles(_logSegNo, PriorRedoPtr, endptr);
/*
* Make more log segments if needed. (Do this after recycling old log
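Reviewer note on the xlog.c hunks above: the update rule in UpdateCheckPointDistanceEstimate() is asymmetric, which is easy to miss when reading the diff. A minimal stand-alone sketch of that rule follows. The struct, the function name, and the use of double for both fields are assumptions made for illustration and are not taken from the patch.

```c
#include <stdint.h>

/*
 * Hypothetical stand-alone state for illustration. The patch itself keeps
 * PrevCheckPointDistance and CheckPointDistanceEstimate as variables in
 * xlog.c; their exact types are not shown in this excerpt, so double is an
 * assumption here.
 */
typedef struct DistanceState
{
	double		prev_distance;	/* WAL bytes between the last two checkpoints */
	double		estimate;		/* smoothed estimate of that distance */
} DistanceState;

/*
 * Fold one observed checkpoint distance into the estimate. A larger
 * observation replaces the estimate at once; a smaller one pulls it down
 * by 10% of the difference, so the estimate decays slowly.
 */
static void
update_distance_estimate(DistanceState *st, uint64_t nbytes)
{
	st->prev_distance = (double) nbytes;
	if (st->estimate < (double) nbytes)
		st->estimate = (double) nbytes;
	else
		st->estimate = 0.90 * st->estimate + 0.10 * (double) nbytes;
}
```

Starting from a zeroed state, an observation of 1000 bytes sets the estimate to 1000, a following observation of 500 lowers it only to about 950, and a later observation of 2000 raises it to 2000 immediately.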
*** a/src/backend/postmaster/checkpointer.c
--- b/src/backend/postmaster/checkpointer.c
***************
*** 482,488 **** CheckpointerMain(void)
"checkpoints are occurring too frequently (%d seconds apart)",
elapsed_secs,
elapsed_secs),
! errhint("Consider increasing the configuration parameter \"checkpoint_segments\".")));
/*
* Initialize checkpointer-private variables used during
--- 482,488 ----
"checkpoints are occurring too frequently (%d seconds apart)",
elapsed_secs,
elapsed_secs),
! errhint("Consider increasing the configuration parameter \"checkpoint_wal_size\".")));
/*
* Initialize checkpointer-private variables used during
***************
*** 760,770 **** IsCheckpointOnSchedule(double progress)
return false;
/*
! * Check progress against WAL segments written and checkpoint_segments.
*
* We compare the current WAL insert location against the location
* computed before calling CreateCheckPoint. The code in XLogInsert that
! * actually triggers a checkpoint when checkpoint_segments is exceeded
* compares against RedoRecptr, so this is not completely accurate.
* However, it's good enough for our purposes, we're only calculating an
* estimate anyway.
--- 760,770 ----
return false;
/*
! * Check progress against WAL segments written and CheckPointSegments.
*
* We compare the current WAL insert location against the location
* computed before calling CreateCheckPoint. The code in XLogInsert that
! * actually triggers a checkpoint when CheckPointSegments is exceeded
* compares against RedoRecptr, so this is not completely accurate.
* However, it's good enough for our purposes, we're only calculating an
* estimate anyway.
*** a/src/backend/utils/misc/guc.c
--- b/src/backend/utils/misc/guc.c
***************
*** 1981,1996 **** static struct config_int ConfigureNamesInt[] =
},
{
! {"checkpoint_segments", PGC_SIGHUP, WAL_CHECKPOINTS,
! gettext_noop("Sets the maximum distance in log segments between automatic WAL checkpoints."),
! NULL
},
! &CheckPointSegments,
! 3, 1, INT_MAX,
NULL, NULL, NULL
},
{
{"checkpoint_timeout", PGC_SIGHUP, WAL_CHECKPOINTS,
gettext_noop("Sets the maximum time between automatic WAL checkpoints."),
NULL,
--- 1981,2008 ----
},
{
! {"min_recycle_wal_size", PGC_SIGHUP, WAL_CHECKPOINTS,
! gettext_noop("Sets the minimum size to shrink the WAL to."),
! NULL,
! GUC_UNIT_KB
},
! &min_recycle_wal_size,
! 81920, 32768, INT_MAX,
NULL, NULL, NULL
},
{
+ {"checkpoint_wal_size", PGC_SIGHUP, WAL_CHECKPOINTS,
+ gettext_noop("Sets the WAL size that triggers a checkpoint."),
+ NULL,
+ GUC_UNIT_KB
+ },
+ &checkpoint_wal_size,
+ 262144, 32768, INT_MAX,
+ NULL, assign_checkpoint_wal_size, NULL
+ },
+
+ {
{"checkpoint_timeout", PGC_SIGHUP, WAL_CHECKPOINTS,
gettext_noop("Sets the maximum time between automatic WAL checkpoints."),
NULL,
***************
*** 2573,2579 **** static struct config_real ConfigureNamesReal[] =
},
&CheckPointCompletionTarget,
0.5, 0.0, 1.0,
! NULL, NULL, NULL
},
/* End-of-list marker */
--- 2585,2591 ----
},
&CheckPointCompletionTarget,
0.5, 0.0, 1.0,
! NULL, assign_checkpoint_completion_target, NULL
},
/* End-of-list marker */
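Reviewer note on the guc.c hunks above: both new settings use GUC_UNIT_KB, so their boot values and lower bounds are in kilobytes, while the rest of the checkpoint code still counts segments. The sketch below shows the arithmetic only. The macro and helper names are hypothetical, and the 16 MB segment size is the usual default rather than something this excerpt guarantees.

```c
/* Assumes the usual 16 MB WAL segment size; builds can differ. */
#define WAL_SEGMENT_SIZE_KB (16 * 1024)

/* Hypothetical helper: whole WAL segments that fit in a size given in kB. */
static int
wal_size_kb_to_segments(int size_kb)
{
	return size_kb / WAL_SEGMENT_SIZE_KB;
}
```

Under that assumption, the min_recycle_wal_size default of 81920 kB corresponds to 5 segments (80 MB), the checkpoint_wal_size default of 262144 kB to 16 segments (256 MB), and the shared lower bound of 32768 kB to 2 segments.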
*** a/src/include/access/xlog.h
--- b/src/include/access/xlog.h
***************
*** 181,187 **** extern XLogRecPtr XactLastRecEnd;
extern bool reachedConsistency;
/* these variables are GUC parameters related to XLOG */
! extern int CheckPointSegments;
extern int wal_keep_segments;
extern int XLOGbuffers;
extern int XLogArchiveTimeout;
--- 181,188 ----
extern bool reachedConsistency;
/* these variables are GUC parameters related to XLOG */
! extern int min_recycle_wal_size;
! extern int checkpoint_wal_size;
extern int wal_keep_segments;
extern int XLOGbuffers;
extern int XLogArchiveTimeout;
***************
*** 192,197 **** extern bool fullPageWrites;
--- 193,200 ----
extern bool log_checkpoints;
extern int num_xloginsert_slots;
+ extern int CheckPointSegments;
+
/* WAL levels */
typedef enum WalLevel
{
***************
*** 319,324 **** extern bool CheckPromoteSignal(void);
--- 322,330 ----
extern void WakeupRecovery(void);
extern void SetWalWriterSleeping(bool sleeping);
+ extern void assign_checkpoint_wal_size(int newval, void *extra);
+ extern void assign_checkpoint_completion_target(double newval, void *extra);
+
/*
* Starting/stopping a base backup
*/