Re: increasing the default WAL segment size

From: Beena Emerson <memissemerson(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: increasing the default WAL segment size
Date: 2017-01-27 08:47:51
Message-ID: CAOG9ApFC2eUPQsEZ9=0PnTmC1fPq_P45f1sR4TrkWG7eXxXY6w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello Andres,

Thank you for your review.

On Fri, Jan 27, 2017 at 12:39 AM, Andres Freund <andres(at)anarazel(dot)de> wrote:

> Hi,
>
> On 2017-01-23 11:35:11 +0530, Beena Emerson wrote:
> > Please find attached an updated WIP patch. I have incorporated almost all
> > comments. This is to be applied over Robert's patches. I will post
> > performance results later on.
> >
> > 1. shift (>>) and AND (&) operations: The assign hook of wal_segment_size
> > sets the WalModMask and WalShiftBit. All the modulo and division
> operations
> > using XLogSegSize has been replaced with these. However, there are many
> > preprocessors which divide with XLogSegSize in xlog_internal.h. I have
> not
> > changed these because it would mean I will have to reassign the
> WalShiftBit
> > along with XLogSegSize in all the modules which use these macros. That
> does
> > not seem to be a good idea. Also, this means shift operator can be used
> > only in couple of places.
>
> I think it'd be better not to have XLogSegSize anymore. Silently
> changing a macros behaviour from being a compile time constant to
> something runtime configurable is a bad idea.
>

I dont think I understood u clearly. You mean convert the macros using
XLogSegSize to functions?

> > 8. Declaring XLogSegSize: There are 2 internal variables for the same
> > parameter. In original code XLOG_SEG_SIZE is defined in the
> auto-generated
> > file src/include/pg_config.h. And xlog_internal.h defines:
> >
> > #define XLogSegSize ((uint32) XLOG_SEG_SIZE)
> >
> > To avoid renaming all parts of code, I made the following change in
> > xlog_internal.h
> >
> > + extern uint32 XLogSegSize;
> >
> > +#define XLOG_SEG_SIZE XLogSegSize
> >
> > would it be better to just use one variable XLogSegSize everywhere. But
> > few external modules could be using XLOG_SEG_SIZE. Thoughts?
>
> They'll quite possibly break with configurable size anyway. So I'd
> rather have those broken explicitly.
>

Ok. I will remove the XLOG_SEGS_SIZE variable then?

> > +/*
> > + * These variables are set in assign_wal_segment_size
> > + *
> > + * WalModMask: It is an AND mask for XLogSegSize to allow for faster
> modulo
> > + * operations using it.
> > + *
> > + * WalShiftBit: It is an shift bit for XLogSegSize to allow for faster
> > + * division operations using it.
> > + *
> > + * UsableBytesInSegment: It is the number of bytes in a WAL segment
> usable for
> > + * WAL data.
> > + */
> > +uint32 WalModMask;
> > +static int UsableBytesInSegment;
> > +static int WalShiftBit;
>
> This could use some editorializing. "Faster modulo operations" isn't an
> explaining how/why it's actually being used. Same for WalShiftBit.
>

I will change these comments.

>
> > /*
> > * Private, possibly out-of-date copy of shared LogwrtResult.
> > @@ -957,6 +975,7 @@ XLogInsertRecord(XLogRecData *rdata,
> > if (!XLogInsertAllowed())
> > elog(ERROR, "cannot make new WAL entries during recovery");
> >
> > +
> > /*----------
> > *
>
> Spurious newline change.
>
> > if (ptr % XLOG_BLCKSZ == SizeOfXLogShortPHD &&
> > - ptr % XLOG_SEG_SIZE > XLOG_BLCKSZ)
> > + (ptr & WalModMask) > XLOG_BLCKSZ)
> > initializedUpto = ptr - SizeOfXLogShortPHD;
> > else if (ptr % XLOG_BLCKSZ == SizeOfXLogLongPHD &&
> > - ptr % XLOG_SEG_SIZE < XLOG_BLCKSZ)
> > + (ptr & WalModMask) < XLOG_BLCKSZ)
> > initializedUpto = ptr - SizeOfXLogLongPHD;
> > else
> > initializedUpto = ptr;
>
> How about we introduce a XLogSegmentOffset(XLogRecPtr) function like
> macro in a first patch? That'll reduce the amount of change in the
> commit actually changing things quite noticeably, and makes it easier to
> adjust things later. I see very little benefit for in-place usage of
> either % XLOG_SEG_SIZE or & WalModMask.
>

I will check this.

>
>
> > @@ -1794,6 +1813,7 @@ XLogBytePosToRecPtr(uint64 bytepos)
> > uint32 seg_offset;
> > XLogRecPtr result;
> >
> > +
> > fullsegs = bytepos / UsableBytesInSegment;
> > bytesleft = bytepos % UsableBytesInSegment;
>
> spurious change.
>
> > @@ -1878,7 +1898,7 @@ XLogRecPtrToBytePos(XLogRecPtr ptr)
> >
> > XLByteToSeg(ptr, fullsegs);
> >
> > - fullpages = (ptr % XLOG_SEG_SIZE) / XLOG_BLCKSZ;
> > + fullpages = (ptr & WalModMask) / XLOG_BLCKSZ;
> > offset = ptr % XLOG_BLCKSZ;
> >
> > if (fullpages == 0)
> > @@ -2043,7 +2063,7 @@ AdvanceXLInsertBuffer(XLogRecPtr upto, bool
> opportunistic)
> > /*
> > * If first page of an XLOG segment file, make it a long
> header.
> > */
> > - if ((NewPage->xlp_pageaddr % XLogSegSize) == 0)
> > + if ((NewPage->xlp_pageaddr & WalModMask) == 0)
> > {
> > XLogLongPageHeader NewLongPage =
> (XLogLongPageHeader) NewPage;
> >
> > @@ -2095,6 +2115,7 @@ CalculateCheckpointSegments(void)
> > * number of segments consumed between checkpoints.
> > *-------
> > */
> > +
> > target = (double) max_wal_size / (2.0 +
> CheckPointCompletionTarget);
>
> spurious change.
>
>
> > void
> > +assign_wal_segment_size(int newval, void *extra)
> > +{
> > + /*
> > + * During system initialization, XLogSegSize is not set so we use
> > + * DEFAULT_XLOG_SEG_SIZE instead.
> > + */
> > + int WalSegSize = (XLogSegSize == 0) ? DEFAULT_XLOG_SEG_SIZE :
> XLOG_SEG_SIZE;
> > +
> > + wal_segment_size = newval;
> > + UsableBytesInSegment = (wal_segment_size * UsableBytesInPage) -
> > + (SizeOfXLogLongPHD -
> SizeOfXLogShortPHD);
> > + WalModMask = WalSegSize - 1;
> > +
> > + /* Set the WalShiftBit */
> > + WalShiftBit = 0;
> > + while (WalSegSize > 1)
> > + {
> > + WalSegSize = WalSegSize >> 1;
> > + WalShiftBit++;
> > + }
> > +}
>
> Hm. Are GUC hooks a good way to compute the masks? Interdependent GUCs
> are unfortunately not working well, and several GUCs might end up
> depending on these. I think it might be better to assign the variables
> somewhere early in StartupXLOG() or such.
>

I am not sure about these interdependent GUCs. I need to study this better
and make changes as required.

> > +
> > +void
> > +assign_min_wal_size(int newval, void *extra)
> > +{
> > + /*
> > + * During system initialization, XLogSegSize is not set so we use
> > + * DEFAULT_XLOG_SEG_SIZE instead.
> > + *
> > + * min_wal_size is in kB and XLogSegSize is in bytes and so it is
> > + * converted to kB for the calculation.
> > + */
> > + int WalSegSize = (XLogSegSize == 0) ? (DEFAULT_XLOG_SEG_SIZE /
> 1024) :
> > +
> (XLOG_SEG_SIZE / 1024);
> > +
> > + min_wal_size = newval / WalSegSize;
> > +}
> > +
> > +void
> > assign_max_wal_size(int newval, void *extra)
> > {
> > - max_wal_size = newval;
> > + /*
> > + * During system initialization, XLogSegSize is not set so we use
> > + * DEFAULT_XLOG_SEG_SIZE instead.
> > + *
> > + * max_wal_size is in kB and XLogSegSize is in bytes and so it is
> > + * converted to bytes for the calculation.
> > + */
> > + int WalSegSize = (XLogSegSize == 0) ? (DEFAULT_XLOG_SEG_SIZE /
> 1024) :
> > +
> (XLOG_SEG_SIZE / 1024);
> > +
> > + max_wal_size = newval / WalSegSize;
> > CalculateCheckpointSegments();
> > }
>
> I don't think it's a good idea to have GUCs that are initially set to
> the wrong value and such. How about just storing bytes, and converting
> into segments upon use?
>

max_wal_size is used in CalculateCheckpointSegments and XLOGfileslop.
min_wal_size is used in XLOGfileslop only.

XLOGfileslop is called after the postgres has started up and would have
XLogSegSize set by then but CalculateCheckpointSegments would be a
problem. assign_max_wal_size calls CalculateCheckpointSegments which will
need the value as segment count not bytes. If we continue as bytes, then we
will need to shift the WalSegSize adjustment in the
CalculateCheckpointSegments.

> > @@ -2135,8 +2205,8 @@ XLOGfileslop(XLogRecPtr PriorRedoPtr)
> > * correspond to. Always recycle enough segments to meet the
> minimum, and
> > * remove enough segments to stay below the maximum.
> > */
> > - minSegNo = PriorRedoPtr / XLOG_SEG_SIZE + min_wal_size - 1;
> > - maxSegNo = PriorRedoPtr / XLOG_SEG_SIZE + max_wal_size - 1;
> > + minSegNo = (PriorRedoPtr >> WalShiftBit) + min_wal_size - 1;
> > + maxSegNo = (PriorRedoPtr >> WalShiftBit) + max_wal_size - 1;
>
> I think a macro would be good here too (same prerequisite patch as above).
>
> > @@ -4677,8 +4749,18 @@ XLOGShmemSize(void)
> > */
> > if (XLOGbuffers == -1)
> > {
> > - char buf[32];
> > -
> > + /*
> > + * The calculation of XLOGbuffers, requires the now
> run-time parameter
> > + * XLogSegSize from the ControlFile. The value determined
> here is
> > + * required to create the shared memory segment. Hence,
> temporarily
> > + * allocating space and reading ControlFile here.
> > + */
>
> I don't like comments containing things like "the now run-time paramter"
> much - they are likely going to still be there in 10 years, and will be
> hard to understand.
>

you are right.

>
> But anyway, how about we simply remove the "max one segment" boundary
> instead? I don't think it's actually very meaningful - several people
> posted benchmarks with more than one segment being beneficial.
>
>
> > diff --git a/src/bin/pg_basebackup/streamutil.c
> b/src/bin/pg_basebackup/streamutil.c
> > index 31290d3..87efc3c 100644
> > --- a/src/bin/pg_basebackup/streamutil.c
> > +++ b/src/bin/pg_basebackup/streamutil.c
> > @@ -238,6 +238,59 @@ GetConnection(void)
> > }
> >
> > /*
> > + * Run the SHOW_WAL_SEGMENT_SIZE command to set the XLogSegSize
> > + */
> > +bool
> > +SetXLogSegSize(PGconn *conn)
> > +{
>
> I think this is a confusing function name, because it sounds like
> you're setting the SegSize remotely or such. I think making it
> XLogRecPtr RetrieveXLogSegSize(conn); or such would lead to better code.
>

I agree. I will do the needful.

>
> > diff --git a/src/bin/pg_resetxlog/pg_resetxlog.c
> b/src/bin/pg_resetxlog/pg_resetxlog.c
> > index 963802e..4ceebdc 100644
> > --- a/src/bin/pg_resetxlog/pg_resetxlog.c
> > +++ b/src/bin/pg_resetxlog/pg_resetxlog.c
> > @@ -57,6 +57,7 @@
> > #include "storage/large_object.h"
> > #include "pg_getopt.h"
> >
> > +uint32 XLogSegSize;
>
> This seems like a bad idea - having the same local variable both in
> frontend and backend programs seems like a recipe for disaster.
>

I had to use the same variable name because they were used in the macros
specified in the xlog_internal.h, So re-assigning this variable would
automatically make the macros using XLogSegSize accessible in these
programs. Else they will throw error "undefined reference to `XLogSegSize'"

--
Thank you,

Beena Emerson

Have a Great Day!

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Nikhil Sontakke 2017-01-27 09:59:07 Re: Speedup twophase transactions
Previous Message Andrew Dunstan 2017-01-27 08:35:37 Re: Failure in commit_ts tap tests