Re: increasing the default WAL segment size

From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: increasing the default WAL segment size
Date: 2016-08-25 17:05:28
Message-ID: CABUevEyMb9yc4KW6xWzewk2awO5KekA_gr3X=gbn3ijci=CPuA@mail.gmail.com
Lists: pgsql-hackers

On Thu, Aug 25, 2016 at 6:59 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> On Thu, Aug 25, 2016 at 11:21 AM, Simon Riggs <simon(at)2ndquadrant(dot)com>
> wrote:
> > On 25 August 2016 at 02:31, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> >> Furthermore, there is an enforced, synchronous fsync at the end of
> >> every segment, which actually does hurt performance on write-heavy
> >> workloads.[2] Of course, if that were the only reason to consider
> >> increasing the segment size, it would probably make more sense to just
> >> try to push that extra fsync into the background, but that's not
> >> really the case. From what I hear, the gigantic number of files is a
> >> bigger pain point.
> >
> > I think we should fully describe the problem before finding a solution.
>
> Sure, that's usually a good idea. I attempted to outline all of the
> possible issues of which I am aware in my original email, but of
> course you may know of considerations which I overlooked.
>
> > This is too big a change to just tweak a value without discussing the
> > actual issue.
>
> Again, I tried to make sure I was discussing the actual issues in my
> original email. In brief: having to run archive_command multiple
> times per second imposes very tight latency requirements on it;
> directories with hundreds of thousands or millions of files are hard
> to manage; enforced synchronous fsyncs at the end of each segment hurt
> performance.
>
> > And if the problem is as described, how can a change of x4 be enough
> > to make it worth the pain of change? I think you're already admitting
> > it can't be worth it by discussing initdb configuration.
>
> I guess it depends on how much pain of change you think there will be.
> I would expect a change from 16MB -> 64MB to be fairly painless, but
> (1) it might break tools that aren't designed to cope with differing
> segment sizes and (2) it will increase disk utilization for people who
> have such low velocity systems that they never end up with more than 2
> WAL segments, and now those segments are bigger. If you know of other
> impacts or have reason to believe those problems will be serious,
> please fill in the details.
>
> Despite the fact that initdb configuration has dominated this thread,
> I mentioned it only in the very last sentence of my email and only as
> a possibility. I believe that a 4x change will be good enough for the
> majority of people for whom this is currently a pain point. However,
> yes, I do believe that there are some people for whom it won't be
> sufficient. And I believe that as we continue to enhance PostgreSQL
> to support higher and higher transaction rates, the number of people
> who need an extra-large WAL segment size will increase. As I see it,
> there are four options here:
>
> 1. Do nothing. So far, I don't see anybody arguing for that.
>
> 2. Change the default to 64MB and call it good. This idea seems to
> have considerable support.
>
> 3. Allow initdb-time configurability but keep the default at 16MB. I
> don't see any support for this. There is clearly support for
> configurability, but I don't see anyone arguing that the current
> default is preferable, unless that is what you are arguing.
>
> 4. Change the default to 64MB and also allow initdb-time
> configurability. This option also appears to enjoy substantial
> support, perhaps more than #2. Magnus seemed to be arguing that this
> is preferable to #2, because then it's easier for people to change the
> setting back if someone discovers a case where the higher default is a
> problem; Tom, on the other hand, seems to think this is overkill.
>
> Personally, I believe option #4 is for the best. I believe that the
> great majority of users will be better off with 64MB than with 16MB,
> but I like the idea of allowing for smaller values (for people with
> really low-velocity instances) and larger ones (for people with really
> high-velocity instances).
>

I was not arguing for #4 over #2, at least not strongly. I think #2 is
fine, and I think #4 is fine. #4 allows a way out, but it's not *that*
important unless we go *beyond* 64MB.

I was mainly arguing that we can't claim "it has a configure switch so it's
kinda configurable" as a way out. If we want it configurable *at all*, it
should be an initdb switch. If we are confident in our defaults, it doesn't
have to be.

I agree that #4 is best in principle, but I'm not sure it's worth the cost.
I'm not worried at all about the risk of the master/slave sync thing, per my
previous statement. But if it does have performance implications, per Andres'
suggestion, then making it configurable at initdb time probably comes with a
cost that's not worth paying.
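
For reference, the arithmetic behind the 16MB vs 64MB comparison quoted
above can be sketched roughly as below. This is only an illustration: the
WAL generation rates are made-up inputs, not measurements from this thread,
and the function names are hypothetical helpers, not anything in PostgreSQL.

```python
# Rough arithmetic only. The WAL rates used in the demo loop are hypothetical
# inputs chosen for illustration, not numbers reported in this thread.
MB = 1024 * 1024


def segments_per_second(wal_bytes_per_second, segment_size_bytes):
    """How often a segment fills, i.e. how often archive_command must run."""
    return wal_bytes_per_second / segment_size_bytes


def segments_per_day(wal_bytes_per_second, segment_size_bytes):
    """Number of segment files produced in 24 hours at a steady WAL rate."""
    return segments_per_second(wal_bytes_per_second, segment_size_bytes) * 86400


def idle_wal_footprint(segment_size_bytes, segments_kept=2):
    """Disk used by a low-velocity system that only ever holds a few segments."""
    return segment_size_bytes * segments_kept


if __name__ == "__main__":
    for rate_mb in (1, 32, 128):  # hypothetical WAL rates in MB/s
        for seg_mb in (16, 64):
            per_sec = segments_per_second(rate_mb * MB, seg_mb * MB)
            per_day = segments_per_day(rate_mb * MB, seg_mb * MB)
            print(f"{rate_mb:>4} MB/s, {seg_mb} MB segments: "
                  f"{per_sec:.2f} segments/s, {per_day:,.0f} files/day")
    for seg_mb in (16, 64):
        footprint_mb = idle_wal_footprint(seg_mb * MB) // MB
        print(f"idle footprint with 2 x {seg_mb} MB segments: {footprint_mb} MB")
```

Under these assumed rates, a system writing 32 MB/s of WAL fills two 16MB
segments per second but only one 64MB segment every two seconds, while an
idle system holding two segments goes from 32MB to 128MB of WAL on disk.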

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
