Re: Configuration Recommendations

From: Scott Carey <scott(at)richrelevance(dot)com>
To: "sthomas(at)peak6(dot)com" <sthomas(at)peak6(dot)com>, John Lister <john(dot)lister(at)kickstone(dot)co(dot)uk>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Configuration Recommendations
Date: 2012-05-03 22:16:54
Message-ID: CBC8503C.9B0B7%scott@richrelevance.com
Lists: pgsql-performance


On 4/25/12 2:29 PM, "Shaun Thomas" <sthomas(at)peak6(dot)com> wrote:

>On 04/25/2012 02:46 AM, John Lister wrote:
>
>> Hi, I'd be grateful if you could share any XFS performance tweaks as I'm
>> not entirely sure I'm getting the most out of my setup and any
>> additional guidance would be very helpful.
>
>Ok, I'll give this with a huge caveat: these settings came from lots of
>testing, both load and pgbench based. I'll explain as much as I can.

The configured file system read-ahead is also an important factor -- how
important is sequential scan performance? More read-ahead (up to a point)
biases your I/O for sequential throughput. The deadline scheduler is also
biased slightly for throughput, meaning it will sacrifice some random iops
in order to get a sequential scan out of the way.
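
As a rough sketch of how those two knobs are usually inspected and set on
Linux: this assumes the read-ahead in question is the block-device value
managed by blockdev. The device name and the 4096 KiB target are
placeholders, not recommendations. The commands are only printed (a dry
run) because they need root.

```shell
# Hypothetical device; substitute the block device backing the data directory.
DEV=sdb
RA_KB=4096                            # assumed target read-ahead, in KiB
RA_SECTORS=$((RA_KB * 1024 / 512))    # blockdev counts 512-byte sectors

# Dry run: print the commands instead of executing them.
echo "blockdev --getra /dev/$DEV"                 # show current read-ahead
echo "blockdev --setra $RA_SECTORS /dev/$DEV"     # set new read-ahead
echo "cat /sys/block/$DEV/queue/scheduler"        # list available schedulers
echo "echo deadline > /sys/block/$DEV/queue/scheduler"
```

The scheduler names on offer vary by kernel, so check the output of the
cat command before writing to that file, and benchmark any read-ahead
change against your own mix of sequential and random I/O.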

We have a couple of systems that have aged a long time on XFS and ext3.
Over time, XFS slaughters ext3. This is due primarily to one feature:
online defragmentation. Our ext3 systems are so horribly fragmented that
sequential scans almost no longer exist. ext4 is supposed to be better at
preventing fragmentation, but there is no online defragmenter. After a
parallel restore, the Postgres data files are rather fragmented. XFS can
correct that, and disk throughput for sequential scans increases
significantly after defragmentation. We schedule defragmentation passes
nightly, which do not take long after the initial pass.
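
A minimal sketch of what such a nightly pass might look like, assuming
the stock xfs_db and xfs_fsr tools from xfsprogs. The device name, the
02:00 start time and the one-hour budget are placeholders, and this is
not our exact setup. The commands are printed rather than run, since
both need root.

```shell
DEV=/dev/sdb1        # hypothetical XFS block device
MAX_SECONDS=3600     # assumed per-night time budget for the defragmenter

# Dry run: print rather than execute.
# Read-only report of the current fragmentation factor.
echo "xfs_db -r -c frag $DEV"

# cron.d-style entry: with no filesystem argument, xfs_fsr walks the
# mounted XFS filesystems and stops after the -t time limit.
CRON_LINE="0 2 * * * root xfs_fsr -t $MAX_SECONDS"
echo "$CRON_LINE"
```

Comparing the frag report before and after the first pass gives a quick
idea of whether the nightly run is worth keeping.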

>
>For initializing the XFS filesystem, you can take advantage of a few
>settings that are pretty handy.
>
>* -d agcount=256 - A higher number of allocation groups works better with
>multi-CPU systems. We used 256, but you'll want to do tests to confirm
>this. The point is that you can have several threads writing to the
>filesystem simultaneously.
>
>* -l lazy-count=1 - Log data is written more efficiently. Gives a
>measurable performance boost. Newer versions set this, but CentOS 5
>defaults to 0. I'm not sure about CentOS 6. Just enable it. :)
>
>* -l version=2 - Forces the most recent version of the logging
>algorithm; allows a larger log buffer on mount. Since you're using
>CentOS, the default value is still probably 1, which you don't want.
>
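
Put together, the three mkfs options above would combine into something
like the following. This is only a sketch: the device name is a
placeholder, and the command is printed instead of executed because
mkfs.xfs destroys whatever is on the target device.

```shell
DEV=/dev/sdb1   # hypothetical target device; mkfs.xfs will wipe it

# -d takes data-section suboptions, -l takes comma-separated log suboptions.
MKFS_CMD="mkfs.xfs -d agcount=256 -l lazy-count=1,version=2 $DEV"

# Dry run: show the command for review before running it by hand as root.
echo "$MKFS_CMD"
```

Newer xfsprogs releases may already default to some of these values, so
it is worth checking the mkfs.xfs man page on the target system first.
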
>And then there are the mount options. These actually seemed to make more
>of an impact in our testing:
>
>* allocsize=256m - Database files are up to 1GB in size. To prevent
>fragmentation, always pre-allocate in 256MB chunks. In recent 3.0+
>kernels, this setting will result in phantom storage allocation as each
>file is initially allocated with 256MB until all references have exited
>memory. Due to aggressive Linux inode cache behavior, this may not
>happen for several hours. On 3.0 kernels, this setting should be
>removed. I think the 2.6.35 kernel had this backported, so *TEST THIS
>SETTING BEFORE USING IT!*
>
>* logbufs=8 - Forces more of the log buffer to remain in RAM, improving
>file deletion performance. Good for temporary files. XFS often gets
>knocked for file deletion performance, and this brings it way up. Not
>really an issue with PG usage, but handy anyway. See logbsize.
>
>* logbsize=256k - Larger log buffers keep track of modified files in
>memory for better performance. See logbufs.
>
>* noatime - Negates touching the disk for file accesses. Reduces disk IO.
>
>* attr2 - Opportunistic improvement in the way inline extended
>attributes are stored on-disk. Not strictly necessary, but handy.
>
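
For reference, the mount options listed above could be assembled like
this. It is a sketch with placeholder device and mount point names, and
it deliberately leaves allocsize out, given the warning about newer
kernels; add it back only after testing.

```shell
DEV=/dev/sdb1          # hypothetical XFS device
MNT=/var/lib/pgsql     # hypothetical mount point for the data directory

# allocsize=256m is intentionally omitted here; see the caveat above.
OPTS="noatime,attr2,logbufs=8,logbsize=256k"

# Dry run: print an /etc/fstab line and the equivalent mount command.
FSTAB_LINE="$DEV $MNT xfs $OPTS 0 0"
echo "$FSTAB_LINE"
echo "mount -t xfs -o $OPTS $DEV $MNT"
```
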
>
>I'm hoping someone else will chime in, because these settings are pretty
>"old" and based on a CentOS 5.5 setup. I haven't done any metrics on the
>newer kernels, but I have followed enough to know allocsize is dangerous
>on new systems.
>
>Your mileage may vary. :)
>
>--
>Shaun Thomas
>OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604
>312-444-8534
>sthomas(at)peak6(dot)com
