Re: Memory Allocation

From: Scott Carey <scott(at)richrelevance(dot)com>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Ryan Hansen <ryan(dot)hansen(at)brightbuilders(dot)com>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Memory Allocation
Date: 2008-11-26 23:52:44
Message-ID: BDFBB77C9E07BE4A984DAAE981D19F961ACA1F1AED@EXVMBX018-1.exch018.msoutlookonline.net
Lists: pgsql-performance

Swappiness optimization is going to vary. Definitely test on your own.

For a bulk load database with a large page cache, swappiness = 60 (the default) is _GUARANTEED_ to force the OS to swap out some of Postgres while in heavy use. This is heavily dependent on the page cache size, work_mem size, and concurrency.
I've seen significantly increased performance from setting this value low (1000x! -- if your DB starts swapping out postgres, performance is dead). The default has the OS targeting close to 60% of memory for page cache. On a 32GB server, with a 7GB postgres buffer cache and several concurrent queries reading GBs of data and using 500MB+ work_mem (huge aggregates), the default swappiness will choose to page out postgres with about 19GB of disk page cache still left to evict, with disastrous results. And that is a read-only test. Tests with writes can trigger it earlier if combined with bad dirty page writeback settings.
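
For reference, a quick way to check and experiment with this on Linux (the value here is only illustrative -- test on your own workload):

  # check the current value
  cat /proc/sys/vm/swappiness
  # change it for the running kernel only
  sysctl -w vm.swappiness=1
  # to persist across reboots, add this line to /etc/sysctl.conf
  vm.swappiness = 1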

The root of the problem is that the Linux paging algorithm treats I/O for file read access as being as costly as I/O for paging. That is a reasonable assumption for a desktop, but a ridiculously false one for a large database with high-capacity DB file I/O and a much lower-capability swap file. Not only that -- paging back in is almost always nearly pure random reads, while DB I/O is often sequential. So losing 100MB of cached db file takes a lot less time to scan back in than 100MB of swapped-out application memory.
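
As a rough back-of-envelope (illustrative numbers, not measurements): assume a disk that does ~100MB/s sequential and ~150 random 8KB reads per second. Re-reading 100MB of evicted table data sequentially takes about one second; swapping back in 100MB of application memory as random 8KB page-ins is ~12,800 reads, or well over a minute.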

If you do have enough other idle applications taking up RAM that should be pushed out to disk from time to time (perhaps the programs doing your bulk loading?), a higher value is useful. Although it is not exact, think of the swappiness value (very roughly) as the percentage of RAM the OS would prefer to give to page cache rather than applications.

The more RAM you have and the larger your postgres memory usage, the lower the swappiness value should be. 60% of 24GB is ~14.4GB; if you have that much stuff in RAM that should be paged out to save space, try it.

I currently use a value of 1 on a 32GB machine, and about 600MB of 'stuff' gets paged out normally, 1400MB under heavy load. This is a dedicated machine. Higher values page out more, which increases the cache size and helps performance a little, but under heavy load the system hits the paging wall and falls over. For me, the small improvement in performance when the system is not completely stressed is not worth the risk of hitting that wall.

***For a bulk load database, one is optimizing for _writes_ and extra page cache doesn't help writes like it does reads.***
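
If writes are the concern, the dirty page writeback knobs are set the same way as swappiness (the values below are hypothetical -- the right ones depend on RAM size and your controller's write cache):

  # % of RAM that may be dirty before background writeback kicks in
  sysctl -w vm.dirty_background_ratio=1
  # % of RAM at which processes generating dirty pages are throttled and forced to write
  sysctl -w vm.dirty_ratio=10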

When I use a machine with misc. other lower priority apps and less RAM, I have found larger values to be helpful.

If your DB is configured with a low shared_buffers and small work_mem, you probably want the OS to use the leftover memory for disk pages, and again a higher swappiness may be more optimal.

Like all of these settings, tune to your application and test. Many of these settings go hand in hand with others and don't make as much sense on their own. Tuning Postgres to do most of the caching and making the OS get out of the way is far different from tuning the OS to do as much of the caching work as possible and minimizing postgres. Which of those two strategies is best is highly application dependent, somewhat OS dependent, and also hardware dependent.
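
As a sketch of what the two extremes might look like on a dedicated 32GB box (hypothetical numbers, not recommendations):

  # Postgres does most of the caching, OS gets out of the way
  shared_buffers = 8GB        # postgresql.conf
  vm.swappiness = 1           # /etc/sysctl.conf

  # OS page cache does most of the caching
  shared_buffers = 2GB        # postgresql.conf
  vm.swappiness = 60          # kernel default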

-----Original Message-----
From: Kevin Grittner [mailto:Kevin(dot)Grittner(at)wicourts(dot)gov]
Sent: Wednesday, November 26, 2008 3:09 PM
To: Ryan Hansen; pgsql-performance(at)postgresql(dot)org; Scott Carey
Subject: Re: [PERFORM] Memory Allocation

>>> Scott Carey <scott(at)richrelevance(dot)com> wrote:
> Set swappiness to 0 or 1.

We recently converted all 72 remote county databases from 8.2.5 to
8.3.4. In preparation we ran a test conversion of a large county over
and over with different settings to see what got us the best
performance. Setting swappiness below the default degraded
performance for us in those tests for identical data, same hardware,
no other changes.

Our best guess is that code which really wasn't getting called got
swapped out leaving more space in the OS cache, but that's just a
guess. Of course, I'm sure people would not be recommending it if
they hadn't done their own benchmarks to confirm that this setting
actually improved things in their environments, so the lesson here is
to test for your environment when possible.

-Kevin
