We've recently encountered some swapping issues on our 64GB CentOS Nehalem machine, running postgres 8.4.2. Unfortunately, I was foolish enough to set shared_buffers to 40GB. I was wondering if anyone has any insight into why the swapping would suddenly start and then never recover?
Note that the machine has been up and running since mid-December 2009. It was only on March 8 that this swapping began, and it has not recovered since.
If we look at dstat, we find the following:
Note that it is constantly paging in, but never paging out. This would indicate that it's constantly reading from swap, but never writing out to it. Why would postgres do this? (Postgres is pretty much the only thing running on this machine.)
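For anyone wanting to cross-check the dstat numbers, here is a minimal Python sketch that samples the kernel's cumulative swap counters directly. It assumes a Linux kernel that exposes `pswpin` and `pswpout` in `/proc/vmstat`, and that dstat's paging columns are derived from those same counters. The function names (`parse_swap_counters`, `swap_rates`, `sample_rates`) are made up for illustration and are not part of any tool.

```python
import time


def parse_swap_counters(vmstat_text):
    """Return (pswpin, pswpout) parsed from the text of /proc/vmstat.

    Both values are cumulative page counts since boot.
    """
    counters = {}
    for line in vmstat_text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] in ("pswpin", "pswpout"):
            counters[parts[0]] = int(parts[1])
    return counters["pswpin"], counters["pswpout"]


def swap_rates(before, after, interval_s):
    """Pages swapped in and out per second between two samples."""
    rate_in = (after[0] - before[0]) / interval_s
    rate_out = (after[1] - before[1]) / interval_s
    return rate_in, rate_out


def sample_rates(interval_s=5, path="/proc/vmstat"):
    """Take two samples interval_s seconds apart and return the rates.

    The default path is an assumption; it only exists on Linux.
    """
    with open(path) as f:
        first = parse_swap_counters(f.read())
    time.sleep(interval_s)
    with open(path) as f:
        second = parse_swap_counters(f.read())
    return swap_rates(first, second, interval_s)
```

Calling `sample_rates()` on the affected machine should show a nonzero swap-in rate and a swap-out rate at or near zero if the pattern described above is really what the kernel is doing.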
I'm planning on lowering the shared_buffers to a more sane value, like 25GB (pgtune recommends this for a Mixed-purpose machine) or less (pgtune recommends 14GB for an OLTP machine). However, before I do this (and possibly resolve the issue), I was hoping to see if anyone would have an explanation for the constant reading from swap, but never writing back.