Re: shared_buffers advice

From: Scott Carey <scott(at)richrelevance(dot)com>
To: Greg Smith <greg(at)2ndquadrant(dot)com>
Cc: Paul McGarry <paul(dot)mcgarry(at)gmail(dot)com>, pgsql-performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: shared_buffers advice
Date: 2010-03-15 17:58:16
Message-ID: 198A4D5C-A8F6-45F3-A69B-25A06874117D@richrelevance.com
Lists: pgsql-performance

On Mar 11, 2010, at 12:39 AM, Greg Smith wrote:

>
> Giving all the buffers to the database doesn't work for many reasons:
> -Need a bunch left over for clients to use (e.g. work_mem)
> -There won't be enough OS cache left for the non-buffer data the database
> expects to be cached for reads and writes to perform well (some of the
> non-database files it uses)
> -Database checkpoints will turn into a nightmare, because so much more
> dirty data will accumulate that could instead have been spooled out
> regularly to the OS, and then to disk, by the backends.
> -Not having enough room for buffering backend writes means fewer chances
> to do write combining and elevator seek sorting, which means average I/O
> will drop.
>
> The alternate idea is to make shared_buffers small. I see people happily
> running in the 128MB - 256MB range sometimes. The benefit over just using
> the default of <32MB is obvious, but you're already past a good bit of the
> diminishing marginal returns just by the 8X increase.
>

The DB usage pattern influences this sort of decision too. A database that does large bulk inserts can prefer larger shared_buffers, provided the background writer is tuned well (10GB - 16GB for a 64GB server).
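
As a minimal sketch of the knobs involved (the values here are assumptions for illustration, not recommendations -- tune against your own checkpoint and I/O behavior):

    # postgresql.conf -- hypothetical bulk-load settings for a 64GB server
    shared_buffers = 12GB               # within the 10GB - 16GB range above
    checkpoint_segments = 64            # spread checkpoint I/O out (8.x setting)
    checkpoint_completion_target = 0.9  # write dirty buffers gradually (8.3+)
    bgwriter_delay = 100ms              # wake the background writer more often
    bgwriter_lru_maxpages = 500         # let it clean more buffers per round
    bgwriter_lru_multiplier = 3.0       # clean ahead of predicted demand
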
Temp table usage benefits from more memory as well -- I believe a temp table created with "ON COMMIT DROP" has a better chance of never being written to the data disk before being dropped if temp_buffers (the backend-local cache temp tables actually live in) is large enough.
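
A sketch of that pattern (the table and column names are made up, and the temp_buffers value is just an example):

    -- temp_buffers is per-session and must be set before the first
    -- temp-table access in the session
    SET temp_buffers = '256MB';

    BEGIN;
    CREATE TEMP TABLE staging ON COMMIT DROP AS
        SELECT * FROM big_table WHERE batch_id = 42;
    -- ... work against staging ...
    COMMIT;  -- the table is dropped here, ideally before ever reaching disk
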
If you have a mixed read workload with occasional very large sequential scans, you will want to make sure shared_buffers is large enough to hold the most important indexes and the randomly accessed data.
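
The pg_buffercache contrib module is one way to check what is actually resident; a sketch (assumes the default 8kB block size):

    -- top relations currently cached in shared_buffers
    SELECT c.relname, count(*) * 8 / 1024 AS cached_mb
    FROM pg_buffercache b
    JOIN pg_class c ON b.relfilenode = c.relfilenode
    WHERE b.reldatabase IN
          (0, (SELECT oid FROM pg_database WHERE datname = current_database()))
    GROUP BY c.relname
    ORDER BY cached_mb DESC
    LIMIT 10;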

Linux is more sensitive than Postgres to sequential scans kicking data out of its cache: since 8.3, Postgres runs large sequential scans through a small ring buffer so they can't evict the rest of shared_buffers, while the Linux page cache has no equivalent protection.

----------
Lastly, a word of caution on Linux. Before the recent changes to memory accounting and paging (~ kernel 2.6.28-ish?), shared_buffers was only partially accounted for in the paging equations. On one hand, the system sees shared memory as available to be swapped out (even though it won't be), and on the other hand it feels memory pressure from it. So if you, for example, set shared_buffers to 75% of your RAM, the system will completely freak out, and kswapd and other processes will go through long periods of 100% CPU utilization.
An example:
32GB RAM, 16GB shared_buffers, CentOS 5.4:
With the default OS 'swappiness' of 60, the system will note that less than 60% of RAM is used by the page cache and favor swapping out postgres backends aggressively. If you then either turn down the swappiness or open enough processes to consume more RAM (to ~80% or so), the kernel will start spending a LOT of CPU, often minutes at a time, trying to free up memory. From my understanding, it keeps searching the postgres shared_buffers space for pages to swap out even though it can't do so. So, for example, there might be 16GB of shared memory (which it won't page out), 10GB of other process memory, and 6GB of actual cached files in the page cache. It sees the ratio of 6GB of files to 26GB of processes and heavily favors attacking the 26GB -- but it scans the whole set of process memory and finds that all pages are recently used or can't be paged out.
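
To check or lower swappiness (the value of 10 is illustrative; this shifts the tradeoff but doesn't fix the shared-memory accounting itself):

    # check the current value
    cat /proc/sys/vm/swappiness

    # make the kernel less eager to swap process memory; add
    # vm.swappiness = 10 to /etc/sysctl.conf to persist across reboots
    sysctl -w vm.swappiness=10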

Anyhow, the latest Linux kernels claim to fix this, and Solaris/OpenSolaris and the BSDs don't have this problem. On OpenSolaris there are benchmarks out there showing that 90% of memory allocated to shared_buffers can work well. On Linux, that is dangerous. Combine the poor memory management when there is a lot of shared memory with the fact that 50% is bad because of double-buffering, and the Linux suggestion becomes the typical 'at least 128MB, but never more than 25% of RAM'.
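On the 32GB box from the example above, that rule of thumb works out to something like:

    # postgresql.conf -- conservative Linux sizing for a 32GB server
    shared_buffers = 8GB    # 25% of RAM; the 16GB above is what caused the thrashing
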

> Improvements keep coming as the shared_buffers cache size increases for
> many workloads, but eventually you can expect to go too far if you try to
> push everything in there. The only question is whether that happens at
> 40%, 60%, or something higher.
>
> --
> Greg Smith 2ndQuadrant US Baltimore, MD
> PostgreSQL Training, Services and Support
> greg(at)2ndQuadrant(dot)com www.2ndQuadrant.us
>
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance
