The shared buffers challenge

From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: postgres performance list <pgsql-performance(at)postgresql(dot)org>
Subject: The shared buffers challenge
Date: 2011-05-26 14:31:59
Message-ID: BANLkTimPC-K_o8XNn1hK4ZgBrTGO_Z6RDw@mail.gmail.com
Lists: pgsql-performance

Hello performers, I've long been unhappy with the standard advice
given for setting shared buffers. This includes the stupendously
vague comments in the standard documentation, which suggest certain
settings in order to get 'good performance'. Performance of what?
Connection negotiation speed? Not that the advice is necessarily
wrong, but ISTM it is based too much on speculative or anecdotal
information. I'd like to see the lore around this setting clarified,
especially so we can
refine advice to: 'if you are seeing symptoms x,y,z set shared_buffers
from a to b to get symptom reduction of k'. I've never seen a
database blow up from setting them too low, but over the years I've
helped several people with bad i/o situations or outright OOM
conditions from setting them too high.

My general understanding of shared_buffers is that they are a little
bit faster than filesystem buffering (everything these days is
ultimately based on mmap AIUI, so there's no reason to suspect
anything else). Where they are most helpful is for masking of i/o if
a page gets dirtied >1 times before it's written out to the heap, but
seeing any benefit from that at all is going to be very workload
dependent. There are also downsides to keeping pages in shared buffers
rather than on the heap, and the number of buffers you have influences
checkpoint behavior. So things are complex.
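
To make the 'page dirtied >1 times' point concrete, here is a minimal
sketch of a toy write-back cache. It is an assumption-laden model, not
PostgreSQL's buffer manager: it uses plain LRU eviction, ignores
checkpoints, the background writer and the OS cache, and the function
name physical_writes is made up for illustration. It only counts how
many page writes leave the cache for a given stream of page updates.

```python
from collections import OrderedDict

def physical_writes(page_updates, cache_pages):
    """Count page writes leaving a toy LRU write-back cache.

    page_updates: iterable of page ids, one entry per time a page is dirtied.
    cache_pages:  number of pages the cache can hold (must be >= 1).
    """
    if cache_pages < 1:
        raise ValueError("cache_pages must be at least 1")
    cache = OrderedDict()  # dirty pages, least recently used first
    writes = 0
    for page in page_updates:
        if page in cache:
            # Re-dirtying a page that is still cached costs no extra write.
            cache.move_to_end(page)
            continue
        if len(cache) >= cache_pages:
            # Evict the least recently used dirty page: one write out.
            cache.popitem(last=False)
            writes += 1
        cache[page] = True
    # Whatever is still dirty has to be written once at the end.
    return writes + len(cache)

updates = [1, 2, 1, 2, 3, 1, 2, 3]
print(physical_writes(updates, cache_pages=1))  # 8
print(physical_writes(updates, cache_pages=3))  # 3
```

With room for all three hot pages, eight updates collapse into three
writes; with room for one page, every update turns into a write. How
much of that shows up on a real system is exactly the workload
dependent part.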

So, the challenge is this: I'd like to see repeatable test cases that
demonstrate consistent performance gains of more than 20%. Double
bonus points for cases that show gains of more than 50%. No points
given for anecdotal or unverifiable data. Not only will this help grow
the body of knowledge regarding the setting, but it will also help
produce benchmarking metrics against which we can measure the several
interesting buffer-related patches in the pipeline. Anybody up for it?
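
As a rough sketch of how a submission could be scored against the 20%
and 50% bars, assuming each test case is run several times at each
shared_buffers setting and yields one throughput figure per run (for
example transactions per second). The function names and the
'worst candidate run versus best baseline run' rule are my own
assumptions, not an agreed methodology.

```python
from statistics import mean

def percent_gain(baseline_runs, candidate_runs):
    """Mean throughput gain of candidate over baseline, in percent."""
    base = mean(baseline_runs)
    return 100.0 * (mean(candidate_runs) - base) / base

def clears_threshold(baseline_runs, candidate_runs, threshold_pct=20.0):
    """Pessimistic repeatability check.

    Compares the worst candidate run against the best baseline run, so a
    gain only counts if it survives run-to-run noise in both directions.
    """
    best_base = max(baseline_runs)
    worst_case = 100.0 * (min(candidate_runs) - best_base) / best_base
    return worst_case > threshold_pct

# Made-up placeholder numbers, purely to show the arithmetic:
baseline = [100.0, 102.0]
candidate = [125.0, 130.0]
print(percent_gain(baseline, candidate))
print(clears_threshold(baseline, candidate, threshold_pct=20.0))
print(clears_threshold(baseline, candidate, threshold_pct=50.0))
```

The pessimistic comparison is deliberately strict; a proper
statistical test over more runs would be a reasonable alternative.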

merlin
