Skip site navigation (1) Skip section navigation (2)

Re: PG 8.3 and large shared buffer settings

From: Greg Smith <gsmith(at)gregsmith(dot)com>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: Scott Carey <scott(at)richrelevance(dot)com>, Aidan Van Dyk <aidan(at)highrise(dot)ca>, Dan Sugalski <dan(at)sidhe(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: PG 8.3 and large shared buffer settings
Date: 2009-09-26 14:59:00
Message-ID: (view raw, whole thread or download thread mbox)
Lists: pgsql-performance
On Fri, 25 Sep 2009, Jeff Janes wrote:

> Does it do this even if the block was already in shared_buffers?

Usually not.  The buffer ring algorithm is used to manage pages that are 
read in specifically to satisfy a sequential scan (there's a slightly 
different ring method used for VACUUM too).  If the buffer you need is 
already available and not "pinned" (locked by someone else), it's not read 
from disk again.  Instead, its usage count is incremently only if it's at 
zero (this doesn't count as a use unless it's about to be evicted as 
unused), and it's returned without being added to the ring.

There's a section about this ("Buffer Ring Replacement Strategy") in the 
source code:;a=blob_plain;f=src/backend/storage/buffer/README;hb=HEAD

The commit that added the feature is at;a=commit;h=ebf3d5b66360823edbdf5ac4f9a119506fccd4c0

The basic flow of this code is that backends ask for buffers using 
BufferAlloc, which then calls StrategyGetBuffer (where the ring list is 
managed) only if it doesn't first find the page in the buffer cache.  You 
get what you'd hope for here:  a sequential scan will use blocks when 
they're already available in the cache, while reading in less popular 
blocks that weren't cached into the temporary ring area.  There's always 
the OS cache backing the PostrgreSQL one to handle cases where the working 
set you're using is just a bit larger than shared_buffers.  The ring read 
requests may very well be satisfied by that too if there was a recent 
sequential scan the OS is still caching.

You can read a high-level summary of the algorithm used for ring 
management (with an intro to buffer management in general) in my "Inside 
the PostgreSQL Buffer Cache" presentation at on P10 "Optimizations 
for problem areas".  That doesn't specifically cover the "what if it's in 
the cache already?" case though.

* Greg Smith gsmith(at)gregsmith(dot)com Baltimore, MD

In response to


pgsql-performance by date

Next:From: Craig JamesDate: 2009-09-26 14:59:50
Subject: Re: Bad performance of SELECT ... where id IN (...)
Previous:From: Paul OoiDate: 2009-09-26 14:53:53
Subject: Re: Bad performance of SELECT ... where id IN (...)

Privacy Policy | About PostgreSQL
Copyright © 1996-2017 The PostgreSQL Global Development Group