Re: PG 8.3 and large shared buffer settings

From: Greg Smith <gsmith(at)gregsmith(dot)com>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: Scott Carey <scott(at)richrelevance(dot)com>, Aidan Van Dyk <aidan(at)highrise(dot)ca>, Dan Sugalski <dan(at)sidhe(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: PG 8.3 and large shared buffer settings
Date: 2009-09-26 14:59:00
Message-ID: alpine.GSO.2.01.0909261006390.11378@westnet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Fri, 25 Sep 2009, Jeff Janes wrote:

> Does it do this even if the block was already in shared_buffers?

Usually not. The buffer ring algorithm is used to manage pages that are
read in specifically to satisfy a sequential scan (there's a slightly
different ring method used for VACUUM too). If the buffer you need is
already available and not "pinned" (locked by someone else), it's not read
from disk again. Instead, its usage count is incremently only if it's at
zero (this doesn't count as a use unless it's about to be evicted as
unused), and it's returned without being added to the ring.

There's a section about this ("Buffer Ring Replacement Strategy") in the
source code:
http://git.postgresql.org/gitweb?p=postgresql.git;a=blob_plain;f=src/backend/storage/buffer/README;hb=HEAD

The commit that added the feature is at
http://git.postgresql.org/gitweb?p=postgresql.git;a=commit;h=ebf3d5b66360823edbdf5ac4f9a119506fccd4c0

The basic flow of this code is that backends ask for buffers using
BufferAlloc, which then calls StrategyGetBuffer (where the ring list is
managed) only if it doesn't first find the page in the buffer cache. You
get what you'd hope for here: a sequential scan will use blocks when
they're already available in the cache, while reading in less popular
blocks that weren't cached into the temporary ring area. There's always
the OS cache backing the PostrgreSQL one to handle cases where the working
set you're using is just a bit larger than shared_buffers. The ring read
requests may very well be satisfied by that too if there was a recent
sequential scan the OS is still caching.

You can read a high-level summary of the algorithm used for ring
management (with an intro to buffer management in general) in my "Inside
the PostgreSQL Buffer Cache" presentation at
http://www.westnet.com/~gsmith/content/postgresql/ on P10 "Optimizations
for problem areas". That doesn't specifically cover the "what if it's in
the cache already?" case though.

--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Craig James 2009-09-26 14:59:50 Re: Bad performance of SELECT ... where id IN (...)
Previous Message Paul Ooi 2009-09-26 14:53:53 Re: Bad performance of SELECT ... where id IN (...)