Re: Bgwriter LRU cleaning: we've been going at this all wrong

From: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgreSQL(dot)org, Greg Smith <gsmith(at)gregsmith(dot)com>, ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
Subject: Re: Bgwriter LRU cleaning: we've been going at this all wrong
Date: 2007-06-26 22:40:29
Message-ID: 468195DD.7070900@enterprisedb.com
Lists: pgsql-hackers
Tom Lane wrote:
> I just had an epiphany, I think.
> 
> As I wrote in the LDC discussion,
> http://archives.postgresql.org/pgsql-patches/2007-06/msg00294.php
> if the bgwriter's LRU-cleaning scan has advanced ahead of freelist.c's
> clock sweep pointer, then any buffers between them are either clean,
> or are pinned and/or have usage_count > 0 (in which case the bgwriter
> wouldn't bother to clean them, and freelist.c wouldn't consider them
> candidates for re-use).  And *this invariant is not destroyed by the
> activities of other backends*.  A backend cannot dirty a page without
> raising its usage_count from zero, and there are no race cases because
> the transition states will be pinned.
> 
> This means that there is absolutely no point in having the bgwriter
> re-start its LRU scan from the clock sweep position each time, as
> it currently does.  Any pages it revisits are not going to need
> cleaning.  We might as well have it progress forward from where it
> stopped before.

All true so far.
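
To make the quoted invariant concrete, here is a minimal sketch in Python rather than the C of the actual buffer manager. The function name and the (dirty, usage_count, pinned) tuple layout are made up for illustration. It checks that no buffer between the clock sweep pointer and the bgwriter's scan position is both dirty and a reuse candidate:

```python
def invariant_holds(buffers, sweep_pos, bgwriter_pos):
    """Return True if every buffer from sweep_pos up to (not including)
    bgwriter_pos is clean, pinned, or has usage_count > 0.

    buffers is a list of (dirty, usage_count, pinned) tuples treated as
    a ring; this layout is an assumption made for the sketch.
    """
    n = len(buffers)
    i = sweep_pos
    while i != bgwriter_pos:
        dirty, usage_count, pinned = buffers[i]
        # A dirty, unpinned buffer with usage_count == 0 is exactly what
        # the bgwriter should already have cleaned in this region.
        if dirty and usage_count == 0 and not pinned:
            return False
        i = (i + 1) % n
    return True


buffers = [
    (False, 0, False),  # clean
    (True, 1, False),   # dirty, but usage_count > 0
    (True, 0, True),    # dirty, but pinned
    (True, 0, False),   # dirty and reusable: not yet visited by the bgwriter
]
print(invariant_holds(buffers, sweep_pos=0, bgwriter_pos=3))
```

The last buffer lies outside the scanned region in this call, so it does not count against the invariant.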

Note that Itagaki-san's patch changes that, though. With the patch, the 
LRU scan doesn't look for bgwriter_lru_maxpages dirty buffers to write. 
Instead, it checks that there are N clean buffers with usage_count=0 
in front of the clock sweep, where N varies based on history. If there 
aren't, it writes dirty buffers until there are again.
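
The behaviour described above can be sketched roughly as follows. This is an illustration in Python, not code from the patch: the Buf class, the lru_clean function and the n_target parameter are hypothetical names, and the way the patch derives N from history is left out.

```python
from dataclasses import dataclass


@dataclass
class Buf:
    dirty: bool = False
    usage_count: int = 0
    pinned: bool = False


def lru_clean(buffers, sweep_pos, n_target):
    """Scan forward from the clock sweep position until n_target clean
    buffers with usage_count == 0 lie ahead of it, writing dirty
    candidates on the way. Returns the number of buffers written.
    """
    clean_ahead = 0
    written = 0
    n = len(buffers)
    for i in range(n):  # at most one full lap
        if clean_ahead >= n_target:
            break
        buf = buffers[(sweep_pos + i) % n]
        if buf.pinned or buf.usage_count > 0:
            continue  # not a reuse candidate, so leave it alone
        if buf.dirty:
            buf.dirty = False  # stand-in for the actual write
            written += 1
        clean_ahead += 1
    return written


buffers = [Buf(dirty=True), Buf(dirty=True, usage_count=2), Buf(),
           Buf(dirty=True), Buf(dirty=True)]
print(lru_clean(buffers, sweep_pos=0, n_target=3))
```

Unlike a fixed bgwriter_lru_maxpages limit, the stopping condition here counts clean reusable buffers rather than writes, so a scan that finds enough clean buffers writes nothing at all.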

> In fact, the notion of the bgwriter's cleaning scan being "in front of"
> the clock sweep is entirely backward.  It should try to be behind the
> sweep, ie, so far ahead that it's lapped the clock sweep and is trailing
> along right behind it, cleaning buffers immediately after their
> usage_count falls to zero.  All the rest of the buffer arena is either
> clean or has positive usage_count.

Really? How much of the buffer cache do you think we should try to keep 
clean? And how large a percentage of the buffer cache do you think has 
usage_count=0 at any given point in time? I'm not sure myself, but as a 
data point, the usage counts from a quick DBT-2 test on my laptop look 
like this:

  usagecount | count
------------+-------
           0 |  1107
           1 |  1459
           2 |   459
           3 |   235
           4 |   352
           5 |   481
             |     3

NBuffers = 4096.

That will vary widely depending on your workload, of course, but keeping 
1/4 of the buffer cache clean seems like overkill to me. If any of those 
buffers are re-dirtied after we write them, the write was a waste of time.
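
As a quick check on the "1/4" figure, the table above can be tallied directly. The sketch below is in Python; the None key stands for the row with an empty usagecount, which is assumed here to mean unused buffers.

```python
# Usage-count histogram copied from the table above.
# None = the row with no usagecount value (assumed: unused buffers).
usage_counts = {0: 1107, 1: 1459, 2: 459, 3: 235, 4: 352, 5: 481, None: 3}

total = sum(usage_counts.values())
share_zero = usage_counts[0] / total

print(f"total buffers: {total}")
print(f"share with usage_count=0: {share_zero:.1%}")
```

The counts sum to NBuffers (4096), and the usage_count=0 share comes out at roughly 27%, slightly above one quarter.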

-- 
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com
