
Experimental ARC implementation

From: Jan Wieck <JanWieck(at)Yahoo(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Experimental ARC implementation
Date: 2003-11-03 15:33:21
Message-ID: 3FA67541.9020105@Yahoo.com
Lists: pgsql-hackers
Attached is a first trial implementation of the Adaptive Replacement 
Cache (ARC). The patch is against CVS HEAD of 7.4.

The algorithm is based on what's described in these papers:

http://www.almaden.ibm.com/StorageSystems/autonomic_storage/ARC/arcfast.pdf
http://www.almaden.ibm.com/StorageSystems/autonomic_storage/ARC/oneup.pdf

kindly pointed to by Cuong Bui a few days ago.
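For readers who don't want to wade through the papers first, the core of ARC fits in a few dozen lines. The following is my own simplified, single-threaded rendering of the pseudo code in oneup.pdf (pages as plain ints, lists as arrays kept MRU-first), not code taken from the patch:

```c
/* Simplified, single-threaded sketch of the ARC policy from the papers
 * above.  T1/T2 hold resident pages (recency/frequency); B1/B2 are
 * "ghost" lists that remember recently evicted page tags; p is the
 * adaptive target size for T1. */
#include <assert.h>
#include <string.h>

#define C 4                                     /* cache size in pages */

typedef struct { int pg[2 * C]; int n; } List;  /* MRU at index 0 */

static List T1, T2, B1, B2;
static int p = 0;

static int find(List *l, int x) {
    for (int i = 0; i < l->n; i++)
        if (l->pg[i] == x) return i;
    return -1;
}
static void del_at(List *l, int i) {
    memmove(&l->pg[i], &l->pg[i + 1], (l->n - i - 1) * sizeof(int));
    l->n--;
}
static void push_mru(List *l, int x) {
    memmove(&l->pg[1], &l->pg[0], l->n * sizeof(int));
    l->pg[0] = x;
    l->n++;
}
static int pop_lru(List *l) { return l->pg[--l->n]; }

/* REPLACE from the paper: evict a resident page into its ghost list. */
static void replace(int hit_in_b2) {
    if (T1.n >= 1 && ((hit_in_b2 && T1.n == p) || T1.n > p))
        push_mru(&B1, pop_lru(&T1));
    else
        push_mru(&B2, pop_lru(&T2));
}

/* Access page x; returns 1 on cache hit, 0 on miss. */
int arc_access(int x) {
    int i, d;
    if ((i = find(&T1, x)) >= 0) { del_at(&T1, i); push_mru(&T2, x); return 1; }
    if ((i = find(&T2, x)) >= 0) { del_at(&T2, i); push_mru(&T2, x); return 1; }
    if ((i = find(&B1, x)) >= 0) {          /* ghost hit: grow T1 target */
        d = B2.n / B1.n; if (d < 1) d = 1;
        p = (p + d > C) ? C : p + d;
        del_at(&B1, i); replace(0); push_mru(&T2, x); return 0;
    }
    if ((i = find(&B2, x)) >= 0) {          /* ghost hit: shrink T1 target */
        d = B1.n / B2.n; if (d < 1) d = 1;
        p = (p - d < 0) ? 0 : p - d;
        del_at(&B2, i); replace(1); push_mru(&T2, x); return 0;
    }
    if (T1.n + B1.n == C) {                 /* complete miss, L1 full */
        if (T1.n < C) { pop_lru(&B1); replace(0); }
        else pop_lru(&T1);
    } else if (T1.n + T2.n + B1.n + B2.n >= C) {
        if (T1.n + T2.n + B1.n + B2.n == 2 * C) pop_lru(&B2);
        if (T1.n + T2.n == C) replace(0);
    }
    push_mru(&T1, x);
    return 0;
}
```

The point to notice is that the ghost lists let the policy learn: a hit in B1 means pure recency was thrown away too early, so p grows; a hit in B2 means frequency was undervalued, so p shrinks.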

To work with PostgreSQL, and to mostly preserve our separation between 
the buffer manager and the cache replacement policy, I implemented it as 
a set of (hopefully) strategy-independent, separate calls that largely 
replace freelist.c. The main reason the code differs visually from the 
pseudo code found in oneup.pdf is that it needs to be concurrency safe. 
In particular, the underlying list structures can be changed by other 
backends during IO, and that happens halfway through the algorithm.
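To illustrate the boundary, here is a toy stand-in for that call structure. The function names and the trivial round-robin victim choice are mine for illustration; in the patch, only the strategy module knows about ARC's lists, while the buffer manager just asks "is this tag cached?" and "which slot may I reuse?":

```c
/* Hedged sketch of a strategy-call boundary between bufmgr and the
 * replacement policy.  Names and the round-robin policy are
 * illustrative only, not taken from the patch. */
#include <assert.h>

#define NBUFFERS 8

typedef struct { int rel; int blocknum; } BufferTag;

static BufferTag slot_tag[NBUFFERS];
static int       slot_used[NBUFFERS];

/* Lookup call: find a tag already in the cache, or -1. */
int strategy_lookup(BufferTag tag) {
    for (int i = 0; i < NBUFFERS; i++)
        if (slot_used[i] && slot_tag[i].rel == tag.rel
                         && slot_tag[i].blocknum == tag.blocknum)
            return i;
    return -1;
}

/* Victim-selection call: where the real patch consults the ARC lists,
 * this stand-in just rotates through the slots. */
int strategy_get_buffer(void) {
    static int hand = 0;
    int victim = hand;
    hand = (hand + 1) % NBUFFERS;
    return victim;
}

/* Replacement call: record the new occupant after the read completes. */
void strategy_replace_buffer(int slot, BufferTag tag) {
    slot_tag[slot] = tag;
    slot_used[slot] = 1;
}

/* A bufmgr-style read path built from the three calls. */
int read_buffer(BufferTag tag) {
    int slot = strategy_lookup(tag);
    if (slot >= 0)
        return slot;               /* hit */
    slot = strategy_get_buffer();  /* miss: pick a victim */
    /* Real code drops the lock here, performs the IO, retakes the
     * lock, and re-checks the directory, because another backend may
     * have read the same page meanwhile; that window is exactly why
     * the patch's code differs from the paper's pseudo code. */
    strategy_replace_buffer(slot, tag);
    return slot;
}
```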

The major change to the existing freelist is an additional layer of 
indirection. The original buffer descriptors now hold only a cache page 
and its clean/dirty information and such. The new layer is the cache 
directory, whose size is different from the number of data blocks in 
the buffer cache.
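In data-structure terms, the split looks roughly like this. The type and field names below are illustrative, not taken from the patch; the essential property is that directory entries outnumber buffers, so an entry can remember a page's tag (an ARC "ghost") without holding its data:

```c
/* Hedged sketch of the two-level layout: buffer descriptors keep only
 * per-page state, while a larger cache directory maps tags to either a
 * real buffer or a ghost entry.  Names are mine, not the patch's. */
#include <assert.h>

#define NBUFFERS   4                /* data blocks actually in memory */
#define NDIRECTORY (2 * NBUFFERS)   /* ARC tracks ~2x as many tags */

typedef struct {
    int tag;        /* which disk block this entry tracks */
    int buf_id;     /* index into the buffer pool, or -1 for a ghost */
} CacheDirEntry;

typedef struct {
    int dirty;      /* clean/dirty state stays with the buffer itself */
    int refcount;
} BufferDescriptor;

static BufferDescriptor buffers[NBUFFERS];
static CacheDirEntry    directory[NDIRECTORY];

static int resident_entries(void) {
    int n = 0;
    for (int i = 0; i < NDIRECTORY; i++)
        if (directory[i].buf_id >= 0) n++;
    return n;
}

static void init_demo(void) {
    for (int i = 0; i < NDIRECTORY; i++) {
        directory[i].tag = 100 + i;
        directory[i].buf_id = (i < NBUFFERS) ? i : -1; /* half ghosts */
    }
    for (int i = 0; i < NBUFFERS; i++) {
        buffers[i].dirty = 0;
        buffers[i].refcount = 0;
    }
}
```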

With a TPC-C like transaction profile and a buffer cache size that 
allows 10-15% of the database to stay resident, I am not able to see any 
dramatic change in behaviour or performance. At most the difference 
could be called a "light relief". But that transaction profile is not a 
good mix compared to any real-world application anyway, so I wouldn't 
put too much weight on that result. It's also possible that I made some 
mistakes in the implementation, so keep a critical eye on that.

When playing with it, please keep a few facts in mind. A 64MB cache 
(shared_buffers=8192) needs about 10 minutes of full transaction load to 
populate; in benchmark terms this is called ramp-up time. The algorithm 
is aiming at scan resistance, so a pure index access pattern will not 
show any significant changes. Therefore, don't tell me that a 1x scale, 
10 client by 1000 transactions pgbench run doesn't show any performance 
boost; it cannot.
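For anyone checking the sizing: the 64MB figure follows directly from PostgreSQL's default 8 KB block size, as this trivial helper (mine, for illustration) shows:

```c
/* shared_buffers counts blocks of BLCKSZ bytes; at the default 8 KB
 * block size, shared_buffers = 8192 means 8192 * 8 KB = 64 MB. */
#include <assert.h>

#define BLCKSZ 8192L   /* PostgreSQL's default block size in bytes */

long cache_bytes(long shared_buffers) {
    return shared_buffers * BLCKSZ;
}
```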

I will follow up shortly with an approach that integrates Tom's delay 
mechanism plus my first READ_BY_VACUUM hack into one combined experiment.


Jan

-- 
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck(at)Yahoo(dot)com #

Attachment: cache_strategy_ARC.74.diff
Description: text/plain (36.4 KB)
