On Mon, 2005-10-24 at 12:00, Craig A. James wrote:
> Kevin Grittner wrote:
> > In addition to what Mark pointed out, there is the possibility that a
> > query
> > is running which is scanning a large table or otherwise bringing in a
> > large number of pages from disk. That would first use up all available
> > unused cache space, and then may start replacing some of your
> > frequently used data.
> An LRU cache is often a bad strategy for database applications. There are two illustrations that show why.
> 1. You have an index that's used for EVERY search, but each search returns a large and unique set of rows. If it happens that the rows returned exceed the systems cache size, the part or all of your index will be flushed with EVERY query.
> 2. You do a sequential scan of a table that's one block bigger than the file system cache, then you do it again. At the beginning of the second scan, the first block of the table will have just been swapped out because it was the oldest, so the file system brings it back in, replacing the second block, which is now the oldest. As you scan the table, each block of the table is swapped out JUST BEFORE you get to it. At the start of your query, the file system might have had 99.9% of the relevant data in memory, but it swaps out 100% of it as your query progresses.
> Scenario 2 above is interesting because a system that is performing very well can suddenly experience a catastrophic performance decline when the size of the data exceeds a critical limit - the file system's avaliable cache.
> LRU works well if your frequently-used data is used often enough to keep it in memory. But many applications don't have that luxury. It's often the case that a single query will exceed the file system's cache size. The file system cache is "dumb" -- it's strategy is too simple for a relational database.
> What's needed is a way for the application developer to explicitely say, "This object is frequenly used, and I want it kept in memory."
There's an interesting conversation happening on the linux kernel
hackers mailing list right about now that applies:
In response to
pgsql-performance by date
|Next:||From: Craig A. James||Date: 2005-10-24 21:47:00|
|Subject: Re: Used Memory|
|Previous:||From: Rodrigo Madera||Date: 2005-10-24 19:11:53|
|Subject: Re: Inefficient escape codes.|