Re: Page replacement algorithm in buffer cache

From: Greg Stark <stark(at)mit(dot)edu>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Jim Nasby <jim(at)nasby(dot)net>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Ants Aasma <ants(at)cybertec(dot)at>, Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila(at)huawei(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Page replacement algorithm in buffer cache
Date: 2013-04-03 23:48:45
Message-ID: CAM-w4HMbRyoDJfB1WB7ZT91HtD6ku8CYb8QN97QsUTEdG7ky9A@mail.gmail.com
Lists: pgsql-hackers

On Wed, Apr 3, 2013 at 3:00 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> The main hesitation I've had about actually implementing such a scheme
> is that I find it a bit unappealing to have a background process
> dedicated to just this. But maybe it could be combined with some of
> the other ideas presented here. Perhaps we should have one process
> that scans the buffer arena and populates the freelists; as a side
> effect, if it runs across a dirty buffer, it kicks it over to the
> process described in the previous paragraph (which could still, also,
> absorb requests from other backends using buffer access strategies).
> Then we'd end up with nothing that looks exactly like the background
> writer we have now, but maybe no one would miss it.
>

I think the general pattern of development has led in the opposite
direction. Every time one daemon has been responsible for two things,
it has ended up causing contorted code and difficult-to-tune behaviours,
and we've ended up splitting the two.

In this case in particular it seems like an especially poor choice. In the
time a single buffer write takes, the entire freelist might empty out. I
could easily imagine this happening *every* time a write I/O happens. It
seems more likely that you'll need multiple processes running the buffer
cleaning to keep up with a decent I/O subsystem.

I'm still skeptical about the idea of a "freelist". That just seems like a
terrible point of contention. However, perhaps that's because I'm picturing
an LRU linked list. Perhaps the right thing is to maintain a pool of
buffers in some less contention-prone data structure that lets each
backend pick buffers out more or less independently of the others.
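
To make "less contention-prone" a bit more concrete, here is a rough
sketch of one possible shape for such a pool. It is not PostgreSQL code
and not a proposal for the real buffer manager: FreePool, POOL_SIZE,
PROBE_STRIDE, pool_claim and pool_release are all hypothetical names
made up for illustration. The idea is one atomic flag per buffer, with
each backend starting its probe at a different slot, so that backends
mostly touch different flags instead of all fighting over one list head.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define POOL_SIZE    64   /* hypothetical number of buffers in the pool */
#define PROBE_STRIDE 16   /* hypothetical spacing between backends' start slots */

/* One flag per buffer: true means the buffer is free to claim. */
typedef struct FreePool
{
    atomic_bool is_free[POOL_SIZE];
} FreePool;

/* Mark every buffer in the pool as free. */
static void
pool_init(FreePool *pool)
{
    for (int i = 0; i < POOL_SIZE; i++)
        atomic_init(&pool->is_free[i], true);
}

/*
 * Try to claim a free buffer for the given backend.
 *
 * Each backend starts probing at a slot derived from its own id, so
 * concurrent backends usually hit different flags. A slot is claimed
 * with a compare-and-swap from true to false; there is no shared list
 * head and no lock. Returns the slot index, or -1 if nothing is free.
 */
static int
pool_claim(FreePool *pool, unsigned backend_id)
{
    unsigned start = (backend_id * PROBE_STRIDE) % POOL_SIZE;

    for (unsigned n = 0; n < POOL_SIZE; n++)
    {
        unsigned slot = (start + n) % POOL_SIZE;
        bool     expected = true;

        if (atomic_compare_exchange_strong(&pool->is_free[slot],
                                           &expected, false))
            return (int) slot;
    }
    return -1;
}

/* Hand a buffer back to the pool, e.g. after a cleaner has written it out. */
static void
pool_release(FreePool *pool, int slot)
{
    assert(slot >= 0 && slot < POOL_SIZE);
    atomic_store(&pool->is_free[slot], true);
}
```

This ignores everything that makes the real problem hard (usage counts,
pins, which buffers deserve to be in the pool at all), and a full scan
when the pool is nearly empty is not free either. It only illustrates
that picking a buffer need not serialize every backend on a single list.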

--
greg
