Re: Page replacement algorithm in buffer cache

From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Greg Stark <stark(at)mit(dot)edu>, Ants Aasma <ants(at)cybertec(dot)at>, Amit Kapila <amit(dot)kapila(at)huawei(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Page replacement algorithm in buffer cache
Date: 2013-03-22 20:22:16
Message-ID: CAHyXU0wKmB5WXS+hAt2QmGfpTNdsz4Dx1nR3GCKvsNBOzt2ZbQ@mail.gmail.com
Lists: pgsql-hackers

On Fri, Mar 22, 2013 at 3:16 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Merlin Moncure <mmoncure(at)gmail(dot)com> writes:
>> I think there is some very low hanging optimization fruit in the clock
>> sweep loop. first and foremost, I see no good reason why when
>> scanning pages we have to spin and wait on a buffer in order to
>> pedantically adjust usage_count. some simple refactoring there could
>> set it up so that a simple TAS (or even a TTAS with the first test in
>> front of the cache line lock as is done automatically on x86 IIRC)
>> could guard the buffer and, in the event of any lock detected, simply
>> move on to the next candidate without messing around with that buffer
>> at all. This could be construed as a 'trylock' variant of a spinlock
>> and might help out with cases where an especially hot buffer is
>> locking up the sweep. This is exploiting the fact that from
>> StrategyGetBuffer we don't need a *particular* buffer, just *a*
>> buffer.
>
> Hm. You could argue in fact that if there's contention for the buffer
> header, that's proof that it's busy and shouldn't have its usage count
> decremented. So this seems okay from a logical standpoint.
>
> However, I'm not real sure that it's possible to do a conditional
> spinlock acquire that doesn't create just as much hardware-level
> contention as a full acquire (ie, TAS is about as bad whether it
> gets the lock or not). So the actual benefit is a bit less clear.

well, if you do a non-locking test first you can at least avoid some
cases (and, if you get the answer wrong, so what?) by jumping to the
next buffer immediately. only if the non-locking test comes back
clean do you go on to do a hardware TAS.
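
to make that concrete, here is a rough sketch of a TTAS-style trylock
inside a clock sweep. everything in it is hypothetical (SketchBuf,
sketch_try_lock, sketch_get_victim); it is not the real BufferDesc or
StrategyGetBuffer code, it uses C11 atomics in place of the s_lock
primitives, and it assumes the clock hand is already protected by the
caller.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical, simplified buffer header; NOT PostgreSQL's BufferDesc. */
typedef struct SketchBuf
{
    atomic_int locked;      /* 0 = free, 1 = held; stands in for the header spinlock */
    int        refcount;    /* pin count, protected by 'locked' */
    int        usage_count; /* clock-sweep counter, protected by 'locked' */
} SketchBuf;

/* Test-and-test-and-set: never spins, just reports success or failure. */
bool
sketch_try_lock(SketchBuf *buf)
{
    /* First test: a plain load, so a held lock costs no write traffic. */
    if (atomic_load_explicit(&buf->locked, memory_order_relaxed) != 0)
        return false;
    /* Second step: the actual hardware TAS, only when the lock looked free. */
    return atomic_exchange_explicit(&buf->locked, 1, memory_order_acquire) == 0;
}

void
sketch_unlock(SketchBuf *buf)
{
    atomic_store_explicit(&buf->locked, 0, memory_order_release);
}

/*
 * Advance the clock hand until a victim is found.  Returns the index of the
 * victim with its header lock still held, or -1 after max_probes attempts.
 * Assumption: the caller serializes access to *hand.
 */
int
sketch_get_victim(SketchBuf *bufs, int nbuffers, int *hand, int max_probes)
{
    assert(nbuffers > 0);

    for (int i = 0; i < max_probes; i++)
    {
        int        idx = *hand;
        SketchBuf *buf = &bufs[idx];

        *hand = (*hand + 1) % nbuffers;

        /* Contended header: treat it as busy and do not touch usage_count. */
        if (!sketch_try_lock(buf))
            continue;

        if (buf->refcount == 0)
        {
            if (buf->usage_count == 0)
                return idx;         /* victim found; caller unlocks */
            buf->usage_count--;
        }
        sketch_unlock(buf);
    }
    return -1;
}
```

a skipped buffer simply keeps its usage_count for another lap, which
matches the argument that contention on the header is itself evidence
the buffer is busy.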

you could in fact go further and dispense with all locking in front of
usage_count, on the premise that it's only advisory and not a real
refcount. then you only lock if/when it's time to select a candidate
buffer, and even then only after a non-locking test has passed.
this would of course require some amusing adjustments to various
logical checks (usage_count <= 0, heh).
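
a rough sketch of that second variant, again with made-up names
(AdvisoryBuf, advisory_get_victim) and C11 relaxed atomics standing in
for plain unlocked reads and writes. the assumptions are that a lost or
stale usage_count update is acceptable, that the clock hand is
protected by the caller, and that refcount and usage_count are
re-checked once the header lock is actually held.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical header for illustration; NOT PostgreSQL's BufferDesc. */
typedef struct AdvisoryBuf
{
    atomic_int locked;      /* 0 = free, 1 = held */
    atomic_int refcount;    /* pin count; peeked at without the lock */
    atomic_int usage_count; /* advisory only; updated without the lock */
} AdvisoryBuf;

/*
 * Returns the index of a victim with its header lock held, or -1 after
 * max_probes attempts.  Assumption: the caller serializes access to *hand.
 */
int
advisory_get_victim(AdvisoryBuf *bufs, int nbuffers, int *hand, int max_probes)
{
    assert(nbuffers > 0);

    for (int i = 0; i < max_probes; i++)
    {
        int          idx = *hand;
        AdvisoryBuf *buf = &bufs[idx];
        int          usage;

        *hand = (*hand + 1) % nbuffers;

        /* Unlocked peek: pinned buffers are skipped outright. */
        if (atomic_load_explicit(&buf->refcount, memory_order_relaxed) != 0)
            continue;

        usage = atomic_load_explicit(&buf->usage_count, memory_order_relaxed);
        if (usage > 0)
        {
            /*
             * Unlocked decrement via separate load and store.  A concurrent
             * bump can be overwritten with a stale value; that is the
             * "wrong answer, so what?" trade-off.
             */
            atomic_store_explicit(&buf->usage_count, usage - 1,
                                  memory_order_relaxed);
            continue;
        }

        /* Looks evictable.  Non-locking test first, then the hardware TAS. */
        if (atomic_load_explicit(&buf->locked, memory_order_relaxed) != 0)
            continue;
        if (atomic_exchange_explicit(&buf->locked, 1, memory_order_acquire) != 0)
            continue;

        /* Re-check under the lock; "<= 0" tolerates a count that went stale. */
        if (atomic_load_explicit(&buf->refcount, memory_order_relaxed) == 0 &&
            atomic_load_explicit(&buf->usage_count, memory_order_relaxed) <= 0)
            return idx;             /* victim found; caller unlocks */

        atomic_store_explicit(&buf->locked, 0, memory_order_release);
    }
    return -1;
}
```

the only locked operation left in the loop is the one taken on a
buffer that already looks like a victim.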

merlin
