
Re: Page replacement algorithm in buffer cache

From:	Merlin Moncure <mmoncure(at)gmail(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Greg Stark <stark(at)mit(dot)edu>, Ants Aasma <ants(at)cybertec(dot)at>, Amit Kapila <amit(dot)kapila(at)huawei(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Page replacement algorithm in buffer cache
Date:	2013-03-22 20:22:16
Message-ID:	CAHyXU0wKmB5WXS+hAt2QmGfpTNdsz4Dx1nR3GCKvsNBOzt2ZbQ@mail.gmail.com
Lists:	pgsql-hackers

On Fri, Mar 22, 2013 at 3:16 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Merlin Moncure <mmoncure(at)gmail(dot)com> writes:
>> I think there is some very low hanging optimization fruit in the clock
>> sweep loop. first and foremost, I see no good reason why when
>> scanning pages we have to spin and wait on a buffer in order to
>> pedantically adjust usage_count. some simple refactoring there could
>> set it up so that a simple TAS (or even a TTAS with the first test in
>> front of the cache line lock as is done automatically in x86 IIRC)
>> could guard the buffer and, in the event of any lock detected, simply
>> move on to the next candidate without messing around with that buffer
>> at all. This could be construed as a 'trylock' variant of a spinlock
>> and might help out with cases where an especially hot buffer is
>> locking up the sweep. This is exploiting the fact that from
>> StrategyGetBuffer we don't need a *particular* buffer, just *a*
>> buffer.
>
> Hm. You could argue in fact that if there's contention for the buffer
> header, that's proof that it's busy and shouldn't have its usage count
> decremented. So this seems okay from a logical standpoint.
>
> However, I'm not real sure that it's possible to do a conditional
> spinlock acquire that doesn't create just as much hardware-level
> contention as a full acquire (ie, TAS is about as bad whether it
> gets the lock or not). So the actual benefit is a bit less clear.

well, if you do a non-locking test first you could at least avoid the
TAS in some cases (and, if you get the answer wrong, so what?) by
jumping to the next buffer immediately. only if the non-locking test
comes back good do you do a hardware TAS.
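
A minimal sketch of that test-then-TAS 'trylock' in C11 atomics, for
illustration only. SketchBuffer, sketch_try_lock, sketch_unlock and
sketch_sweep are hypothetical names, not PostgreSQL's BufferDesc or
StrategyGetBuffer, and pin counts are left out to keep it short.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a buffer header (assumption, not BufferDesc). */
typedef struct SketchBuffer {
    atomic_int locked;      /* 0 = free, 1 = held */
    int        usage_count; /* protected by 'locked' in this sketch */
} SketchBuffer;

/* TTAS-style trylock: plain load first, atomic exchange only if it looks free. */
static bool sketch_try_lock(SketchBuffer *buf)
{
    /* Non-locking test: a wrong answer only costs us a skipped buffer. */
    if (atomic_load_explicit(&buf->locked, memory_order_relaxed) != 0)
        return false;
    /* Looks free, so now pay for the hardware TAS. */
    return atomic_exchange_explicit(&buf->locked, 1, memory_order_acquire) == 0;
}

static void sketch_unlock(SketchBuffer *buf)
{
    atomic_store_explicit(&buf->locked, 0, memory_order_release);
}

/*
 * Clock sweep that never spins: a contended buffer is skipped with its
 * usage_count untouched. Returns the index of a victim (still locked by
 * the caller) or -1 if none was found within max_visits steps.
 */
static int sketch_sweep(SketchBuffer *bufs, int nbufs, int *hand, int max_visits)
{
    assert(nbufs > 0);
    for (int i = 0; i < max_visits; i++) {
        int idx = *hand;
        SketchBuffer *buf = &bufs[idx];

        *hand = (idx + 1) % nbufs;
        if (!sketch_try_lock(buf))
            continue;               /* busy: move on to the next candidate */
        if (buf->usage_count == 0)
            return idx;             /* victim found; lock is still held */
        buf->usage_count--;
        sketch_unlock(buf);
    }
    return -1;
}
```

whether the plain load really saves anything at the hardware level is
exactly the open question from upthread; the sketch only shows the
control flow.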

you could in fact go further and dispense with all locking in front of
usage_count, on the premise that it's only advisory and not a real
refcount. so you lock only if/when it's time to select a candidate
buffer, and even then only after a non-locking test first. this would
of course require some amusing adjustments to various logical checks
(usage_count <= 0, heh).
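
one way to read that, as a rough C11 sketch under stated assumptions:
AdvisoryBuffer, pin_count and advisory_sweep are made-up names, and
the decrement is deliberately a separate relaxed load and store (no
locked read-modify-write), so concurrent sweepers can lose updates.
the '<= 0' checks are there because nothing guarantees the counter
stays exactly at zero once it's unguarded.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical buffer header; usage_count is advisory and unguarded. */
typedef struct AdvisoryBuffer {
    atomic_int locked;      /* 0 = free, 1 = held */
    atomic_int usage_count; /* advisory only; races are tolerated */
    int        pin_count;   /* real refcount, protected by 'locked' */
} AdvisoryBuffer;

/*
 * Sweep that touches usage_count without any lock and only locks a
 * buffer once it looks like a candidate. Returns the victim's index
 * (still locked by the caller) or -1 after max_visits steps.
 */
static int advisory_sweep(AdvisoryBuffer *bufs, int nbufs, int *hand, int max_visits)
{
    assert(nbufs > 0);
    for (int i = 0; i < max_visits; i++) {
        int idx = *hand;
        AdvisoryBuffer *buf = &bufs[idx];

        *hand = (idx + 1) % nbufs;

        int usage = atomic_load_explicit(&buf->usage_count, memory_order_relaxed);
        if (usage > 0) {
            /* Racy decrement: a lost update just delays eviction a bit. */
            atomic_store_explicit(&buf->usage_count, usage - 1, memory_order_relaxed);
            continue;
        }

        /* Candidate: non-locking test first, hardware TAS only if it looks free. */
        if (atomic_load_explicit(&buf->locked, memory_order_relaxed) != 0)
            continue;
        if (atomic_exchange_explicit(&buf->locked, 1, memory_order_acquire) != 0)
            continue;

        /* Recheck under the lock, since the unlocked reads may be stale. */
        if (buf->pin_count == 0 &&
            atomic_load_explicit(&buf->usage_count, memory_order_relaxed) <= 0)
            return idx;

        atomic_store_explicit(&buf->locked, 0, memory_order_release);
    }
    return -1;
}
```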

merlin

In response to

Re: Page replacement algorithm in buffer cache at 2013-03-22 20:16:18 from Tom Lane

Responses

Re: Page replacement algorithm in buffer cache at 2013-03-23 00:27:38 from Ants Aasma
