Re: Page replacement algorithm in buffer cache

From: Atri Sharma <atri(dot)jiit(at)gmail(dot)com>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Jim Nasby <jim(at)nasby(dot)net>, Andres Freund <andres(at)2ndquadrant(dot)com>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Ants Aasma <ants(at)cybertec(dot)at>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Stark <stark(at)mit(dot)edu>, Amit Kapila <amit(dot)kapila(at)huawei(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Page replacement algorithm in buffer cache
Date: 2013-04-02 18:45:56
Message-ID: 1045C03F-75B0-4A56-81B5-91D218DE4B98@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Sent from my iPad

On 02-Apr-2013, at 23:41, Merlin Moncure <mmoncure(at)gmail(dot)com> wrote:

> On Tue, Apr 2, 2013 at 12:50 PM, Atri Sharma <atri(dot)jiit(at)gmail(dot)com> wrote:
>> On Tue, Apr 2, 2013 at 9:24 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>>
>>> One thought I had for fiddling with usage_count is to make it grow
>>> additively (x = x + 1) and decay exponentially (x = x >> 1). I'm not
>>> sure the idea is any good, but one problem with the current system is
>>> that it's pretty trivial for a buffer to accumulate five touches, and
>>> after that we lose all memory of what the frequency of access is, so a
>>> pages of varying different levels of "hotness" can all have usage
>>> count 5. This might allow a little more refinement without letting
>>> the time to degrade the usage count get out of control.
>>
>> This is just off the top of my head, but one possible solution could
>> be to quantize the levels of hotness. Specifically, we could
>> categorize buffers based on hotness. All buffers start in level 1 and
>> usage_count 0. Whichever buffer reaches usage_count of 5, and next
>> clock sweep which wants to increment its usage_count(hence taking it
>> above 5) sees that, it promotes the buffer to the next level, and
>> resets its usage_count to 0. Same logic applies for each level. When
>> we decrement usage_count and see that it is zero(for some buffer), if
>> it is in a level > 1, we demote the buffer to the next lower level. If
>> the buffer is in level 1, it is a potential candidate for replacement.
>>
>> This will allow us to have a loose idea about the hotness of a page,
>> without actually storing the usage_count for a buffer. We can still
>> update usage_count without locking, as buffers in high contention
>> which miss an update in their usage_count wont be affected by that
>> missed update, in accordance with all the discussion upthread.
>
> how is that different from usage_count itself? usage_count *is* a
> measure of hotness. the arbitrary cap at 5 is paranoia to prevent the
> already considerable damage that occurs in the situation Andres is
> talking about (where everyhing is marked 'hot' so you have to sweep a
> lot more).
>
> also, any added complexity in terms of manipulating usage_count is a
> move away from the lockless maintenance I'm proposing. maybe my idea
> is a non-starter on that basis alone, but the mechanic should be kept
> as simple as possible. the idea to move it to the bgwriter is to
> pre-emptively do the work that backends are now doing: try and keep
> ahead of the allocations being done so that buffer requests are
> satisfied quickly.
>

I agree, we want to reduce the complexity of usage_count.I was only musing on the point Robert raised, if we want to continue using usage_count and refine it for more accurate tracking of hotness of a buffer.

Regards,

Atri

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Florian Pflug 2013-04-02 18:46:00 Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)
Previous Message Kevin Grittner 2013-04-02 18:40:19 Re: Drastic performance loss in assert-enabled build in HEAD