Re: Clock sweep not caching enough B-Tree leaf pages?

From: Atri Sharma <atri(dot)jiit(at)gmail(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, Greg Stark <stark(at)mit(dot)edu>, Ants Aasma <ants(at)cybertec(dot)at>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Clock sweep not caching enough B-Tree leaf pages?
Date: 2014-04-19 17:45:59
Message-ID: CAOeZVif3GD1M6vbidKezTZxu6kwbUG-Y0gQTJcvadRvdsCCeUA@mail.gmail.com
Lists: pgsql-hackers

On Sat, Apr 19, 2014 at 3:37 AM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:

> > One thing that I discussed with Merlin offline and am now concerned about is
> > how will the actual eviction work. We cannot traverse the entire list and then
> > find all the buffers with refcount 0 and then do another traversal to find the
> > oldest one.
>
> I thought if there was memory pressure the clock sweep would run and we
> wouldn't have everything at the max counter access value.
>
Hmm, I see your point.

With that in mind as well, I feel that the clocksweep counting/logical
clock system would be useful when deciding between multiple candidates for
eviction. At worst, it can serve to replace the gettimeofday() calls.
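
To make the logical clock idea concrete, here is a minimal sketch in Python. It is not PostgreSQL code: LogicalClock, pick_victim and the per-buffer "last used tick" are hypothetical names, and the assumption is that the counter advances once per clocksweep pass, so no gettimeofday() call is needed to order buffers by age.

```python
class LogicalClock:
    """Hypothetical monotonic counter, advanced once per clocksweep pass."""

    def __init__(self):
        self._ticks = 0

    def tick(self):
        # Called by the sweep instead of reading wall-clock time.
        self._ticks += 1
        return self._ticks

    def now(self):
        return self._ticks


def pick_victim(candidates, clock):
    """Choose among eviction candidates by logical age.

    candidates maps a buffer id to the tick at which the buffer was
    last used (an assumed bookkeeping field). The buffer with the
    largest age, now - last_used_tick, is returned.
    """
    if not candidates:
        return None
    return max(candidates, key=lambda buf_id: clock.now() - candidates[buf_id])
```

With three ticks elapsed and candidates {1: 3, 2: 1, 3: 2}, buffer 2 has the largest age and is the one chosen.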

One thing I have thought of, with ideas and inputs from Joshua Yanowski
offline, is that we can probably have a maxheap keyed on the logical
clock age of buffers. Each time clocksweep sees a buffer whose refcount has
become zero, it will push the buffer into the maxheap. This can be a new
representation of the freelist or a new additional data structure.
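
A rough sketch of that heap, in Python rather than C, purely for illustration. Buffer, EvictionHeap and sweep are hypothetical names, and last_used_tick is an assumed field holding the logical clock value at last use. Since age is "now minus last used tick", a maxheap on age is the same as a min-heap on last_used_tick, which is what heapq provides directly.

```python
import heapq
from dataclasses import dataclass


@dataclass
class Buffer:
    buf_id: int
    refcount: int
    last_used_tick: int  # assumed: logical clock value at last use


class EvictionHeap:
    """Unpinned buffers ordered so the logically oldest pops first."""

    def __init__(self):
        self._heap = []       # entries are (last_used_tick, buf_id)
        self._queued = set()  # buffer ids currently in the heap

    def push(self, buf):
        if buf.buf_id not in self._queued:
            heapq.heappush(self._heap, (buf.last_used_tick, buf.buf_id))
            self._queued.add(buf.buf_id)

    def pop_oldest(self, buffers):
        # Skip stale entries: the buffer may have been pinned or used
        # again after the sweep pushed it.
        while self._heap:
            tick, buf_id = heapq.heappop(self._heap)
            self._queued.discard(buf_id)
            buf = buffers[buf_id]
            if buf.refcount == 0 and buf.last_used_tick == tick:
                return buf_id
        return None


def sweep(buffers, heap):
    """One clocksweep pass: queue every buffer whose refcount is zero."""
    for buf in buffers.values():
        if buf.refcount == 0:
            heap.push(buf)
```

Note that sweep() in this sketch still visits every buffer, and that popping costs O(log n) even though peeking at the oldest entry is O(1).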

This still does not solve the problem of the clocksweep having to see the
entire list, even if the maxheap makes finding the eviction candidate O(1).

I am working on a PoC patch but am stuck on this point. My current approach
scans the entire shared buffers list to search for candidate buffers.

Another pain point here is the concurrency and locking overhead of
introducing a new data structure. Can the existing buffer header spinlock
handle this, or does it stretch the spinlock beyond the granularity it is
meant for?

I still see some blockers for this idea. Nevertheless, using clocksweep
counts as logical clocks seems promising, at least intuitively.

Thoughts and comments?

Regards,

Atri

*l'apprenant*
