Re: 2nd Level Buffer Cache

From: Jim Nasby <jim(at)nasby(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, rsmogura <rsmogura(at)softperience(dot)eu>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: 2nd Level Buffer Cache
Date: 2011-03-23 17:53:00
Message-ID: 26A0B7FC-369E-41D9-857A-84969A2C8998@nasby.net
Lists: pgsql-hackers

On Mar 22, 2011, at 2:53 PM, Robert Haas wrote:
> On Tue, Mar 22, 2011 at 11:24 AM, Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:
>> On Fri, Mar 18, 2011 at 9:19 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>>> On Fri, Mar 18, 2011 at 11:14 AM, Kevin Grittner
>>> <Kevin(dot)Grittner(at)wicourts(dot)gov> wrote:
>>>> Maybe the thing to focus on first is the oft-discussed "benchmark
>>>> farm" (similar to the "build farm"), with a good mix of loads, so
>>>> that the impact of changes can be better tracked for multiple
>>>> workloads on a variety of platforms and configurations. Without
>>>> something like that it is very hard to justify the added complexity
>>>> of an idea like this in terms of the performance benefit gained.
>>>
>>> A related area that could use some looking at is why performance tops
>>> out at shared_buffers ~8GB and starts to fall thereafter.
>>
>> Under what circumstances does this happen? Can a simple pgbench -S
>> with a large scaling factor elicit this behavior?
>
> To be honest, I'm mostly just reporting what I've heard Greg Smith say
> on this topic. I don't have any machine with that kind of RAM.

When we started using 192G servers we tried switching our largest OLTP database (would have been about 1.2TB at the time) from 8GB shared buffers to 28GB. Performance went down enough to notice; I don't have any solid metrics, but I'd ballpark it at 10-15%.

One thing I've always wondered about is the logic of having backends run the clock sweep themselves in the normal case. OSes that use clock-sweep page replacement have a dedicated process that runs the clock in the background, with the intent of keeping a certain number of pages on the free list. We already have most of the mechanisms needed to do that; we just don't have the dedicated process. I believe the bgwriter was intended to fill that role, but in practice I don't think it manages to keep much of anything on the free list. Once we have a performance testing environment, I'd be interested to test a modified version that includes a dedicated background clock-sweep process that strives to keep a certain number of buffers on the free list.
--
Jim C. Nasby, Database Architect jim(at)nasby(dot)net
512.569.9461 (cell) http://jim.nasby.net
