Re: Load distributed checkpoint

From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Jim C(dot) Nasby" <jim(at)nasby(dot)net>
Cc: "ITAGAKI Takahiro" <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Load distributed checkpoint
Date: 2006-12-08 15:26:27
Message-ID: 45792FC3.EE98.0025.0@wicourts.gov
Lists: pgsql-hackers pgsql-patches

>>> On Fri, Dec 8, 2006 at 1:13 AM, in message
<20061208071305(dot)GG44124(at)nasby(dot)net>,
"Jim C. Nasby" <jim(at)nasby(dot)net> wrote:
> On Thu, Dec 07, 2006 at 10:03:05AM -0600, Kevin Grittner wrote:
>> We adjusted the background writer configuration
>> and nearly eliminated the problem.
>>
>> bgwriter_all_maxpages | 600
>> bgwriter_all_percent | 10
>> bgwriter_delay | 200
>> bgwriter_lru_maxpages | 200
>> bgwriter_lru_percent | 20
>
> Bear in mind that bgwriter settings should be considered in conjunction
> with shared_buffer and checkpoint_timeout settings. For example, if you
> have 60,000 shared buffers and a 300 second checkpoint interval, those
> settings are going to be pretty aggressive.
>
> Generally, I try and configure the all* settings so that you'll get 1
> clock-sweep per checkpoint_timeout. It's worked pretty well, but I don't
> have any actual tests to back that methodology up.

We have 20,000 shared buffers and a 300 second checkpoint interval.
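
To put the "one clock-sweep per checkpoint_timeout" rule of thumb in
numbers, here is a rough sketch in Python. It assumes that
bgwriter_all_percent is the share of the buffer pool the "all" scan
examines in each round, which is how I read the documentation, and it
ignores the bgwriter_all_maxpages cap, which can end a round early when
many buffers are dirty. The function names are made up for illustration.

```python
def seconds_per_full_sweep(all_percent, delay_ms):
    """Time for the 'all' scan to cover the whole buffer pool once.

    Assumes all_percent of the pool is examined every delay_ms.
    """
    rounds_per_sweep = 100.0 / all_percent
    return rounds_per_sweep * delay_ms / 1000.0


def percent_for_one_sweep(checkpoint_timeout_s, delay_ms):
    """bgwriter_all_percent that gives one sweep per checkpoint interval."""
    rounds_per_checkpoint = checkpoint_timeout_s * 1000.0 / delay_ms
    return 100.0 / rounds_per_checkpoint


# Settings from this thread: 10 percent per round, 200 ms delay.
current_sweep_s = seconds_per_full_sweep(10, 200)   # 2.0 seconds
# One sweep per 300 second checkpoint interval would need about 0.067.
rule_of_thumb_percent = percent_for_one_sweep(300, 200)

print(current_sweep_s, rule_of_thumb_percent)
```

By that arithmetic our settings sweep the pool roughly 150 times per
checkpoint interval when the write cap is not hit, so they are well on
the aggressive side of that rule.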

We got to these numbers somewhat scientifically. I studied I/O
patterns under production load and figured we should be able to handle
about 800 writes per 200 ms without causing problems. I have to
admit that I based the percentages and the ratio between "all" and "lru"
on gut feel after musing over the documentation.
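
For what it's worth, the 800 figure is just the two maxpages settings
added together. A small sketch of that budget follows; the 8 kB page
size is an assumption (the default block size), and the helper name is
hypothetical.

```python
BLOCK_SIZE_BYTES = 8192  # assumption: default 8 kB block size


def write_budget(all_maxpages, lru_maxpages, delay_ms):
    """Upper bound on background writer output for the given settings.

    Returns (pages per round, pages per second, MB per second).
    """
    pages_per_round = all_maxpages + lru_maxpages
    pages_per_second = pages_per_round * 1000.0 / delay_ms
    mb_per_second = pages_per_second * BLOCK_SIZE_BYTES / (1024 * 1024)
    return pages_per_round, pages_per_second, mb_per_second


# bgwriter_all_maxpages = 600, bgwriter_lru_maxpages = 200, delay = 200 ms
per_round, per_second, mb_s = write_budget(600, 200, 200)
print(per_round, per_second, mb_s)  # 800 4000.0 31.25
```

These are ceilings, not typical rates: the writer only reaches them in
rounds where it finds that many dirty buffers.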

Since my values were such a dramatic change from the default, I boosted
the production settings a little bit each day and looked for feedback
from our web team. Things improved with each incremental increase.
When I got to my calculated values (above) they reported that these
timeouts had dropped to an acceptable level -- a few per day on a
website with 2 million hits per day. We may benefit from further
adjustments, but since the problem is negligible with these settings,
there are bigger fish to fry at the moment.

By the way, if I remember correctly, these boxes have 256 MB of
battery-backed cache, while 20,000 buffers is 156.25 MB.
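
The buffer figure comes from simple arithmetic, sketched here assuming
the default 8 kB block size (the constant and function names are mine):

```python
BLOCK_SIZE_BYTES = 8192  # assumption: default 8 kB block size


def shared_buffers_mb(n_buffers, block_size=BLOCK_SIZE_BYTES):
    """Size of the shared buffer pool in MB (1 MB = 1024 * 1024 bytes)."""
    return n_buffers * block_size / (1024 * 1024)


pool_mb = shared_buffers_mb(20000)   # 156.25
controller_cache_mb = 256            # as remembered above
print(pool_mb, pool_mb <= controller_cache_mb)  # 156.25 True
```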

-Kevin
