Re: Just-in-time Background Writer Patch+Test Results

From: Greg Smith <gsmith(at)gregsmith(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Just-in-time Background Writer Patch+Test Results
Date: 2007-09-18 04:37:47
Message-ID: Pine.GSO.4.64.0709172352050.4502@westnet.com
Lists: pgsql-hackers

On Sat, 8 Sep 2007, Greg Smith wrote:

> Here's the results I got when I pushed the time down significantly from the
> defaults
>
>                      info                      | set | tps  | cleaner_pct
> -----------------------------------------------+-----+------+-------------
>  jit multiplier=1.0 scan_whole=120s delay=20ms |  20 |  956 |       92.34
>  jit multiplier=2.0 scan_whole=120s delay=20ms |  21 |  967 |       99.94
>  jit multiplier=1.5 scan_whole=120s delay=10ms |  22 |  944 |       97.91
>  jit multiplier=2.0 scan_whole=120s delay=10ms |  23 |  981 |        99.7
>
> It seems I have to push the multiplier higher to get good results when using
> a much lower interval

Since I'm not exactly overwhelmed processing field reports, I've continued
this line of investigation myself. Increasing the multiplier to 3.0 got
me another nine (99.7% to 99.95%) on the share of buffers written by the
LRU BGW without a significant change in performance:

                     info                      | set | tps  | cleaner_pct
-----------------------------------------------+-----+------+-------------
 jit multiplier=3.0 scan_whole=120s delay=10ms |  24 |  967 |       99.95

After thinking for a bit about why the 10ms case wasn't working so well
without a big multiplier, I considered that with the default moving average
smoothing, the sample window at such a short interval (10ms * 16 = 160ms)
is unlikely to cover a typical pause that one might want to smooth over.
My initial thinking was to lengthen the smoothing period so that it covers
a span similar to the default case even when the interval goes down, but
that didn't really improve anything. In the results below, the
smoothing=16 case is the default setup with just the delay at 10ms, a data
point missing from the results above, where I only tested 10ms with larger
multipliers:

                     info                     | set | tps  | cleaner_pct
----------------------------------------------+-----+------+-------------
 jit multiplier=1.0 delay=10ms smoothing=16   |  27 |  982 |        89.4
 jit multiplier=1.0 delay=10ms smoothing=64   |  26 |  946 |       89.55
 jit multiplier=1.0 delay=10ms smoothing=320  |  25 |  970 |       89.53
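
To make the window arithmetic concrete, here is a minimal sketch. It is not
code from the patch, and the function names are made up for illustration.
It assumes the smoothing window is simply the delay multiplied by the
number of samples, which is how the 160ms figure above was computed:

```c
#include <assert.h>

/* Wall-clock span covered by the moving average, in milliseconds.
 * Hypothetical helper; assumes window = delay * sample count. */
int smoothing_window_ms(int delay_ms, int smoothing_samples)
{
    assert(delay_ms > 0 && smoothing_samples > 0);
    return delay_ms * smoothing_samples;
}

/* Number of samples needed to cover target_window_ms at a given delay.
 * Hypothetical helper; integer division drops any remainder. */
int samples_for_window(int target_window_ms, int delay_ms)
{
    assert(target_window_ms > 0 && delay_ms > 0);
    return target_window_ms / delay_ms;
}
```

With a 200ms delay, 16 samples span 3200ms. Covering the same span at 10ms
takes 320 samples, which matches the largest smoothing value in the table.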

What I realized is that, because the number of buffers is rounded to an
integer, dividing the activity from a very short period by the smoothing
constant usually left the smoothing adjustment at 0, so it wasn't doing
much. This made me wonder how much the weighted average
smoothing was really doing in the default case. I put that code in months
ago and I hadn't looked recently at its effectiveness. Here's a
comparison:

                     info                     | set | tps  | cleaner_pct
----------------------------------------------+-----+------+-------------
 jit multiplier=1.0 delay=200ms smoothing=16  |  18 |  970 |       99.99
 jit multiplier=1.0 delay=200ms smoothing=off |  28 |  957 |       97.16
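
The rounding effect described before this table can be illustrated with a
small sketch. This is not the patch's code: it assumes an integer running
average updated as avg += (sample - avg) / N, and all names here are
hypothetical. When only a few buffers are allocated per interval, the
integer division truncates every adjustment to 0:

```c
#include <assert.h>

/* One step of an integer weighted moving average (assumed form). */
int smooth_int(int smoothed, int sample, int smoothing_samples)
{
    assert(smoothing_samples > 0);
    /* Integer division truncates toward zero, so any difference
     * smaller than smoothing_samples contributes nothing. */
    return smoothed + (sample - smoothed) / smoothing_samples;
}

/* The same step in floating point, where small samples still count. */
double smooth_float(double smoothed, double sample, int smoothing_samples)
{
    assert(smoothing_samples > 0);
    return smoothed + (sample - smoothed) / smoothing_samples;
}

/* Feed the same sample repeatedly, starting from 0, and return the
 * integer average after the given number of rounds. */
int settle_int(int sample, int smoothing_samples, int rounds)
{
    int smoothed = 0;

    for (int i = 0; i < rounds; i++)
        smoothed = smooth_int(smoothed, sample, smoothing_samples);
    return smoothed;
}
```

For example, with 16 samples and a steady 5 buffers per interval, the
integer version never leaves 0 no matter how many rounds it runs, while
the floating-point version moves toward 5 from the first step.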

All this data supports my suggestion that the exact value of the smoothing
period constant isn't really a critical one. It appears moderately
helpful to have that logic on in some cases and the default value doesn't
seem to hurt the cases where I'd expect it to be the least effective.
Tuning the multiplier is much more powerful and useful than ever touching
this constant. I could probably even pull the smoothing logic out
altogether, at the cost of increasing the burden of correctly tuning the
multiplier on the administrator. So far it looks like it's reasonable
instead to leave it as an untunable to help the default configuration, and
I'll just add a documentation note that if you decrease the interval
you'll probably have to increase the multiplier.

Having gone through this, I now have more useful baselines for doing a
similar sensitivity analysis of the other item that's untunable in the
current patch:

float scan_whole_pool_seconds = 120.0;

But I'll be travelling for the next week and won't have time to look into
that myself until I get back.
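
For readers unfamiliar with that constant, here is a rough sketch of what
it would imply, assuming (as the name suggests, though this message doesn't
spell it out) that it sets the time in which the background writer should
cover the entire buffer pool at minimum. The function and parameter names
are hypothetical, not taken from the patch:

```c
#include <assert.h>

/* Minimum buffers to scan per round so that the whole pool is covered
 * in scan_whole_pool_seconds. Assumed interpretation of the constant;
 * nbuffers and delay_ms are illustrative parameters. */
int min_scan_buffers(int nbuffers, double scan_whole_pool_seconds,
                     int delay_ms)
{
    double rounds;

    assert(nbuffers > 0 && scan_whole_pool_seconds > 0.0 && delay_ms > 0);

    /* Number of background writer rounds that fit in the scan period. */
    rounds = scan_whole_pool_seconds * 1000.0 / delay_ms;

    /* Truncates, so very small pools can yield 0 buffers per round. */
    return (int) (nbuffers / rounds);
}
```

Under that assumption, a pool of 60000 buffers scanned over 120 seconds
needs at least 100 buffers per round at a 200ms delay, but only 5 per
round at 10ms, so the per-round minimum shrinks as the interval goes down.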

--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD
