Re: Clock sweep not caching enough B-Tree leaf pages?

From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Cc: Jim Nasby <jim(at)nasby(dot)net>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Clock sweep not caching enough B-Tree leaf pages?
Date: 2014-04-24 22:26:17
Message-ID: CAM3SWZRxvNqE772bncSkeOJLO9GZ2zEv8LoR-ExKXKHU6h1YTg@mail.gmail.com
Lists: pgsql-hackers

On Mon, Apr 21, 2014 at 11:57 PM, Peter Geoghegan <pg(at)heroku(dot)com> wrote:
> Here is a benchmark that is similar to my earlier one, but with a rate
> limit of 125 tps, to help us better characterize how the prototype
> patch helps performance:
>
> http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/3-sec-delay-limit/

I've added some more test sets to this result report, again with a 125
TPS limit, but on this occasion with a pgbench Gaussian access
distribution. I used v13 of the recently proposed Gaussian
distribution pgbench patch [1] to accomplish this, including the
Gaussian variant of TPC-B that the patch has baked in. The
distribution threshold used was consistently 5, causing the patched
pgbench to report for each test:

transaction type: Custom query
scaling factor: 5000
standard deviation threshold: 5.00000
access probability of top 20%, 10% and 5% records: 0.68269 0.38293 0.19741
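The access probabilities pgbench reports follow directly from the standard normal CDF: with threshold t, the hottest fraction x of records spans t*x standard deviations around the mean, so its access probability is P(|Z| < t*x) = erf(t*x / sqrt(2)). A minimal sketch that reproduces the figures above (the function name is mine, not pgbench's):

```python
import math

def gaussian_access_probability(threshold: float, top_fraction: float) -> float:
    """Probability that a Gaussian-distributed pgbench access lands in the
    hottest `top_fraction` of records, given the distribution threshold.

    The hottest fraction x of rows spans threshold*x standard deviations
    around the mean, so the probability is P(|Z| < threshold*x) for a
    standard normal Z, i.e. erf(threshold*x / sqrt(2)).
    """
    return math.erf(threshold * top_fraction / math.sqrt(2.0))

# With threshold 5, this closely matches pgbench's reported
# 0.68269 / 0.38293 / 0.19741 for the top 20% / 10% / 5% of records.
for frac in (0.20, 0.10, 0.05):
    print(f"top {frac:.0%}: {gaussian_access_probability(5.0, frac):.5f}")
```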

It looks like the patch continues to have much lower latency than
master for this somewhat distinct workload. In fact, even though the
background writer works somewhat harder than in the uniform
distribution case, average latency with the patch applied is
appreciably lower. Total buffers allocated are just as consistent as
before with the patch, but the number is markedly lower than in the
prior uniform distribution case. Dirty memory graphs for the patch
start off similar to the uniform case, but get a bit spikier towards
the end of each test run. They are still *markedly* better than
master's for either distribution type: master remains really
aggressive at times, and at other times not nearly aggressive enough,
in much the same way as before.

In general, with the Gaussian distribution, average latency is lower,
but worst-case latency is higher. The patch maintains its clear lead
for the average case, albeit a smaller lead than with the uniform
distribution, and for the worst case it does much better in relative
terms. Absolute worst case (not worst case averaged across client
counts) is 1.4 seconds with the patch, versus 8.3 seconds with
master...and that terrible worst case happens *twice* with master.
For the uniform distribution, the same figure was 5.4 to 5.8 seconds
for master, and 0.6 seconds for the patch.

What is curious is that with master and the Gaussian distribution, I
see a distinct latency "no man's land" in multiple test runs, as in
this one, for example:
http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/3-sec-delay-limit/49/index.html
. It looks like there is a clear differentiation between going to disk
and not going to disk, or something like that. I don't see this in any
other case, and it is quite obviously a consistent and distinct
feature of master + Gaussian when the OS isn't aggressively writing
out a mountain of dirty memory. This is something that I personally
have never seen before.

I also note that master had three huge background writer spikes with
the Gaussian distribution, rather than the two large spikes and one
small one that it was (consistently) shown to produce with the uniform
distribution. What's more, 90th percentile latency is very consistent
across client counts for the new patched test runs, whereas with
master it is very much higher at higher client counts.

[1] http://www.postgresql.org/message-id/alpine.DEB.2.10.1404011107220.2557@sto
--
Peter Geoghegan
