Re: checkpoint patches

From: Jim Nasby <jim(at)nasby(dot)net>
To: Greg Smith <greg(at)2ndQuadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: checkpoint patches
Date: 2012-04-05 18:23:37
Message-ID: 4F7DE329.4040401@nasby.net
Lists: pgsql-hackers

On 4/3/12 11:30 PM, Greg Smith wrote:
> On 03/25/2012 04:29 PM, Jim Nasby wrote:
>> Another $0.02: I don't recall the community using pg_bench much at all to measure latency... I believe it's something fairly new. I point this out because I believe there are differences in analysis that you need to do for TPS vs latency. I think Robert's graphs support my argument; the numeric X-percentile data might not look terribly good, but reducing peak latency from 100ms to 60ms could be a really big deal on a lot of systems. My intuition is that one or both of these patches actually would be valuable in the real world; it would be a shame to throw them out because we're not sure how to performance test them...
>
> One of these patches is already valuable in the real world. There it will stay, while we continue mining it for nuggets of deeper insight into the problem that can lead into a better test case.
>
> Starting at pgbench latency worked out fairly well for some things. Last year around this time I published some results I summarized at http://blog.2ndquadrant.com/en/gregs-planetpostgresql/2011/02/ , which included things like worst-case latency going from <=34 seconds on ext3 to <=5 seconds on xfs.
>
> The problem I keep hitting now is that 2 to 5 second latencies on Linux are extremely hard to get rid of if you overwhelm storage--any storage. That's where the wall is, where if you try to drive them lower than that you pay some hard trade-off penalties, if it works at all.
>
> Take a look at the graph I've attached. That's a slow drive not able to keep up with lots of random writes stalling, right? No. It's a Fusion-io card that will do 600MB/s of random I/O. But clog it up with an endless stream of pgbench writes, never with any pause to catch up, and I can get Linux to clog it for many seconds whenever I set it loose.

If there's a fundamental flaw in how Linux deals with heavy writes, one that means you can't rely on certain latency windows, perhaps we should be looking at using a different OS to test those cases...
--
Jim C. Nasby, Database Architect jim(at)nasby(dot)net
512.569.9461 (cell) http://jim.nasby.net
