On Sat, Feb 4, 2012 at 2:13 PM, Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:
> We really need to nail that down. Could you post the scripts (on the
> wiki) you use for running the benchmark and making the graph? I'd
> like to see how much work it would be for me to change it to detect
> checkpoints and do something like color the markers blue during
> checkpoints and red elsewhen.
They're pretty crude - I've attached them here.
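For the coloring itself, something along these lines might be enough. This is only a sketch: it assumes the checkpoint start/end times have already been extracted somehow (for example from the server log with log_checkpoints on) and converted to seconds since the start of the run. The function name and both arguments are hypothetical and are not part of the attached scripts.

```python
from bisect import bisect_right


def marker_colors(sample_times, checkpoint_intervals):
    """Return 'blue' for samples taken during a checkpoint, 'red' otherwise.

    sample_times: numbers, e.g. seconds since the start of the run.
    checkpoint_intervals: (start, end) pairs, assumed not to overlap.
    """
    intervals = sorted(checkpoint_intervals)
    starts = [start for start, _ in intervals]
    colors = []
    for t in sample_times:
        # Index of the last checkpoint that started at or before t, or -1.
        i = bisect_right(starts, t) - 1
        in_checkpoint = i >= 0 and t <= intervals[i][1]
        colors.append("blue" if in_checkpoint else "red")
    return colors
```

The resulting list can be handed to whatever draws the markers, one color per sample.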
> Also, I'm not sure how bad that graph really is. The overall
> throughput is more variable, and there are a few latency spikes but
> they are few. The dominant feature is simply that the long-term
> average is less than the initial burst. Of course the goal is to have
> a high level of throughput with a smooth latency under sustained
> conditions. But to expect that that long-sustained smooth level of
> throughput be identical to the "initial burst throughput" sounds like
> more of a fantasy than a goal.
That's probably true, but the drop-off is currently quite extreme.
The fact that disabling full_page_writes causes throughput to increase
by >4x is dismaying, at least to me.
> If we want to accept the lowered
> throughput and work on what variability/spikes are there, I think
> a good approach would be to take the long term TPS average, and dial
> the number of clients back until the initial burst TPS matches that
> long term average. Then see if the spikes still exist over the long
> term using that dialed back number of clients.
Hmm, I might be able to do that.
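Roughly, the procedure could look like the sketch below. It assumes a per-second TPS series from one long run plus a short burst measurement at each candidate client count. All names here are hypothetical, and skipping the first burst_seconds when averaging is my assumption; pass 0 to average over the whole run.

```python
def long_term_tps(tps_per_second, burst_seconds):
    """Average TPS over the run, ignoring the first burst_seconds samples."""
    steady = tps_per_second[burst_seconds:]
    if not steady:
        raise ValueError("run is no longer than the burst window")
    return sum(steady) / len(steady)


def dialed_back_clients(burst_tps_by_clients, target_tps):
    """Pick the client count whose burst TPS is closest to target_tps.

    burst_tps_by_clients: dict mapping client count -> measured burst TPS.
    """
    return min(
        burst_tps_by_clients,
        key=lambda clients: abs(burst_tps_by_clients[clients] - target_tps),
    )
```

The long run would then be repeated at the chosen client count to see whether the spikes are still there.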
> I don't think the full-page-writes are leading to WALInsert
> contention, for example, because that would probably lead to smooth
> throughput decline, but not those latency spikes in which those entire
> seconds go by without transactions.
> I doubt it is leading to general
> IO compaction, as the IO at that point should be pretty much
> sequential (the checkpoint has not yet reached the sync stage, the WAL
> is sequential). So I bet that that is caused by fsyncs occurring at
> xlog segment switches, and the locking that that entails.
That's definitely possible.
> If I
> recall, we can have a segment which is completely written to OS and in
> the process of being fsynced, and we can have another segment which is
> in some state of partially in wal_buffers and partly written out to OS
> cache, but that we can't start reusing the wal_buffers that were
> already written to OS for that segment (and therefore are
> theoretically available for reuse by the upcoming 3rd segment) until
> the previous segment's fsync has completed. So all WALInserts freeze.
> Or something like that. This code has changed a bit since last time
> I studied it.
Yeah, I need to better-characterize where the pauses are coming from,
but I'm reluctant to invest too much effort in it until Heikki's xlog
scaling patch goes in, because I think that's going to change things
enough that any work done now will mostly be wasted.
It might be worth trying a run with wal_buffers=32MB or something like
that, just to see whether that mitigates any of the locking pile-ups.
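As a back-of-the-envelope check on why 32MB is an interesting value, assuming the default build-time sizes (8kB WAL blocks and 16MB WAL segments; a custom build may differ), the arithmetic works out as in this sketch. The helper name is hypothetical.

```python
def wal_buffer_capacity(wal_buffers_bytes,
                        block_size=8192,
                        segment_size=16 * 1024 * 1024):
    """Return (number of WAL pages, number of WAL segments) that fit.

    block_size and segment_size are the assumed default build-time values.
    """
    pages = wal_buffers_bytes // block_size
    segments = wal_buffers_bytes / segment_size
    return pages, segments


pages, segments = wal_buffer_capacity(32 * 1024 * 1024)
```

Under those assumptions, 32MB is 4096 pages, or two full segments of buffer space, so inserts into a new segment would have room to proceed while the previous segment is still being fsynced.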
Description: application/octet-stream (1.6 KB)
Description: application/octet-stream (1.1 KB)