Re: checkpointer continuous flushing

From: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: checkpointer continuous flushing
Date: 2016-03-17 21:14:47
Message-ID: alpine.DEB.2.10.1603172153220.28507@sto
Lists: pgsql-hackers


>> Is it possible to run tests with distinct table spaces on those many disks?
>
> Nope, that'd require reconfiguring the system (and then back), and I don't
> have access to that system (just SSH).

Ok.

> Also, I don't quite see what would that tell us?

Currently the flushing context is shared between table spaces, but I think
that it should be per table space. My tests did not manage to convince
Andres, so getting some more figures would be great. That will be for another
time!

>> I would have suggested using the --latency-limit option to filter out
>> very slow queries, otherwise if the system is stuck it may catch up
>> later, but then this is not representative of "sustainable" performance.
>>
>> When pgbench is running under a target rate, in both runs the
>> transaction distribution is expected to be the same, around 5000 tps,
>> and the green run looks pretty ok with respect to that. The magenta one
>> shows that about 25% of the time, things are not good at all, and the
>> higher figures just show the catching up, which is not really
>> interesting if you asked for a web page and it is finally delivered 1
>> minute later.
>
> Maybe. But that'd only increase the stress on the system, possibly causing
> more issues, no? And the magenta line is the old code, thus it would only
> increase the improvement of the new code.

Yes and no. I agree that it stresses the system a little more, but the
fact that you get 5000 tps in the end does not show that you can really
sustain 5000 tps with reasonable latency. I find this latter information
more interesting than knowing that you can get 5000 tps on average,
thanks to some catching up. Moreover, the non-throttled runs already showed
that the system could do 8000 tps, so the bandwidth is already there.

> Notice the max latency is in microseconds (as logged by pgbench), so
> according to the "max latency" charts the latencies are below 10 seconds
> (old) and 1 second (new) about 99% of the time.

AFAICS, the max latency is aggregated by second, but then it does not say
much about the distribution of individual latencies in the interval, that
is, whether they were all close to the max or not. Having the same chart
with the median or average might help. Also, with the stddev chart, the
percentages do not correspond with those of the latency chart, so it may be
that the latency is high but the stddev is low, i.e. all transactions are
equally bad over the interval, or not.

So I must admit that it is not clear to me at all how to interpret the max
latency & stddev charts you provided.
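
To illustrate the point, here is a minimal sketch (not the script used to
produce the charts) of a per-second aggregation that reports the median and
mean next to the max and stddev. The function name, the (second, latency)
input format and the sample values are all made up for the example; a real
run would first have to extract these pairs from the pgbench log.

```python
from collections import defaultdict
from statistics import mean, median, pstdev


def per_second_stats(samples):
    """Aggregate (epoch_second, latency_us) pairs into per-second statistics."""
    buckets = defaultdict(list)
    for second, latency_us in samples:
        buckets[second].append(latency_us)
    return {
        second: {
            "max": max(lats),
            "median": median(lats),
            "mean": mean(lats),
            "stddev": pstdev(lats),
        }
        for second, lats in sorted(buckets.items())
    }


# Made-up data: both seconds have the same max latency (1 s).
samples = (
    [(0, 1_000_000)] * 4                     # second 0: every transaction is slow
    + [(1, 2_000)] * 3 + [(1, 1_000_000)]    # second 1: a single slow outlier
)
stats = per_second_stats(samples)
for second, s in stats.items():
    print(second, s)
```

With this data both intervals look identical on a max chart, while the
median separates them, and the stddev is zero precisely in the interval
where all transactions are equally bad.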

> So I don't think this would make any measurable difference in practice.

I think that it may show that 25% of the time the system could not match
the target tps, even if it can handle much more on average, so the tps
achieved when discarding late transactions would be under 4000 tps.
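
As a rough sketch of that arithmetic, with hypothetical numbers rather than
figures from your runs: assume a 10 second run throttled at 5000 tps where
25% of the transactions are served very late, and count only those which
complete within a limit. The 100 ms limit, the latency values and the
function name are assumptions chosen for the example.

```python
def sustained_tps(latencies_ms, duration_s, limit_ms):
    """Rate of transactions that completed within limit_ms."""
    on_time = sum(1 for lat in latencies_ms if lat <= limit_ms)
    return on_time / duration_s


duration_s = 10
# Hypothetical run: 50000 transactions in 10 s, i.e. 5000 tps on average.
latencies_ms = [5.0] * 37_500 + [60_000.0] * 12_500  # 75% fast, 25% very late

average_tps = len(latencies_ms) / duration_s
on_time_tps = sustained_tps(latencies_ms, duration_s, limit_ms=100.0)
print(average_tps, on_time_tps)
```

The average still reports the 5000 tps target, while the rate of
transactions served within the limit is 3750 tps.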

--
Fabien.
