Thanks for the response.
Heikki Linnakangas wrote:
> Vladimir Stankovic wrote:
>> I'm running write-intensive, TPC-C-like tests. The workload consists
>> of 150 to 200 thousand transactions. The performance varies
>> dramatically, between 5 and more than 9 hours (I don't have the exact
>> figure for the longest experiment). Initially the server is
>> relatively fast. It finishes the first batch of 50k transactions in
>> an hour. This is probably due to the fact that the database is
>> RAM-resident during this interval. As soon as the database grows
>> bigger than the RAM the performance, not surprisingly, degrades,
>> because of the slow disks.
>> My problem is that the performance is rather variable, and to me
>> non-deterministic. A 150k test can finish in approx. 3h30mins but
>> conversely it can take more than 5h to complete.
>> Preferably I would like to see *steady-state* performance (where my
>> interpretation of the steady-state is that the average
>> throughput/response time does not change over time). Is the
>> steady-state achievable despite the MVCC and the inherent
>> non-determinism between experiments? What could be the reasons for
>> the variable performance?
> Steadiness is relative; you'll never achieve perfectly steady
> performance where every transaction takes exactly X milliseconds. That
> said, PostgreSQL is by nature not as steady as many other DBMSs,
> because of the need to vacuum. Another significant source of
> unsteadiness is checkpoints, though that is not as bad with fsync=off,
> as you're running.
What I am hoping to see is NOT the same value for every execution of the
same type of transaction (after some transient period). Instead, I'd
like to see that if I take an appropriately-sized set of transactions, I
get at least steady growth in the average transaction time, if not
exactly the same average. Each chunk could well include sudden
performance drops due to the necessary vacuums and checkpoints. The
performance might be influenced by changes in the data set too.
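The chunked comparison described above can be sketched as follows. This is a hypothetical client-side script, not part of any benchmark harness; the chunk size and tolerance are arbitrary illustrative choices:

```python
# Sketch: check for steady state by comparing per-chunk average latencies.
# 'latencies' would come from the client-side transaction log; chunk_size
# (5000 here) and tolerance (10%) are placeholder values for illustration.

def chunk_averages(latencies, chunk_size=5000):
    """Average transaction latency for each consecutive chunk."""
    return [
        sum(chunk) / len(chunk)
        for chunk in (latencies[i:i + chunk_size]
                      for i in range(0, len(latencies), chunk_size))
    ]

def is_steady(averages, tolerance=0.10):
    """Loose steady-state test: every chunk average lies within
    `tolerance` of the overall mean of the chunk averages."""
    mean = sum(averages) / len(averages)
    return all(abs(a - mean) / mean <= tolerance for a in averages)
```

A run would count as steady under this definition even if individual transactions vary widely, as long as the chunk-level averages stay close to one another.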
I am unhappy that the durations of the experiments can differ by as much
as 30% (bearing in mind that the runs are not exactly identical, due to
non-determinism on the client side). I would like to eliminate this
variability. Are my expectations reasonable? What could be the cause(s)
of this variability?
> I'd suggest using the vacuum_cost_delay to throttle vacuums so that
> they don't disturb other transactions as much. You might also want to
> set up manual vacuums for the bigger tables, instead of relying on
> autovacuum, because until the recent changes in CVS head, autovacuum
> can only vacuum one table at a time, and while it's vacuuming a big
> table, the smaller heavily-updated tables are neglected.
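A sketch of the settings being suggested, using 8.1-era parameter names; the table names below (`stock`, `order_line`) are placeholders for whichever of the larger TPC-C tables apply:

```
# postgresql.conf: throttle (auto)vacuum so it yields to foreground work
vacuum_cost_delay = 20             # ms to sleep each time the cost limit is hit
autovacuum_vacuum_cost_delay = 20  # same throttle for the autovacuum daemon
```

```sql
-- Cron-driven manual vacuums of the big tables, so the single autovacuum
-- worker stays free to keep up with the small, heavily-updated ones:
VACUUM ANALYZE stock;
VACUUM ANALYZE order_line;
```

With cost-based delay enabled, vacuums take longer but compete less with the benchmark transactions for I/O, which should help smooth out the throughput dips.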
>> The database server version is 8.1.5 running on Fedora Core 6.
> How about upgrading to 8.2? You might also want to experiment with CVS
> HEAD to get the autovacuum improvements, as well as a bunch of other
> performance improvements.
I will try these, but as I said, my primary goal is to have
steady/'predictable' performance, not necessarily to obtain the fastest
runs.
Vladimir Stankovic T: +44 20 7040 0273
Research Student/Research Assistant F: +44 20 7040 8585
Centre for Software Reliability E: V.Stankovic@city.ac.uk
Northampton Square, London EC1V 0HB