Re: osdl-dbt3 run results - puzzled by the execution

From: Jenny Zhang <jenny(at)osdl(dot)org>
To: Greg Stark <gsstark(at)mit(dot)edu>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, perf-pgsql <pgsql-performance(at)postgresql(dot)org>
Subject: Re: osdl-dbt3 run results - puzzled by the execution
Date: 2003-09-19 23:26:54
Message-ID: 1064014014.442.189.camel@ibm-a.pdx.osdl.net
Lists: pgsql-hackers pgsql-performance

On Fri, 2003-09-19 at 06:12, Greg Stark wrote:
> Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
>
> > I think this is a pipe dream. Variation in where the data gets laid
> > down on your disk drive would alone create more than that kind of delta.
> > I'm frankly amazed you could get repeatability within 2-3%.
>
> I think the reason he gets good repeatability is because he's talking about
> the aggregate results for a whole test run. Not individual queries. In theory
> you could just run the whole test multiple times. The more times you run it
> the lower the variation in the total run time would be.
>
That is right. The repeatability comes from looking at the aggregate
results for a whole test run. For individual queries, the power test
(single stream) is very consistent. In the throughput test (multiple
streams), the execution time of any given query varies by up to 15% as
long as no swapping occurs. If we set sort_mem too high and swapping
occurs, the variation is larger.
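
As a rough illustration of why an aggregate can be steadier than its
parts, here is a minimal sketch. It assumes the per-query times are
statistically independent, which this thread does not establish, and the
numbers are made up for the example (they are not dbt3 results). The
function name is hypothetical.

```python
import math

def relative_stddev_of_total(means, stddevs):
    """Relative stddev (stddev / mean) of the sum of per-query times.

    Assumes the per-query times are independent, so their variances add.
    """
    total_mean = sum(means)
    total_sd = math.sqrt(sum(s * s for s in stddevs))
    return total_sd / total_mean

# Made-up example: 16 queries, each averaging 100 s with a 15 s stddev (15%).
rel = relative_stddev_of_total([100.0] * 16, [15.0] * 16)
print(f"relative stddev of the total: {rel:.4f}")
```

Under that independence assumption the relative variation of the total
shrinks roughly with the square root of the number of queries, so a 15%
spread per query becomes a few percent for the whole run.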

> Actually, the variation in run time is also a useful statistic, both for
> postgres and the kernel. It might be useful to do multiple complete runs and
> keep track of the average standard deviation of the time required for each
> step.
>
I created a page with the execution time (in seconds), average, and
stddev for each query and each step. The data was collected from 6 dbt3
runs.
http://developer.osdl.org/~jenny/pgsql-optimizer/exetime.html
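
For anyone who wants to produce the same kind of summary from their own
runs, here is a minimal sketch. The data layout (one dict of step name to
seconds per run), the function name, and the timings are all assumptions
made for illustration; they are not taken from the dbt3 kit or from the
page above.

```python
from statistics import mean, stdev

def summarize_runs(runs):
    """Compute mean, sample stddev, and stddev/mean for each step.

    runs: list of dicts, one per run, mapping step name -> time in seconds.
    Every run is assumed to contain the same step names.
    """
    summary = {}
    for name in runs[0]:
        times = [run[name] for run in runs]
        avg = mean(times)
        sd = stdev(times)  # sample standard deviation (divides by n - 1)
        summary[name] = {"mean": avg, "stddev": sd, "cv": sd / avg}
    return summary

# Made-up timings for three runs of two queries.
example_runs = [
    {"q1": 10.0, "q2": 20.0},
    {"q1": 12.0, "q2": 20.0},
    {"q1": 14.0, "q2": 20.0},
]
result = summarize_runs(example_runs)
for name, stats in result.items():
    print(name, stats)
```

The stddev/mean column makes it easier to compare variation between
short and long queries.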

> Higher standard deviation implies queries can't be reliably depended on not to
> take inordinately long, which can be a problem for some working models. For
> the kernel it could mean latency issues or it could mean the swapper or buffer
> cache was overly aggressive.
I agree. I can think of another reason why performance varies even if
the swapper and buffer cache are not overly aggressive. Since PG depends
on the OS to manage the buffer cache (correct me if I am wrong), it is up
to the OS to decide what to keep in the cache, and the OS cannot
anticipate what is likely to be needed next.

Thanks,
Jenny
