Re: Non-linear Performance

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Curt Sampson <cjs(at)cynic(dot)net>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Non-linear Performance
Date: 2002-05-31 15:05:34
Message-ID: 18241.1022857534@sss.pgh.pa.us
Lists: pgsql-general

Curt Sampson <cjs(at)cynic(dot)net> writes:
> On Thu, 30 May 2002, Tom Lane wrote:
>> I guess that the smaller datasets would get proportionally more benefit
>> from kernel disk caching.

> Actually, I re-did the 100m row and 500m row queries from a cold
> start of the machine, and I still get the same results: 10 sec. vs
> 70 sec. (Thus, 7x as long to query only 5x as much data.) So I
> don't think caching is an issue here.

But even from a cold start, there would be cache effects within the
query, viz. fetching the same table block more than once when it is
referenced from different places in the index. On the smaller table,
the block is more likely to still be in kernel cache when it is next
wanted.

On a pure random-chance basis, you'd not expect that fetching 5k rows
out of 100m would hit the same table block twice --- but I'm wondering
if the data was somewhat clustered. Do the system usage stats on your
machine reflect the difference between physical reads and reads
satisfied from kernel buffer cache?
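
As a rough check on the random-chance argument, here is a minimal sketch that estimates how many distinct table blocks 5k uniformly random row fetches would touch in a 100m-row table. The figure of 100 rows per block is purely an assumption for illustration (the thread does not give the row width), and the function name is hypothetical.

```python
import math

def expected_distinct_blocks(total_rows, rows_per_block, rows_fetched):
    """Expected number of distinct blocks hit by uniformly random row fetches."""
    blocks = total_rows / rows_per_block
    # P(a given block is never hit) = (1 - 1/blocks) ** rows_fetched.
    # log1p/expm1 keep this accurate when 1/blocks is very small.
    return -blocks * math.expm1(rows_fetched * math.log1p(-1.0 / blocks))

TOTAL_ROWS = 100_000_000   # table size from the thread
ROWS_FETCHED = 5_000       # rows fetched, from the thread
ROWS_PER_BLOCK = 100       # assumption: not stated anywhere in the thread

distinct = expected_distinct_blocks(TOTAL_ROWS, ROWS_PER_BLOCK, ROWS_FETCHED)
repeats = ROWS_FETCHED - distinct
print(f"distinct blocks: {distinct:.1f}, repeat visits: {repeats:.1f}")
```

Under these assumptions only about a dozen of the 5,000 fetches land on a block that was already visited, so purely random access would leave very little for the kernel cache to help with inside a single query. If the data were clustered, far fewer distinct blocks would be touched and the picture would change.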

Or maybe your idea about extra seek time is correct.

regards, tom lane
