Re: Non-linear Performance

From: Curt Sampson <cjs(at)cynic(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Non-linear Performance
Date: 2002-05-31 09:44:22
Message-ID: Pine.NEB.4.43.0205311811070.448-100000@angelic.cynic.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Thu, 30 May 2002, Tom Lane wrote:

> > Queries using that index seem to do this too, though not quite as
> > badly. Using a very simple query such as "SELECT COUNT(*) FROM
> > table WHERE value = 12345" (where value is the last INT column
> > above that took ages to index), typical query times (including
> > connection overhead) for data not in the cache are 0.6 sec., 11
> > sec. and 72 sec.
>
> I guess that the smaller datasets would get proportionally more benefit
> from kernel disk caching.

Actually, I re-did the 100m row and 500m row queries from a cold
start of the machine, and I still get the same results: 10 sec. vs
70 sec. (Thus, 7x as long to query only 5x as much data.) So I
don't think caching is an issue here.

> Can you demonstrate that it actually did 7000 reads, and not 5000+?
> That extrapolation technique doesn't look to me like it has the
> accuracy to tell the difference. You might try setting show_query_stats
> (note the results go into the postmaster log, not to the client;
> perhaps we ought to change that someday).

Ah, I should have looked up and used show_query_stats in the first
place. Thanks for the tip. (BTW, yes, it would be really, really cool
if we could turn this on and off on a per-client basis, and have the
results sent to the client. As it stands, this facility is rather
difficult to use to try to debug things on a production system.)

You're right, it is doing around 5000 reads, not 7000. The results for
all queries are very consistent: it does just slightly (5-20) more
reads than the number of rows returned, which is exactly what one would
expect.

So it seems I somehow go from about 100 reads per second to 70
reads per second. Some of that would no doubt be due to having five
times as much area over which to seek. I guess I'd have to play
with some random-read benchmarks to see if that's a reasonable
amount of performance degradation for this disk alone to be
responsible for.

> Also, if you've updated the table at all, there might be some fetches of
> dead tuples involved.

Hm. I did do a few updates, but not to any of the tuples I'm searching on.

> More RAM, perhaps.

Given the size of the table (25 GB for the table alone; about 60 GB with
all of the indices), I don't think more RAM is going to help all that
much. I'm not expecting to get a lot of repeated queries.

cjs
--
Curt Sampson <cjs(at)cynic(dot)net> +81 90 7737 2974 http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light. --XTC

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Динар 2002-05-31 09:54:36 The trouble with dynamic plpgsql functions
Previous Message Martijn van Oosterhout 2002-05-31 09:07:31 Re: restoring DB