Non-linear Performance

From: Curt Sampson <cjs(at)cynic(dot)net>
To: pgsql-general(at)postgresql(dot)org
Subject: Non-linear Performance
Date: 2002-05-30 11:12:57
Message-ID: Pine.NEB.4.43.0205301929110.455-100000@angelic.cynic.net
Lists: pgsql-general


I'm noticing that performance in creating a particular index, and
also a little bit in a simple query, seems somewhat non-linear,
and I'm wondering if anybody could give me any clues as to what
might be causing this, and how I might fix it, if that's possible.

I've done a COPY in and index build on three data sets (just integer
data) consisting of 10m rows (500 MB table), 100m rows (5 GB table),
and 500m rows (25 GB table). (This is all on a 1.4 GHz P4, 512 MB RAM,
two 7200 RPM IDE drives, one for data and one for log.)
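
For reference, the load itself is nothing fancy: a three-column
table and a plain COPY from a text dump, roughly like the sketch
below (the table name matches the plan further down, but the other
names and the file path are just illustrative):

    -- rough sketch of the schema: an INT key, a DATE, and an INT value
    CREATE TABLE data_3 (
        id    INTEGER NOT NULL,
        day   DATE    NOT NULL,
        value INTEGER NOT NULL
    );

    -- bulk load straight from a text dump
    COPY data_3 FROM '/path/to/data_3.txt';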

The COPY command to do the import is pretty linear, at about 230
sec., 2000 sec and 10,000 sec. for the 10m, 100m and 500m row
tables. Neither disk I/O nor the CPU seems to be pushed particularly hard,
though I'm getting lots of small writes to the log files. I tried
using a larger checkpoint size, but it didn't seem to help any.
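
By "a larger checkpoint size" I mean bumping the checkpoint settings
in postgresql.conf along these lines (the exact numbers here are
illustrative, not what I actually used):

    # postgresql.conf: spread checkpoints out during the bulk load
    checkpoint_segments = 32     # WAL segments between forced checkpoints
    checkpoint_timeout = 600     # seconds between forced checkpoints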

Recreating the primary key (on an INT, and of course all values
unique) took 123, 1300 and 6700 seconds, again pretty linear.

The next column is a DATE, which is the same value for the first
3.5M records, the next day for the next 3.5M records, and so on,
incrementing a day for every 3.5M records (in order). This index
build took about 60, 700 and 3500 seconds, respectively, again
linear.
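
The index builds themselves are just ordinary btrees; roughly, with
the same illustrative names as above:

    -- recreate the primary key index: the INT column, all values unique
    CREATE UNIQUE INDEX data_3_pkey ON data_3 (id);

    -- plain index on the DATE column
    CREATE INDEX data_3_date ON data_3 (day);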

But then I get to the next INT column which in every row is filled
in with a random value between 0 and 99,999. This index takes about
175, 3600, and 28,000 seconds, respectively, to generate.
So it takes about 2x as long per record going from 10m to 100m
records, and about 1.5x as long again per record when going from
100m to 500m records.
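
Concretely, that's just one more ordinary index build (this is the
index that actually shows up in the plan below):

    -- the build that goes non-linear: INT column of random values 0..99,999
    CREATE INDEX data_3_value ON data_3 (value);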

Queries using that index seem to do this too, though not quite as
badly. Using a very simple query such as "SELECT COUNT(*) FROM
table WHERE value = 12345" (where value is the last INT column
above that took ages to index), typical query times (including
connection overhead) for data not in the cache are 0.6 sec., 11
sec. and 72 sec.

This query, as expected, is completely dominated by random IO; the
disk the table is on sits there at 100% usage (i.e., disk requests
outstanding 100% of the time) and not much else is happening at all.

It does seem to do a few more disk transfers than I would
really expect. I get back a count of around 4000-5000, which to me
implies about 5000 reads plus the index reads (which one would
think would not amount to more than one or two hundred pages), yet
110 I/O requests per second times 70 seconds implies about 7000
reads. Is there something I'm missing here?

(If the query plan and analysis is any help, here it is:

    Aggregate  (cost=22081.46..22081.46 rows=1 width=0)
               (actual time=70119.88..70119.88 rows=1 loops=1)
      ->  Index Scan using data_3_value on data_3
            (cost=0.00..22067.71 rows=5498 width=0)
            (actual time=38.70..70090.45 rows=4980 loops=1)
    Total runtime: 70121.74 msec
)

Anyway, I'm open to any thoughts on this. In particular, I'm open
to suggestions for cheap ways of dealing with this horrible random
I/O load. (Yeah, yeah, I know: disk array, and SCSI while I'm about
it. But I've been asked to get this sort of thing working fast on
systems much cheaper than the current Sun/Sybase/EMC or whatever
it is that they're using.)

cjs
--
Curt Sampson <cjs(at)cynic(dot)net> +81 90 7737 2974 http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light. --XTC
