Re: Non-linear Performance

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Curt Sampson <cjs(at)cynic(dot)net>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Non-linear Performance
Date: 2002-05-30 14:21:11
Message-ID: 647.1022768471@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Curt Sampson <cjs(at)cynic(dot)net> writes:
> I'm noticing that performance in creating a particular index, and
> also a little bit in a simple query, seems somewhat non-linear,

Btree index build is primarily a sort, so cannot have better than
O(n*log(n)) performance for random data. Not sure why you'd expect
linearity.

Increasing SORT_MEM would help the constant factor, however...

> Queries using that index seem to do this too, though not quite as
> badly. Using a very simple query such as "SELECT COUNT(*) FROM
> table WHERE value = 12345" (where value is the last INT column
> above that took ages to index), typical query times (including
> connection overhead) for data not in the cache are 0.6 sec., 11
> sec. and 72 sec.

I guess that the smaller datasets would get proportionally more benefit
from kernel disk caching.

> It does seem to do a few more more disk transfers than I would
> really expect. I get back a count of around 4000-5000, which to me
> implies about 5000 reads plus the index reads (which one would
> think would not amount to more than one or two hundred pages), yet
> 110 I/O requests per second times 70 seconds implies about 7000
> reads. Is there something I'm missing here?

Can you demonstrate that it actually did 7000 reads, and not 5000+?
That extrapolation technique doesn't look to me like it has the
accuracy to tell the difference. You might try setting show_query_stats
(note the results go into the postmaster log, not to the client;
perhaps we ought to change that someday).

Also, if you've updated the table at all, there might be some fetches of
dead tuples involved.

> Anyway, I'm open to any thoughts on this. In particular, I'm open
> to suggestions for cheap ways of dealing with this horrible random
> I/O load.

More RAM, perhaps.

regards, tom lane

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Peter A. Daly 2002-05-30 14:26:37 Re: Non-linear Performance
Previous Message Joshua b. Jore 2002-05-30 14:14:49 Re: Checksum/Hash Functions