From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Curt Sampson <cjs(at)cynic(dot)net> |
Cc: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: Non-linear Performance |
Date: | 2002-05-30 14:21:11 |
Message-ID: | 647.1022768471@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Curt Sampson <cjs(at)cynic(dot)net> writes:
> I'm noticing that performance in creating a particular index, and
> also a little bit in a simple query, seems somewhat non-linear,
Btree index build is primarily a sort, so cannot have better than
O(n*log(n)) performance for random data. Not sure why you'd expect
linearity.
Increasing SORT_MEM would help the constant factor, however...
> Queries using that index seem to do this too, though not quite as
> badly. Using a very simple query such as "SELECT COUNT(*) FROM
> table WHERE value = 12345" (where value is the last INT column
> above that took ages to index), typical query times (including
> connection overhead) for data not in the cache are 0.6 sec., 11
> sec. and 72 sec.
I guess that the smaller datasets would get proportionally more benefit
from kernel disk caching.
> It does seem to do a few more more disk transfers than I would
> really expect. I get back a count of around 4000-5000, which to me
> implies about 5000 reads plus the index reads (which one would
> think would not amount to more than one or two hundred pages), yet
> 110 I/O requests per second times 70 seconds implies about 7000
> reads. Is there something I'm missing here?
Can you demonstrate that it actually did 7000 reads, and not 5000+?
That extrapolation technique doesn't look to me like it has the
accuracy to tell the difference. You might try setting show_query_stats
(note the results go into the postmaster log, not to the client;
perhaps we ought to change that someday).
Also, if you've updated the table at all, there might be some fetches of
dead tuples involved.
> Anyway, I'm open to any thoughts on this. In particular, I'm open
> to suggestions for cheap ways of dealing with this horrible random
> I/O load.
More RAM, perhaps.
regards, tom lane
From | Date | Subject | |
---|---|---|---|
Next Message | Peter A. Daly | 2002-05-30 14:26:37 | Re: Non-linear Performance |
Previous Message | Joshua b. Jore | 2002-05-30 14:14:49 | Re: Checksum/Hash Functions |