On this table
project_id | integer | not null
id | integer | not null
date | date | not null
team_id | integer | not null
work_units | bigint | not null
Indexes: email_contrib_pkey primary key btree (project_id, id, date)
with this breakdown of data
project_id | count
5 | 56427141
8 | 1058843
24 | 361595
25 | 4092575
205 | 58512516
Any kind of operation on an entire project wants to tablescan, even
though it's going to take way longer.
explain analyze select sum(work_units) from email_contrib where
Index scan 126, 56, 55 seconds
Seq. scan 1517, 850, 897 seconds
It seems like the metrics used for the cost of index scanning v. table
scanning on large tables need to be revisited. It might be such a huge
difference in this case because the table is essentially clustered on
the primary key. I can test this by doing an aggregate for, say, a
specific team_id, which would be pretty well spread across the entire
table, but that'll have to wait a bit.
Anyone have any thoughts on this? Also, is there a TODO to impliment
real clustered indexes? Doing stuff by project_id on this table in
sybase was very efficient, because there was a real clustered index on
the PK. By clustered index, I mean an index where the leaf nodes of the
B-tree were the actual table rows. This means the only overhead in going
through the index is scanning the branches, which in this case would be
Is this something that I should be using some PGSQL-specific feature
for, like inheritance?
I've been really happy so far with PGSQL (comming from Sybase and DB2),
but it seems there's still some pretty big performance issues that want
to be addressed (or I should say performance issues that hurt really big
when you hit them :) ).
Jim C. Nasby (aka Decibel!) jim(at)nasby(dot)net
Member: Triangle Fraternity, Sports Car Club of America
Give your computer some brain candy! www.distributed.net Team #1828
Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"
pgsql-performance by date
|Next:||From: Tom Lane||Date: 2003-04-24 23:58:30|
|Subject: Re: More tablescanning fun |
|Previous:||From: Frederic Jolliton||Date: 2003-04-24 17:40:17|
|Subject: Re: Important speed difference between a query and a|