Re: count(*) slow on large tables

From:	Greg Stark <gsstark(at)mit(dot)edu>
To:	Christopher Browne <cbbrowne(at)libertyrms(dot)info>
Cc:	pgsql-performance(at)postgresql(dot)org
Subject:	Re: count(*) slow on large tables
Date:	2003-10-03 05:13:08
Message-ID:	87brsyrjiz.fsf@stark.dyndns.tv
Lists:	pgsql-hackers pgsql-performance

Christopher Browne <cbbrowne(at)libertyrms(dot)info> writes:

> It would be very hairy to implement it correctly, and all this would
> cover is the single case of "SELECT COUNT(*) FROM SOME_TABLE;"
>
> If you had a single WHERE clause attached, you would have to revert to
> walking through the tuples looking for the ones that are live and
> committed, which is true for any DBMS.

Well it would be handy for a few other cases as well.

1 It would be useful for the case where you have a partial index with a
matching where clause. The optimizer already considers using such indexes
but it has to pay the cost of the tuple lookup, which is substantial.

2 It would be useful for the very common queries of the form
WHERE x IN (select id from foo where some_indexed_expression)

(Or the various equivalent forms including outer joins that test to see if
the matching record was found and don't retrieve any other columns in the
select list.)

3 It would be useful for many-many relationships where the intermediate table
has only the two primary key columns being joined. If you create a
multi-column index on the two columns it shouldn't need to look up the
tuple. This would effectively be nearly equivalent to an "index organized
table".

4 It would be useful for just about all the referential integrity queries...
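
To make case 3 concrete, here is a minimal sketch of the many-many situation. It uses SQLite (via Python's standard sqlite3 module) purely as a stand-in, because SQLite reports in its query plan when a query can be answered from the index alone; it says nothing about what PostgreSQL does. The table and index names (book_author, book_author_idx) and the extra added_on column are hypothetical, chosen only for illustration.

```python
import sqlite3


def plan(conn, sql):
    """Return the human-readable detail strings from EXPLAIN QUERY PLAN."""
    # The detail text is the fourth column of each plan row.
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical intermediate table for a many-many relationship.
    -- added_on is an extra column that is NOT part of the index.
    CREATE TABLE book_author (
        book_id   INTEGER NOT NULL,
        author_id INTEGER NOT NULL,
        added_on  TEXT
    );
    CREATE INDEX book_author_idx ON book_author (book_id, author_id);
    """
)

# Only the two indexed columns are referenced, so the index entry
# already holds everything the query needs.
covered = plan(conn, "SELECT author_id FROM book_author WHERE book_id = 42")

# Referencing added_on forces a lookup in the table for every match.
not_covered = plan(
    conn, "SELECT author_id, added_on FROM book_author WHERE book_id = 42"
)

print("index only: ", covered)
print("needs table:", not_covered)
```

In SQLite the first plan is reported as using a covering index and the second is not, which is the difference between skipping the tuple lookup and paying for it on every matching row.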

I don't mean to say this is definitely a good thing. The tradeoff in
complexity and time to maintain the index pages would be large. But don't
dismiss it as purely a count(*) optimization hack.

I know Oracle is capable of it, and it can speed up your query a lot when you
remove that last unnecessary column from a join table, allowing Oracle to skip
the step of reading the table.

--
greg

In response to

Re: count(*) slow on large tables at 2003-10-02 21:57:30 from Christopher Browne