Skip site navigation (1) Skip section navigation (2)

Re: Much Ado About COUNT(*)

From: "Dann Corbit" <DCorbit(at)connx(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>,"Jonah H(dot) Harris" <jharris(at)tvi(dot)edu>
Cc: <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Much Ado About COUNT(*)
Date: 2005-01-12 19:57:11
Message-ID: (view raw, whole thread or download thread mbox)
Lists: pgsql-hackers
A notion for indices that are not unique... (won't help much on select
count(*) but might be helpful for other types of query optimization)

Put a count in the index for each distinct type.
In the worst case, the index is actually unique and you have 8 wasted
bytes per index entry and all the entries are in the leaves (perhaps it
could be an OPTION for some tables).  I don't know enough about the
structure of PostgreSQL's indexes to know if my suggestion is pure
hogwash, so don't laugh to hard if it is pure stupidity.

The most practical value of SELECT COUNT(*) is for updating statistics
(and looking good in phony-baloney benchmarks).  But the statistics only
need to be updated when you vacuum, so it hardly seems a crucial issue
to me.

-----Original Message-----
From: pgsql-hackers-owner(at)postgresql(dot)org
[mailto:pgsql-hackers-owner(at)postgresql(dot)org] On Behalf Of Tom Lane
Sent: Wednesday, January 12, 2005 11:42 AM
To: Jonah H. Harris
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: [HACKERS] Much Ado About COUNT(*) 

"Jonah H. Harris" <jharris(at)tvi(dot)edu> writes:
> My thinking is that we may be able to implement index usage for not
> unqualified counts, but also on any query that can be satisfied by the

> index itself.

The fundamental problem is that you can't do it without adding at least
16 bytes, probably 20, to the size of an index tuple header.  That would
double the physical size of an index on a simple column (eg an integer
or timestamp).  The extra I/O costs and extra maintenance costs are
unattractive to say the least.  And it takes away some of the
justification for the whole thing, which is that reading an index is
much cheaper than reading the main table.  That's only true if the index
is much smaller than the main table ...

			regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if
      joining column's datatypes do not match

pgsql-hackers by date

Next:From: Tom LaneDate: 2005-01-12 19:59:42
Subject: Re: [HACKERS] segfault caused by heimdal (was: SUSE port)
Previous:From: Jonah H. HarrisDate: 2005-01-12 19:52:53
Subject: Re: Much Ado About COUNT(*)

Privacy Policy | About PostgreSQL
Copyright © 1996-2017 The PostgreSQL Global Development Group