Re: Much Ado About COUNT(*)

From:	Rod Taylor <pg(at)rbt(dot)ca>
To:	"Jonah H(dot) Harris" <jharris(at)tvi(dot)edu>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Much Ado About COUNT(*)
Date:	2005-01-12 20:09:04
Message-ID:	1105560544.690.34.camel@home
Lists:	pgsql-announce pgsql-hackers pgsql-patches

On Wed, 2005-01-12 at 12:52 -0700, Jonah H. Harris wrote:
> Tom Lane wrote:
>
> >The fundamental problem is that you can't do it without adding at least
> >16 bytes, probably 20, to the size of an index tuple header. That would
> >double the physical size of an index on a simple column (eg an integer
> >or timestamp). The extra I/O costs and extra maintenance costs are
> >unattractive to say the least. And it takes away some of the
> >justification for the whole thing, which is that reading an index is
> >much cheaper than reading the main table. That's only true if the index
> >is much smaller than the main table ...

> I recognize the added cost of implementing index only scans. As storage
> is relatively cheap these days, everyone I know is more concerned about
> faster access to data. Similarly, it would still be faster to scan the
> indexes than to perform a sequential scan over the entire relation for
> this case. I also acknowledge that it would be a negative impact to
> indexes where this type of access isn't required, as you suggested and
> which is more than likely not the case. I just wonder what more people
> would be happier with and whether the added 16-20 bytes would be
> extremely noticeable considering most 1-3 year old hardware.

I'm very much against this. After some quick math, my database would
grow by about 40GB if this were done. Storage isn't that cheap when you
include the hot-backup master, various slaves, RAM for caching this
additional index space, the backup storage unit on the SAN, tape backups,
and the additional spindles required to maintain the same performance
under the increased IO, since I don't have very many queries which would
receive an advantage (big one for me -- we started buying spindles for
performance a long time ago), etc.

Make it a new index type if you like, but don't impose any new
performance constraints on folks who have little to no advantage from
the above proposal.
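
For a rough sense of the arithmetic behind the size concerns above, here
is a back-of-the-envelope sketch in Python. It is only an illustration
under stated assumptions, not a measurement: it assumes an 8-byte index
tuple header, a 4-byte line pointer per entry, and 8-byte alignment, and
it ignores page headers, fill factor, and internal btree pages. The
function names and the entry count of one billion are hypothetical.

```python
def align_up(nbytes, maxalign=8):
    """Round nbytes up to the next multiple of maxalign."""
    return -(-nbytes // maxalign) * maxalign


def index_entry_bytes(key_bytes, header_bytes=8, line_pointer_bytes=4, maxalign=8):
    """Approximate on-page cost of one index entry.

    Assumed layout: aligned (header + key) plus one line pointer.
    """
    return align_up(header_bytes + key_bytes, maxalign) + line_pointer_bytes


def index_growth_bytes(num_entries, key_bytes, extra_header_bytes):
    """Estimated extra bytes if every index tuple header grows."""
    before = index_entry_bytes(key_bytes)
    after = index_entry_bytes(key_bytes, header_bytes=8 + extra_header_bytes)
    return num_entries * (after - before)


if __name__ == "__main__":
    # A 4-byte integer key, with and without 16 extra header bytes.
    before = index_entry_bytes(4)
    after = index_entry_bytes(4, header_bytes=8 + 16)
    print(before, after, after / before)  # 20 36 1.8

    # Hypothetical total: one billion index entries across all indexes.
    entries = 1_000_000_000
    print(index_growth_bytes(entries, 4, 16) / 2**30, "GiB of extra index space")
```

Under these assumptions an integer index entry grows from 20 to 36
bytes, roughly the doubling described in the quoted text, and the total
growth scales linearly with the number of index entries.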

In response to

Re: Much Ado About COUNT(*) at 2005-01-12 19:52:53 from Jonah H. Harris

Responses

Re: Much Ado About COUNT(*) at 2005-01-12 20:59:07 from Jonah H. Harris
Re: Much Ado About COUNT(*) at 2005-01-12 21:45:51 from Simon Riggs

Browse pgsql-announce by date

	From	Date	Subject
Next Message	Reinhard Max	2005-01-12 20:10:16	Re: [HACKERS] segfault caused by heimdal (was: SUSE port)
Previous Message	Greg Stark	2005-01-12 20:08:37	Re: Much Ado About COUNT(*)
