Re: Much Ado About COUNT(*)

From: Rod Taylor <pg(at)rbt(dot)ca>
To: "Jonah H(dot) Harris" <jharris(at)tvi(dot)edu>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Much Ado About COUNT(*)
Date: 2005-01-12 20:09:04
Message-ID: 1105560544.690.34.camel@home
Lists: pgsql-announce pgsql-hackers pgsql-patches

On Wed, 2005-01-12 at 12:52 -0700, Jonah H. Harris wrote:
> Tom Lane wrote:
>
> >The fundamental problem is that you can't do it without adding at least
> >16 bytes, probably 20, to the size of an index tuple header. That would
> >double the physical size of an index on a simple column (eg an integer
> >or timestamp). The extra I/O costs and extra maintenance costs are
> >unattractive to say the least. And it takes away some of the
> >justification for the whole thing, which is that reading an index is
> >much cheaper than reading the main table. That's only true if the index
> >is much smaller than the main table ...

> I recognize the added cost of implementing index only scans. As storage
> is relatively cheap these days, everyone I know is more concerned about
> faster access to data. Similarly, it would still be faster to scan the
> indexes than to perform a sequential scan over the entire relation for
> this case. I also acknowledge that it would be a negative impact to
> indexes where this type of access isn't required, as you suggested and
> which is more than likely not the case. I just wonder what more people
> would be happier with and whether the added 16-20 bytes would be
> extremely noticeable considering most 1-3 year old hardware.

I'm very much against this. After some quick math, my database would
grow by about 40GB if this were done. Storage isn't that cheap once you
include the hot-backup master, various slaves, RAM for caching the
additional index space, the backup storage unit on the SAN, tape backups,
and the additional spindles required to maintain the same performance
under the increased IO. That last one is the big one for me: we started
buying spindles for performance a long time ago, and I don't have very
many queries that would benefit.
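
A rough sketch of how an estimate like the 40GB figure can be worked out.
The entry count and the number of copies below are hypothetical, chosen
only to show the arithmetic; they are not the actual figures for this
database. The 16 and 20 byte values are the per-entry overhead quoted
above.

```python
# Back-of-the-envelope estimate of index growth from a larger index
# tuple header. All inputs are hypothetical, not measurements.

GIB = 1024 ** 3


def index_growth_bytes(index_entries: int, extra_bytes_per_entry: int) -> int:
    """Extra on-disk bytes if every index entry grows by a fixed amount."""
    return index_entries * extra_bytes_per_entry


def total_footprint_bytes(growth: int, copies: int) -> int:
    """Growth multiplied across every copy kept (master, slaves, backups)."""
    return growth * copies


if __name__ == "__main__":
    entries = 2_000_000_000  # hypothetical: entries summed over all indexes
    copies = 4               # hypothetical: master + slaves + backup copies
    for extra in (16, 20):
        growth = index_growth_bytes(entries, extra)
        total = total_footprint_bytes(growth, copies)
        print(f"+{extra} bytes/entry: {growth / GIB:.1f} GiB per copy, "
              f"{total / GIB:.1f} GiB across {copies} copies")
```

The point of the second function is that the raw growth is only the
starting figure: every replica and backup tier multiplies it.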

Make it a new index type if you like, but don't impose any new
performance constraints on folks who have little to no advantage from
the above proposal.
