Re: Strange count(*) implementation?

From: Richard Huxton <dev(at)archonet(dot)com>
To: Henk Ernst Blok <h(dot)e(dot)blok(at)utwente(dot)nl>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Strange count(*) implementation?
Date: 2004-10-26 08:53:39
Message-ID: 417E1093.1080401@archonet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Henk Ernst Blok wrote:
> Hi Posgres users/developers,
>
> Can anyone explain why PosgreSQL (version 7.4.5 on Linux) does a full
> table scan to compute a count(*) on a base table after a vacuum analyze
> has been done with no following updates that might have outdated any
> statistics. Strangly the explain command does give the correct number of
> tuples instantaniously from the catalog, as one would expect. Still the
> optimizer thinks it needs a full table scan to do count.

> The consequence of this seemingly odd count implementation is a very
> very slow count.

To put it simply, count() doesn't look at the statistics because most of
the time they are out of date. In any case, they aren't useful for any
query with a WHERE clause.

The next most obvious choice is to use the primary-key index rather than
scanning the table. However, MVCC means that we can have multiple views
of the table, with some backends seeing a different number of rows than
others. So - either we need to store multiple index-entry versions as
well as multiple row versions or you need to check the actual row in
these cases. PostgreSQL does the second, which results in the full scan
which you see.

There is plenty of discussion of this (and also max()/min() aggregate
functions) in the mailing list archives.

HTH
--
Richard Huxton
Archonet Ltd

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Joel 2004-10-26 09:22:55 compatibilityissues from 7.1 to 7.4
Previous Message Richard Huxton 2004-10-26 08:45:42 Re: what could cause inserts getting queued up and db locking??