Re: count(*) and bad design was: Experiences with extensibility

From: "Harald Armin Massa" <haraldarminmassa(at)gmail(dot)com>
To: "Ivan Sergio Borgonovo" <mail(at)webthatworks(dot)it>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: count(*) and bad design was: Experiences with extensibility
Date: 2008-01-09 12:04:39
Message-ID: 7be3f35d0801090404n67398980s4bc97f4e4448dcf4@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Ivan,

> Please forgive my naiveness in this field but what does it mean an
> "exact count" and what other DB means with "an exact count" and how
> other DB deal with it?

PostgreSQL will give you an exact count of the contents of the
database as it is in the moment you begin your count. (i.e. the
transaction starts)

BUT as the table is not locked, in parallel somebody can bulkload MANY
items into the database, so at the moment (start of your transaction)
+ 1msec your count may be invalid allready.

> I'd expect it perform as good or better than other DB since now the
> bottleneck should be how efficiently it can filter records... but
> still a count(*) with a where clause will incur in the same problem
> of what "exact" means.

I know of 3 concepts to answer count() faster then PostreSQL:

1) just lie, present an estimate
2) do not have MVCC
3) store "record deleted info" in index, so you can answer count()
with only scanning the index

Harald

--
GHUM Harald Massa
persuadere et programmare
Harald Armin Massa
Spielberger Straße 49
70435 Stuttgart
0173/9409607
fx 01212-5-13695179
-
EuroPython 2008 will take place in Vilnius, Lithuania - Stay tuned!

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Michael Glaesemann 2008-01-09 12:38:46 Re: Read/Write restriction mechanism
Previous Message Glyn Astill 2008-01-09 11:35:57 Startup scripts - Use -m fast or smart?