Re: Improving count(*)

From: mark(at)mark(dot)mielke(dot)cc
To: Martijn van Oosterhout <kleptog(at)svana(dot)org>
Cc: Rod Taylor <pg(at)rbt(dot)ca>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Improving count(*)
Date: 2005-11-18 00:48:13
Message-ID: 20051118004813.GA531@mark.mielke.cc
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Probably obvious, and already mentioned, count(*) isn't the only query
that would benefit from visibility information in the index. It's
rather unfortunate that MVCC requires table lookups, when all values
queried or matched are found in the index key itself. The idea of an
all index table is appealing to me for some applications (Oracle
supports this, I believe?). In effect, a sorted, and searchable table,
that doesn't double in size, just because it is indexed.

So what's the real cost here? Larger index size to include the
visibility information (optional?) and UPDATE/DELETE need to
set expirations on the index rows as well as the table rows,
for only the indexes that have visibility information? A flag
in the table structure in memory to know whether the table has
any indexes with visibility information that require updating?

It doesn't sound that bad to me. Perhaps I just don't know better? :-)

The per-block counts idea, seem useful to me. A database that
frequently modifies every page of an index, would seem
inefficient. What if the per-block counts were kept, but associated
with index blocks instead of table blocks, for indexes that maintain
visibility information? The per-block counts only need to be able to
provide enough information for the reader to know whether the count is
valid, or invalid, perhaps updated at vacuum time?

The idea of a partial index, that keeps this information (visibility
information + per-block live row count cache) seems fascinating to
me. Many new optimization opportunities to hang myself with... :-)

Maybe PostgreSQL could be FASTER than other databases?

Or are we just dreaming?

Cheers,
mark

--
mark(at)mielke(dot)cc / markm(at)ncf(dot)ca / markm(at)nortel(dot)com __________________________
. . _ ._ . . .__ . . ._. .__ . . . .__ | Neighbourhood Coder
|\/| |_| |_| |/ |_ |\/| | |_ | |/ |_ |
| | | | | \ | \ |__ . | | .|. |__ |__ | \ |__ | Ottawa, Ontario, Canada

One ring to rule them all, one ring to find them, one ring to bring them all
and in the darkness bind them...

http://mark.mielke.cc/

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Joe Conway 2005-11-18 00:49:31 Re: Some array semantics issues
Previous Message Dann Corbit 2005-11-18 00:34:02 Re: Improving count(*)