Re: Improving count(*)

From: mark(at)mark(dot)mielke(dot)cc
To: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc: Richard Huxton <dev(at)archonet(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Improving count(*)
Date: 2005-11-23 00:51:04
Message-ID: 20051123005104.GA450@mark.mielke.cc
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Nov 22, 2005 at 06:11:01PM -0500, Bruce Momjian wrote:
> mark(at)mark(dot)mielke(dot)cc wrote:
> > A solution enhancing the above mentioned indexes, to maintain a count
> > for whole index blocks, would allow whole index blocks that satisfy
> > the WHERE clause to be counted, assuming the whole index block is
> > visible in the current transaction.
> I think it would be very difficult to generate a per-index-page
> visibility bit because I can't think of a normal database operation that
> would allow us to update it. It requires that an index scan visit every
> heap page to check the visibility of the tuples. However, we almost
> never do a full-index scan because it is so much slower than a heap
> scan. It would be easy to keep a heap-visible bit up-to-date (because
> we do full-heap scans occasionally), but that would require the index
> to load the heap page to find the bit. (The bit isn't on the index, it
> is on the heap).

Vacuum time?

> Jan has been talking about have a bitmap to track pages that need
> vacuuming, and I am wondering if the same system could be used to track
> the heap-dirty bits. Putting one bit on every 8k disk page means we have
> to load the 8k page to read the bit, while a centralized bitmap would
> load 64k page bits in a single 8k page. That one page would cover 500MB
> of a table. Seems vacuum could use the same bitmap values.

Sounds correct.

> Assuming we track 100 tables regularly, that would require 800k of shared
> memory. We would have to pagein/out the bitmaps as needed, but we could
> throw them away on a crash and rebuild as part of normal operation.
> FSM has not been a concurrency bottleneck, so I am thinking this would
> not be either.
> I suppose it would require a new filesystem file for each table.

*nod*

Cheers,
mark

--
mark(at)mielke(dot)cc / markm(at)ncf(dot)ca / markm(at)nortel(dot)com __________________________
. . _ ._ . . .__ . . ._. .__ . . . .__ | Neighbourhood Coder
|\/| |_| |_| |/ |_ |\/| | |_ | |/ |_ |
| | | | | \ | \ |__ . | | .|. |__ |__ | \ |__ | Ottawa, Ontario, Canada

One ring to rule them all, one ring to find them, one ring to bring them all
and in the darkness bind them...

http://mark.mielke.cc/

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Philip Yarra 2005-11-23 01:04:30 Re: tablespaces and non-empty directories
Previous Message Chuck McDevitt 2005-11-23 00:46:01 Re: [PATCHES] Should libedit be preferred to