Re: Bitmap index thoughts

From: Gavin Sherry <swm(at)alcove(dot)com(dot)au>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Bitmap index thoughts
Date: 2007-02-02 12:15:10
Message-ID: Pine.LNX.4.58.0702022301520.32019@linuxworld.com.au
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, 1 Feb 2007, Bruce Momjian wrote:

>
> Where are we on this patch? Does it have performance tests to show
> where it is beneificial? Is it ready to be reviewed?

Here's an updated patch:

http://www.alcove.com.au/~swm/bitmap-2007-02-02.patch

In this patch, I rewrote the index build system. It was fast before for
well clustered data but for poorly clustered data, it was very slow. Now,
it is pretty good for each distribution type.

I have various test cases but the one which showed bitmap a poor light was
a table of 600M rows. The key to the table had a cardinality of 100,000.
When the table was loaded with keys clustered, the build time was 1000
seconds with bitmap (2200 with btree). With poorly clustered data (e.g.,
the index key was (1, 2, 3, ..., 6000, 1, 2, 3, ...)), the build time for
bitmap was 14000 seconds!

So, I rewrote this to compress data using HRL encoding (the same scheme we
use in the bitmap AM itself). Now, clustered data is just as fast and
unclustered data is 2000 seconds.

The select performance at a cardinality of 100,000 is similar to btree but
faster with lower cardinalities.

Jie also contributed a rewrite of the WAL code to this patch. Not only is
the code faster now, but it handles the notion of incomplete actions --
like btree and friends do. The executor code still needs some work from me
-- Jie and I have dirtied things up while experimenting -- but we would
really like some review of the code so that this can get squared away
well before the approach of 8.3 feature freeze.

One of the major deficiencies remaining is the lack of VACUUM support.
Heikki put his hand up for this and I'm holding him to it! ;-)

I will update the code tomorrow. The focus will be cleaning up the
executor modifications. Please look else where for now.

Thanks,

Gavin

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Florian G. Pflug 2007-02-02 12:33:39 Re: Data archiving/warehousing idea
Previous Message Florian G. Pflug 2007-02-02 12:04:12 Re: Referential Integrity and SHARE locks