Re: Implementing Bitmap Indexes

From: Mike Rylander <mrylander(at)gmail(dot)com>
To: Neil Conway <neilc(at)samurai(dot)com>
Cc: Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Implementing Bitmap Indexes
Date: 2005-01-30 01:48:32
Message-ID: b918cf3d050129174847628a03@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, 30 Jan 2005 12:15:20 +1100, Neil Conway <neilc(at)samurai(dot)com> wrote:
> It might _work_, I just don't see the point. Given an attribute of a
> heap relation that has N distinct values and T tuples, you need to store
>
> - N bitmaps, each of T bits (before compression)
> - T ctids
> - a way to map from a bit in one of the bitmaps to a heap tuple
> - a way to decide which bitmap(s) to use for a given index scan
>
> I don't see why it's a win to organize this data in a tree. Why not
> store the ctids in a simple array? You then know that bit K of any
> bitmap refers to entry K of the ctid array. You'd also need some meta
> data to figure out which bitmap to use for a given scankey, but it
> should be pretty easy to do that efficiently.

OK, I think it just clicked. I was seeing a tree for the initial
lookup to find the right bitmaps to scan. Does that seem like to much
overhead for the first step?

So, pick the bitmap(s) based on the key, scan the bitmaps and combine
them based on the WHERE condition combination type, and as you find
matching bits you toss the ctids into a "matching" array. Then it's a
fast ctid scan. That it? I'm very interested in this after reading a
bit (heh he) about bitmap indexes. Here's how I'm visualizing it now:

For a query like "SELECT * FROM table WHERE a IN (1,3)" ...

Index on "table.a" looks like:

bitmaps
1 | 001001001001000
2 | 100000010100001
3 | 010110100010110

ctids
1 | {2,5,8,11}
2 | {0,7,9,14}
3 | {1,3,4,6,10,12,13}

The index scan would do bitwise a OR on bitmaps 1 and 3, find the
possition of the "1"s, jump to those possitions in the ctid array, and
bounce to the heap for the value.

--
Mike Rylander
mrylander(at)gmail(dot)com
GPLS -- PINES Development
Database Developer
http://open-ils.org

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message John Hansen 2005-01-30 01:56:58 Bug in create operator and/or initdb
Previous Message Neil Conway 2005-01-30 01:15:20 Re: Implementing Bitmap Indexes