Re: Bitmap index status

From: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
To: Jie Zhang <jzhang(at)greenplum(dot)com>
Cc: swm(at)linuxworld(dot)com(dot)au, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Bitmap index status
Date: 2006-09-19 11:54:55
Message-ID: 450FDA8F.1040807@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Jie Zhang wrote:
> Hi Heikki and all,
>
> Please find the latest bitmap index patch in the attachment. This patch is
> generated against the postgresql cvs head.
>

Thanks.

The handling of stream and hash bitmaps looks pretty complicated to me.
All the bitmap-related nodes have logic to handle both types slightly
differently. It all seems to come down to that if a subnode (or
amgetbitmap in a bitmap index scan node) returns a StreamBitmap, the
caller needs to call the subnode many times, until it returns a NULL.
With a HashBitmap, the caller only calls the subnode once.

I think amgetbitmap should be called just once per index scan, and it
should return either a hash bitmap or a stream bitmap. The same applies
to all the executor nodes that return bitmaps, they would only return a
single HashBitmap or StreamBitmap, and the upper node would call
tbm_iterate repeatedly on that.

StreamBitmap would contain a callback (filled by the indexam) that
tbm_iterate would call to fill the next TBMIterateResult.

I would also move the code to AND and OR together different types of
bitmaps to tidbitmap.c, so that BitmapAnd and BitmapOr nodes wouldn't
need to care about different types either.

tbm_intersect and tbm_union with two HashBitmaps would work like they do
now. Calling tbm_intersect or tbm_union with two StreamBitmaps would
return a new StreamBitmap object that would pull results from the two
argument StreamBitmaps page at a time and AND or OR them together.
Combining a HashBitmap and a StreamBitmap would wrap the HashBitmap with
a new StreamBitmap that pulls the entries from the HashBitmap page at a
time.

What do you think? Would you like me to help implementing the above?

> This patch includes code style changes, bug fixes, and the stream bitmap
> implementation. I have also fixed the problems mentioned in Heikki's email.
>

It seems you fixed the race condition in inserttuple by holding a write
lock on the metapage while you find/create the lov item. While correct,
isn't that pretty bad for concurrency? I realize that the main use case
for bitmap indexes is large data warehouses where you don't do
concurrent updates, but still...

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Gavin Sherry 2006-09-19 12:15:24 Re: Bitmap index status
Previous Message Dave Page 2006-09-19 11:39:36 Re: vista