Re: [PATCHES] WIP: bitmap indexes

From: "Jie Zhang" <jzhang(at)greenplum(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Gavin Sherry" <swm(at)linuxworld(dot)com(dot)au>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCHES] WIP: bitmap indexes
Date: 2006-08-16 07:03:26
Message-ID: C108114E.A55E%jzhang@greenplum.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-patches

On 8/15/06 6:18 AM, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Gavin Sherry <swm(at)linuxworld(dot)com(dot)au> writes:
>> On Mon, 14 Aug 2006, Tom Lane wrote:
>>> Correct me if I'm wrong, but isn't the patch's present hacking on the
>>> executor intended to make it happen like this?
>
>> Not really. It reads ahead on the bitmap index and passes back the bitmap
>> words. The other executor routines are hacked up to process the data in
>> this format.
>
> Well, as I said, I don't think there's justification for exposing a
> bitmap index's internal data formats to the rest of the system like
> that: it's not very future-proof and I don't see that it's buying any
> significant performance gain. At some point you have to convert to TIDs
> anyway, at least in the sense of knowing what page and line number you
> are at, so passing the data as an array of TIDs really isn't going to
> hurt much. So my advice is to rip out all those changes and go back to
> the existing tidbitmap.c readout API. There's nothing wrong with
> the TBMIterateResult data structure.
>

The bitmap words in the bitmap index are very simple and can be very
generic. You can think about them as one bit per tuple along with some
padding bits between heap pages. The problem I have is that I do not know a
good way to construct an in-memory version of this for other index
structures, like b-tree. To be able to handle both cases nicely, you are
right -- TBMIterateResult is better. Or, PagetableEntry may be better since
it will make AND/OR easier.

> What I do find interesting to think about is whether, strictly within
> tidbitmap.c, there could be an alternate kind of bitmap object that
> doesn't have to materialize the whole bitmap for an indexscan in memory
> because it knows it can fetch the data on-demand, ie, build the next
> page TBMIterateResult data structure on-the-fly from the index when it's
> requested. Call it a "stream bitmap" in contrast to the present
> "materialized bitmaps". The trick here is to be able to AND and OR a
> stream bitmap with another stream bitmap or a materialized bitmap.
> I don't see any reason in principle why that couldn't be done: the
> output of the AND/OR would be a stream of TBMIterateResults just like
> the inputs. That is, it's another stream bitmap, but with a different
> generating function and some internal state that includes its source
> bitmaps. You'd have to sort a materialized bitmap into order before
> starting to AND/OR it with a stream bitmap, but that code is there
> already.

I like this idea. I think that we can define a new TBMStatus to be
TBM_STREAM in TIDBitmap. *getmulti functions will remain the same, except
that we add a new returning bool argument, stating if this is a stream
bitmap. If this is a stream bitmap, nodeBitmapIndexScan simply fills spages,
and passes it upstream. When nodeBitmapAnd or nodeBitmapOr ANDs/ORs several
bitmaps, the result bitmap is a stream bitmap if there is at least one
bitmap is a stream bitmap. Then we add another loop in nodeBitmapHeapscan to
be able to pull more data from its subnode.

Thanks,
Jie

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message andrew 2006-08-16 07:37:33 Re: Enum proposal / design
Previous Message Simon Riggs 2006-08-16 06:05:47 Re: [HACKERS] Forcing current WAL file to be archived

Browse pgsql-patches by date

  From Date Subject
Next Message Tom Lane 2006-08-16 12:51:21 Re: [HACKERS] Forcing current WAL file to be archived
Previous Message Simon Riggs 2006-08-16 06:05:47 Re: [HACKERS] Forcing current WAL file to be archived