Re: On-disk bitmap index patch

From: Gavin Sherry <swm(at)linuxworld(dot)com(dot)au>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Jie Zhang <jzhang(at)greenplum(dot)com>, pgsql-hackers(at)postgresql(dot)org, Luke Lonergan <LLonergan(at)greenplum(dot)com>
Subject: Re: On-disk bitmap index patch
Date: 2006-07-24 00:06:23
Message-ID: Pine.LNX.4.58.0607240952250.19854@linuxworld.com.au
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, 23 Jul 2006, Tom Lane wrote:

> Gavin Sherry <swm(at)linuxworld(dot)com(dot)au> writes:
> > Is anyone else looking at this patch?
>
> It's on my list of things to look at, but so are a lot of other patches ;-)
>
> A couple of comments after a *very* fast scan through the patch:
>
> * The xlog routines need help; they seem to not be updated for recent
> changes in the API for xlog recovery code.

Yep. The patch was actually against 8.1 and was hastily brought up to 8.2.
I think Jie's intention was to simply let everyone know that this was
going on.

>
> * The hacks on vacuum.c (where it tests the AM name) are utterly
> unacceptable. If you need the main code to do something special for a
> particular index AM, define a bool flag column for it in pg_am.

Yes.

>
> * The interface to the existing executor bitmap scan code is mighty
> messy --- seems like there's a lot of almost duplicate code, a lot
> of rather invasive hacking, etc. This needs to be rethought and
> refactored.

I agree.

>
> * Why in the world is it introducing duplicate copies of a lot
> of btree comparison functions? Use the ones that are there.

Yes, I raised this with Jie and she has fixed it. One thought is, we may
want to rename those comparison functions prefixed with 'bm' to make their
naming less confusing. They'll be used by btree, gin and bitmap index
methods. Anyway, a seperate patch.

>
> * The patch itself is a mess: it introduces .orig and .rej files,
> changes around $PostgreSQL$ lines, etc.
>

Right, not to mention patches to configure and a lot of style which needs
to be knocked into shape.

> However, the main problem I've got with this is that a new index AM is a
> pretty large burden, and no one's made the slightest effort to sell
> pghackers on taking this on. What's the use-case, what's the
> performance like, where is it really an improvement over what we've got?

For low cardinality sets, bitmaps greatly out perform btree. Jie and
others at Greenplum tested this implementation (and others) against very
large, low cardinality sets. I wasn't involved in it. I urge Jie and/or
Luke to present those results.

Thanks,

Gavin

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2006-07-24 00:25:18 Re: On-disk bitmap index patch
Previous Message Tom Lane 2006-07-23 23:52:16 Re: Sun Donated a Sun Fire T2000 to the PostgreSQL