Re: On-disk bitmap index patch

From: Mark Kirkwood <markir(at)paradise(dot)net(dot)nz>
To: Jie Zhang <jzhang(at)greenplum(dot)com>
Cc: Hannu Krosing <hannu(at)skype(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Gavin Sherry <swm(at)linuxworld(dot)com(dot)au>, pgsql-hackers(at)postgresql(dot)org, Luke Lonergan <LLonergan(at)greenplum(dot)com>
Subject: Re: On-disk bitmap index patch
Date: 2006-07-25 03:54:48
Message-ID: 44C59608.6030207@paradise.net.nz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Jie Zhang wrote:
>
> On 7/24/06 6:59 AM, "Hannu Krosing" <hannu(at)skype(dot)net> wrote:
>
>>
>>
>> And also for AND-s of several indexes, where indexes are BIG, your btree
>> indexes may be almost as big as tables but the resulting set of pages is
>> small.
>
> Yeah, Hannu points it out very well -- the bitmap index works very well when
> columns have low cardinalities and AND operations will produce small number
> of results.
>
> Also, the bitmap index is very small in low cardinality cases, where the
> btree tends to take up at least 10 times more space.
>
>

The smallness of the bitmap index also means that some queries will
require much less work_mem to achieve good performance e.g consider:

TPCH dataset with scale factor 10 on my usual PIII HW, query -
select count(*) from lineitem where l_linenumber=1;

This executes in about 100 seconds with work_mem = 20M if there is a
bitmap index on l_linenumber. It takes 3832 seconds (!) if there is a
btree index on the same column. Obviously cranking up work_mem will even
up the difference (200M gets the btree to about 110 seconds), but being
able to get good performance with less memory is a good thing!

Cheers

Mark

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2006-07-25 03:59:34 Re: Time zone definitions to config files
Previous Message Tom Lane 2006-07-25 03:51:23 pgsql: Remove hard-wired lists of timezone abbreviations in favor of