Re: Index usage when bitwise operator is used

From: "Valentine Gogichashvili" <valgog(at)gmail(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Cc: "W(dot)Alphonse HAROUNY" <wharouny(at)gmail(dot)com>
Subject: Re: Index usage when bitwise operator is used
Date: 2007-09-16 10:08:33
Message-ID: 3ce9822f0709160308v59c8a4i8dd93503ddfddf6c@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi,

I could not find and normal solution for that issue. But I am using some
workarounds for that issue.

The solution, that I am using now is to create an index for every bit of
your bitmap field.

So something like

CREATE INDEX idx_hobbybit_0_limited
ON "versionA".user_fast_index
USING btree
(gender, dateofbirth) -- here the gender and dateofbirth fields are the
fields that we usually ORDER BY in the select statements, but you can
play with the needed fields
WHERE (hobby_bitmap & 1) > 0;

by creating such an index for every used bit and combining WHERE
(hobby_bitmap & 1 ) > 0 like statements the planner will be choosing the
right index to use.

Another workaround, that will be more applicable in your case I think, is to
create a functional GIN index on your bitmap field using a static function
to create an array of bitmap keys from your bitmap field.

CREATE OR REPLACE FUNCTION "versionA".bitmap_to_bit_array(source_bitmap
integer)
RETURNS integer[] AS
'select ARRAY( select (1 << s.i) from generate_series(0, 32) as s(i) where (
1 << s.i ) & $1 > 0 )'
LANGUAGE 'sql' IMMUTABLE STRICT;

And than create a GIN index on the needed field using this stored procedure.
After that, it would be possible to use intarray set operators on the result
of that function. This will also make it possible to use that GIN index.

Actually it would be much much better if it were possible to build GIN
indexes directly on the bitmap fields. But this is to be implemented by GIN
and GiST index development team. Probably would be not a bad idea to make a
feature request on them.

With best regards,

Valentine Gogichashvili

On 9/13/07, W.Alphonse HAROUNY <wharouny(at)gmail(dot)com> wrote:
>
> Hello,
>
> My question is about index usage when bitwise operations are invoked.
> Situation Context:
> --------------------------
>
> Lets suppose we have 2 tables TBL1 and TBL2 as the following:
> TBL1 {
> ......... ;
> integer categoryGroup; // categoryGroup is declared as an index on TABL1
> ......... ;
> }
>
> TBL2 {
> ......... ;
> integer categoryGroup; // categoryGroup is declared as an index on TABL2
> ......... ;
> }
>
> By conception, I suppose that:
> - [categoryGroup] may hold a limited number of values, less than 32
> values.
> - [categoryGroup] is of type integer => it means 4 bytes => 32 bits
> => 32 places available to hold binary '0' or binary '1' values.
> - [categoryGroup] is the result of an "OR bitwise operation" among a
> predefined set of variables [variableCategory].
> We suppose that [variableCategory] is of type integer (=>32 bits)
> and each binary value of [variableCategory] may only hold a single
> binary '1'.
>
>
> Ex: variableCategory1 = 00000000000000000000000000000010
> variableCategory2 = 00000000000000000000000000100000
> variableCategory3 = 00000000000000000000000000001000
>
> If [categoryGroup] = variableCategory1 | variableCategory2 |
> variableCategory3
> =>[categoryGroup] = 00000000000000000000000000101010
>
>
>
> Question:
> --------------
> I have an SQL request similar to:
>
> SELECT ..... FROM TBL1, TBL2 WHERE
> <inner join between TBL1 and TBL2 is True> AND
> TBL1.CATEGORY & TBL2.CATEGORY <> 0 //-- where & is the AND bitwise
> operator
>
> Qst:
> 1/ IS the above SQL request will use the INDEX [categoryGroup] defined on
> TBL1 and TBL2 ?
> 2/ What should I do or How should I modify my SQL request in order
> to force the query engine to use an index ? (the already defined index
> or another useful index)
>
>
>
> Thx a loт
>

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Kevin Grittner 2007-09-16 20:00:48 Re: random_page_costs - are defaults of 4.0 realistic for SCSI RAID 1
Previous Message Alan Hodgson 2007-09-14 20:29:01 Re: Index files