Re: Re: Efficiently determining the number of bits set in the contents of, a VARBIT field

From: Allan Kamau <kamauallan(at)gmail(dot)com>
To: Allan Kamau <allank(at)sanbi(dot)ac(dot)za>
Cc: pgsql-sql(at)postgresql(dot)org
Subject: Re: Re: Efficiently determining the number of bits set in the contents of, a VARBIT field
Date: 2009-03-12 18:10:21
Message-ID: ab1ea6540903121110l2a3021d4h6632b206e2419898@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

Seems I have a solution based on the code TJ had provided for counting
the bits sets, for those interested below are the two functions.

#include "postgres.h"
#include "utils/varbit.h"
#include "fmgr.h"
#ifdef PG_MODULE_MAGIC
PG_MODULE_MAGIC;
#endif

PG_FUNCTION_INFO_V1(last_bit_set);
Datum
last_bit_set(PG_FUNCTION_ARGS)
{
/* position of last set bit of a bitstring? */
int n=0;
VarBit *a = PG_GETARG_VARBIT_P(0);
int i;
unsigned char *ap = VARBITS(a);
unsigned char aval;
int b=0;
int byte_cnt=0;
int last_bit_set_position=0;

for(i=0;i<VARBITBYTES(a);++i){
aval=*ap;++ap;
if(aval&128)
{
++n;
b=(8*byte_cnt)+1;
}
if(aval&64)
{
++n;
b=(8*byte_cnt)+2;
}
if(aval&32)
{
++n;
b=(8*byte_cnt)+3;
}
if(aval&16)
{
++n;
b=(8*byte_cnt)+4;
}
if(aval&8)
{
++n;
b=(8*byte_cnt)+5;
}
if(aval&4)
{
++n;
b=(8*byte_cnt)+6;
}
if(aval&2)
{
++n;
b=(8*byte_cnt)+7;
}
if(aval&1)
{
++n;
b=(8*byte_cnt)+8;
}
++byte_cnt;
}
last_bit_set_position=b;
PG_RETURN_INT32(last_bit_set_position);
}

#include "postgres.h"
#include "utils/varbit.h"
#include "fmgr.h"
#ifdef PG_MODULE_MAGIC
PG_MODULE_MAGIC;
#endif

PG_FUNCTION_INFO_V1(first_bit_set);
Datum
first_bit_set(PG_FUNCTION_ARGS)
{
/* position of first set bit of a bitstring? */
int n=0;
VarBit *a = PG_GETARG_VARBIT_P(0);
int i;
unsigned char *ap = VARBITS(a);
unsigned char aval;
int b=0;
int byte_cnt=0;
int first_bit_set_position=0;

for(i=0;b==0&&i<VARBITBYTES(a);++i){
aval=*ap;++ap;
if(aval&128)
{
++n;
b=1;
break;
}
if(aval&64)
{
++n;
b=2;
break;
}
if(aval&32)
{
++n;
b=3;
break;
}
if(aval&16)
{
++n;
b=4;
break;
}
if(aval&8)
{
++n;
b=5;
break;
}
if(aval&4)
{
++n;
b=6;
break;
}
if(aval&2)
{
++n;
b=7;
break;
}
if(aval&1)
{
++n;
b=8;
break;
}
++byte_cnt;
}
if(b>0)
first_bit_set_position=(8*byte_cnt)+b;
else
first_bit_set_position=0;
PG_RETURN_INT32(first_bit_set_position);
}

Allan.

On Thu, Mar 12, 2009 at 2:53 PM, Allan Kamau <allank(at)sanbi(dot)ac(dot)za> wrote:
> Hi all,
> I am now looking for a function to return the position of the first position
> of the left most set bit. And optionally another to return the position of
> the right most set bit.
>
> I have been looking at
> "http://graphics.stanford.edu/~seander/bithacks.html#OperationCounting" but
> it seems it will take me a while to figure out bit manipulation.
>
> Allan.
>
> Allan Kamau wrote:
>>
>> All was well with the code below, apologies to all who read my previous
>> email. The error (an oversight) was on my part. In the "CREATE FUNCTION ..."
>> statement I had FLOAT as the return type instead of INTEGER.
>> Now the function runs smoothly. Preliminary results show it is orders of
>> magnitude faster than the LENGTH(REGEXP(CAST(myVarBit AS TEXT),'0','','g'))
>> solution.
>> Thanks again TJ and the rest of the team.
>>
>> Allan
>>
>> Allan Kamau wrote:
>>>
>>> Thank you TJ and everyone else for the advise and the c code. Today I did
>>> finally return to the 'number of bits set challenge' and managed to compile
>>> and link the nbits c function which went smoothly. However the function does
>>> crash my postgres server installation (8.3.3) with a segmentation fault each
>>> time I call it for example SELECT nbits_set(B'1101');
>>> My C skills are very sparse and am unable to debug the function, I have
>>> included the C code of this function. Is there something I may have left
>>> out?
>>>
>>>
>>>
>>> #include "postgres.h"
>>> #include "utils/varbit.h"
>>> #include "fmgr.h"
>>> #ifdef PG_MODULE_MAGIC
>>> PG_MODULE_MAGIC;
>>> #endif
>>>
>>> PG_FUNCTION_INFO_V1(nbits_set);
>>> Datum
>>> nbits_set(PG_FUNCTION_ARGS)
>>> {
>>> /* how many bits are set in a bitstring? */
>>> VarBit *a = PG_GETARG_VARBIT_P(0);
>>> int n=0;
>>> int i;
>>> unsigned char *ap = VARBITS(a);
>>> unsigned char aval;
>>> for (i=0; i < VARBITBYTES(a); ++i) {
>>> aval = *ap; ++ap;
>>> if (aval == 0) continue;
>>> if (aval & 1) ++n;
>>> if (aval & 2) ++n;
>>> if (aval & 4) ++n;
>>> if (aval & 8) ++n;
>>> if (aval & 16) ++n;
>>> if (aval & 32) ++n;
>>> if (aval & 64) ++n;
>>> if (aval & 128) ++n;
>>> }
>>> PG_RETURN_INT32(n);
>>> }
>>>
>>> Allan
>>>
>>> Bruce Momjian wrote:
>>>>
>>>> Jean-David Beyer wrote:
>>>>
>>>>>
>>>>> TJ O'Donnell wrote:
>>>>>
>>>>>>
>>>>>> I use a c function, nbits_set that will do what you need.
>>>>>> I've posted the code in this email.
>>>>>>
>>>>>> TJ O'Donnell
>>>>>> http://www.gnova.com
>>>>>>
>>>>>> #include "postgres.h"
>>>>>> #include "utils/varbit.h"
>>>>>>
>>>>>> Datum nbits_set(PG_FUNCTION_ARGS);
>>>>>> PG_FUNCTION_INFO_V1(nbits_set);
>>>>>> Datum
>>>>>> nbits_set(PG_FUNCTION_ARGS)
>>>>>> {
>>>>>> /* how many bits are set in a bitstring? */
>>>>>>
>>>>>> VarBit *a = PG_GETARG_VARBIT_P(0);
>>>>>> int n=0;
>>>>>> int i;
>>>>>> unsigned char *ap = VARBITS(a);
>>>>>> unsigned char aval;
>>>>>> for (i=0; i < VARBITBYTES(a); ++i) {
>>>>>> aval = *ap; ++ap;
>>>>>> if (aval == 0) continue;
>>>>>> if (aval & 1) ++n;
>>>>>> if (aval & 2) ++n;
>>>>>> if (aval & 4) ++n;
>>>>>> if (aval & 8) ++n;
>>>>>> if (aval & 16) ++n;
>>>>>> if (aval & 32) ++n;
>>>>>> if (aval & 64) ++n;
>>>>>> if (aval & 128) ++n;
>>>>>> }
>>>>>> PG_RETURN_INT32(n);
>>>>>> }
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> Hi all,
>>>>>>> Am looking for a fast and efficient way to count the number of bits
>>>>>>> set (to 1) in a VARBIT field. I am currently using
>>>>>>> "LENGTH(REGEXP_REPLACE(CAST(a.somefield_bit_code AS TEXT),'0','','g'))".
>>>>>>>
>>>>>>> Allan.
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>> When I had to do that, in days with smaller amounts of RAM, but very
>>>>> long
>>>>> bit-vectors, I used a faster function sort-of like this:
>>>>>
>>>>> static char table[256] = {
>>>>> 0,1,1,2,1,2,2,3,1,.....
>>>>> };
>>>>>
>>>>> Then like above, but instead of the loop,
>>>>>
>>>>> n+= table[aval];
>>>>>
>>>>>
>>>>> You get the idea.
>>>>>
>>>>
>>>> Uh, I was kind of confused by this, even when I saw a full
>>>> implementation:
>>>>
>>>> http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetTable
>>>>
>>>> Actually, this looks even better:
>>>>
>>>>
>>>> http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetKernighan
>>>>
>>>>
>>>
>>>
>>
>>
>
>

In response to

Browse pgsql-sql by date

  From Date Subject
Next Message Lennin Caro 2009-03-12 18:54:30 Re: Permanent alias for postgresql table
Previous Message Duffer Do 2009-03-12 17:28:19 select count of all overlapping geometries and return 0 if none.