Re: WIP: BRIN bloom indexes

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Sokolov Yura <y(dot)sokolov(at)postgrespro(dot)ru>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP: BRIN bloom indexes
Date: 2017-10-27 20:09:16
Message-ID: 0f2b84e6-6c0c-689f-1a4a-4186b511e437@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 10/27/2017 05:22 PM, Sokolov Yura wrote:
>
> Hi, Tomas
>
> BRIN bloom index is a really cool feature, that definitely should be in
> core distribution (either in contrib or builtin)!!!
>
> Small suggestion for algorithm:
>
> It is well known practice not to calculate whole hash function for every
> point, but use double hashing approach.
> Example of proving double hashing approach for bloom filters:
> https://www.eecs.harvard.edu/~michaelm/postscripts/rsa2008.pdf
>
> So, instead of:
>     for (i = 0; i < filter->nhashes; i++)
>     {
>         /* compute the hashes with a seed, used for the bloom filter */
>         uint32 h = ((uint32)DatumGetInt64(hash_uint32_extended(value,
> i))) % filter->nbits;
>
>         /* set or check bit h */
>     }
>
> such construction could be used:
>     /* compute the hashes, used for the bloom filter */
>     uint32 big_h = ((uint32)DatumGetInt64(hash_uint32_extended(value, i)));
>     uint32 h = big_h % filter->nbits;
>     /* ensures d is never >= filter->nbits */
>     uint32 d = big_h % (filter->nbits - 1);
>
>     for (i = 0; i < filter->nhashes; i++)
>     {
>         /* set or check bit h */
>
>         /* compute next bit h */
>         h += d++;
>         if (h >= filter->nbits) h -= filter->nbits;
>         if (d == filter->nbits) d = 0;
>     }
>
> Modulo of one 64bit result by two coprime numbers (`nbits` and `nbits-1`)
> gives two good-quality functions usable for double hashing.
>

OK, thanks for the suggestion.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2017-10-27 20:18:06 Re: Proposal: Local indexes for partitioned table
Previous Message Tomas Vondra 2017-10-27 20:06:58 Re: WIP: BRIN bloom indexes