From: | Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> |
---|---|
To: | Sokolov Yura <y(dot)sokolov(at)postgrespro(dot)ru> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: WIP: BRIN bloom indexes |
Date: | 2017-10-27 20:09:16 |
Message-ID: | 0f2b84e6-6c0c-689f-1a4a-4186b511e437@2ndquadrant.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
On 10/27/2017 05:22 PM, Sokolov Yura wrote:
>
> Hi, Tomas
>
> BRIN bloom index is a really cool feature, that definitely should be in
> core distribution (either in contrib or builtin)!!!
>
> Small suggestion for algorithm:
>
> It is well known practice not to calculate whole hash function for every
> point, but use double hashing approach.
> Example of proving double hashing approach for bloom filters:
> https://www.eecs.harvard.edu/~michaelm/postscripts/rsa2008.pdf
>
> So, instead of:
> for (i = 0; i < filter->nhashes; i++)
> {
> /* compute the hashes with a seed, used for the bloom filter */
> uint32 h = ((uint32)DatumGetInt64(hash_uint32_extended(value,
> i))) % filter->nbits;
>
> /* set or check bit h */
> }
>
> such construction could be used:
> /* compute the hashes, used for the bloom filter */
> uint32 big_h = ((uint32)DatumGetInt64(hash_uint32_extended(value, i)));
> uint32 h = big_h % filter->nbits;
> /* ensures d is never >= filter->nbits */
> uint32 d = big_h % (filter->nbits - 1);
>
> for (i = 0; i < filter->nhashes; i++)
> {
> /* set or check bit h */
>
> /* compute next bit h */
> h += d++;
> if (h >= filter->nbits) h -= filter->nbits;
> if (d == filter->nbits) d = 0;
> }
>
> Modulo of one 64bit result by two coprime numbers (`nbits` and `nbits-1`)
> gives two good-quality functions usable for double hashing.
>
OK, thanks for the suggestion.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From | Date | Subject | |
---|---|---|---|
Next Message | Robert Haas | 2017-10-27 20:18:06 | Re: Proposal: Local indexes for partitioned table |
Previous Message | Tomas Vondra | 2017-10-27 20:06:58 | Re: WIP: BRIN bloom indexes |