Bloom filters bloom filters bloom filters

From: Greg Stark <stark(at)mit(dot)edu>
To: "<pgsql-hackers(at)postgresql(dot)org>" <pgsql-hackers(at)postgresql(dot)org>
Subject: Bloom filters bloom filters bloom filters
Date: 2010-01-18 13:23:21
Message-ID: 407d949e1001180523je6e9cddne537dc8d47b6682@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Another idea of how to make use of bloom filters:

It should be possible to implement a gist opclass over a bloomfilter
data type which implements the operation int membership in int[].

All you need is a function which takes an int[] and returns a bloom
filter. Then your union operation is to just bitwise or the two bloom
filters. Your penalty function would be the number of bits required to
set in the filter which aren't already set. You could even construct
different size bloom filters depending on how large the original set
is. When you union two different size filters you need to expand the
smaller ones before doing the union but that's an easy operation.

I recall gist has a compression function already for storing
signatures of large data types for lossy indexing. Have I just
described how that works already? I don't recall seeing anything like
this in there currently.

--
greg

Browse pgsql-hackers by date

  From Date Subject
Next Message Fujii Masao 2010-01-18 13:25:07 Re: New XLOG record indicating WAL-skipping
Previous Message Greg Stark 2010-01-18 13:13:18 Re: Bloom index