Quick Links

Bloom Filter indexes?

From:	Gregory Maxwell <gmaxwell(at)gmail(dot)com>
To:	pgsql-hackers(at)postgresql(dot)org
Subject:	Bloom Filter indexes?
Date:	2005-05-28 22:25:24
Message-ID:	e692861c050528152528230b41@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

Has any thought been given to adding bloom filter indexes to PostgreSQL?

A bloom index would be created on a column, and could then be used to
accelerate exact matches where it is common that the user may query
for a value that doesn't exist. For example, with the query select
userid from user_table where name="notauser", the failure could be
returned instantly, in most cases.

A bloom filter index could be used to accelerate joins, esp full outer joins.

Insertions into a bloom filter are very cheap. Updates could be done
as an insert. Deletes are expensive (either you make a refcounted
filter or you regenerate the filter). However, since bloom filters
have false positives, it would be acceptable to regenerate the filter
during a vacuum if there have been entries deleted or updated. The
filter could be resized at vacuum time based on statistics gathered
during execution.

It would also be useful to have an array bloom index: store a bloom
filter per record for an arrayed field, as well as the bloom filter
for all records. This would allow membership tests for a field
containing large arrays to happen very quickly. Perhaps useful for GIS
and full text indexing applications.

Responses

Bloom Filter indexes? at 2005-05-29 01:29:38 from Gregory Maxwell
Re: Bloom Filter indexes? at 2005-05-29 06:26:09 from Oleg Bartunov

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Gregory Maxwell	2005-05-29 01:29:38	Bloom Filter indexes?
Previous Message	Tom Lane	2005-05-28 20:40:38	Inefficiency in recent pgtz patch