From: | Teodor Sigaev <teodor(at)sigaev(dot)ru> |
---|---|
To: | Greg Stark <gsstark(at)mit(dot)edu> |
Cc: | Robert Haas <robertmhaas(at)gmail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Bloom index |
Date: | 2010-01-18 12:47:39 |
Message-ID: | 4B54586B.9000904@sigaev.ru |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
> So for example if your bloom filter is 4 bits per column your error
> rate for a single column where clause will be 1/2^(4/1.44) or a little
> worse than returning 1 record in 7. If you test two or three columns
> then it'll be about 1 in 49 or 1 in 343.
Hmm, I don't understand your calculations. In your example (4 bits per column,
index has only one column) each row is hashed by 4 bits in 80-bit vector
(signature), so, probability of collision is (1/80)^4
>
> Also, as it stands now it's an unordered list. I assume you have no
> plans to implement vacuum for this? Perhaps it would be better to
> store a btree of signatures ordered by htid. That would let you easily
> vacuum dead entries. Though it would necessitate random seeks to do
> the full scan of the index unless you can solve the full index scan
> concurrent with page splits problem.
Vacuum is already implemented by producing a full scan of index like all other
indexes.
--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/
From | Date | Subject | |
---|---|---|---|
Next Message | Greg Stark | 2010-01-18 13:13:18 | Re: Bloom index |
Previous Message | Heikki Linnakangas | 2010-01-18 12:17:57 | Re: New XLOG record indicating WAL-skipping |