Re: proposal : cross-column stats

From: tv(at)fuzzy(dot)cz
To: "Nicolas Barbier" <nicolas(dot)barbier(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: proposal : cross-column stats
Date: 2010-12-24 12:15:00
Message-ID: 3e869cc4f3a74f9cddff605f57f32605.squirrel@sq.gransy.com
Lists: pgsql-hackers

> 2010/12/24 Florian Pflug <fgp(at)phlo(dot)org>:
>
>> On Dec23, 2010, at 20:39 , Tomas Vondra wrote:
>>
>>>   I guess we could use the highest possible value (equal to the number
>>>   of tuples) - according to wiki you need about 10 bits per element
>>>   with 1% error, i.e. about 10MB of memory for each million
>>>   elements.
>>
>> Drat. I had expected these numbers to come out quite a bit lower than
>> that, at least for a higher error target. But even with 10% false
>> positive rate, it's still 4.5MB per 1e6 elements. Still too much to
>> assume the filter will always fit into memory, I fear :-(
>
> I have the impression that both of you are forgetting that there are 8
> bits in a byte. 10 bits per element = 1.25MB per million elements.

We are aware of that, but we only needed some very rough estimates, and
it's much easier to do the calculations with a round 10. Actually, according
to Wikipedia it's not 10 bits per element but 9.6, etc. But it really does
not matter whether it's 10MB or 20MB of data, it's still a lot of data ...
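(For reference only, not part of any patch: a quick standalone C sketch that
plugs the numbers from this thread into the standard Bloom filter sizing
formula m/n = -ln(p) / (ln 2)^2; the one-million element count and the
1% / 10% false positive rates are just the figures mentioned above.)

#include <stdio.h>
#include <math.h>

/*
 * Optimal Bloom filter size: m/n = -ln(p) / (ln 2)^2 bits per element,
 * where p is the target false positive rate.
 */
static double
bits_per_element(double p)
{
    return -log(p) / (log(2.0) * log(2.0));
}

int
main(void)
{
    const double n = 1e6;                 /* one million elements */
    const double rates[] = {0.01, 0.10};  /* 1% and 10% false positives */
    int          i;

    for (i = 0; i < 2; i++)
    {
        double bits = bits_per_element(rates[i]);

        printf("p = %2.0f%%: %.1f bits/element, %.2f MB per million elements\n",
               rates[i] * 100.0, bits, bits * n / 8.0 / 1e6);
    }
    return 0;
}

With these inputs it prints roughly 9.6 bits/element (about 1.2MB per million
elements) for 1% and 4.8 bits/element (about 0.6MB) for 10%, which is the same
bits-vs-bytes correction Nicolas is making above.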

Tomas
