Re: proposal : cross-column stats

From: tv(at)fuzzy(dot)cz
To: "Nicolas Barbier" <nicolas(dot)barbier(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: proposal : cross-column stats
Date: 2010-12-24 12:15:00
Message-ID: 3e869cc4f3a74f9cddff605f57f32605.squirrel@sq.gransy.com
Lists: pgsql-hackers

> 2010/12/24 Florian Pflug <fgp(at)phlo(dot)org>:
>
>> On Dec23, 2010, at 20:39 , Tomas Vondra wrote:
>>
>>>   I guess we could use the highest possible value (equal to the number
>>>   of tuples) - according to wiki you need about 10 bits per element
>>>   with 1% error, i.e. about 10MB of memory for each million
>>>   elements.
>>
>> Drat. I had expected these numbers to come out quite a bit lower than
>> that, at least for a higher error target. But even with 10% false
>> positive rate, it's still 4.5MB per 1e6 elements. Still too much to
>> assume the filter will always fit into memory, I fear :-(
>
> I have the impression that both of you are forgetting that there are 8
> bits in a byte. 10 bits per element = 1.25MB per million elements.

We are aware of that, but we only needed some very rough estimates, and
it's much easier to do the calculations with a round 10. Actually, according
to Wikipedia it's not 10 bits per element but 9.6, etc. But it really does
not matter whether it's 10MB or 20MB of data, it's still a lot of data ...
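(For reference only, not part of any patch: a quick standalone C sketch that
plugs the numbers from this thread into the standard Bloom filter sizing
formula m/n = -ln(p) / (ln 2)^2; the one-million element count and the
1% / 10% false positive rates are just the figures mentioned above.)

#include <stdio.h>
#include <math.h>

/*
 * Optimal Bloom filter size: m/n = -ln(p) / (ln 2)^2 bits per element,
 * where p is the target false positive rate.
 */
static double
bits_per_element(double p)
{
    return -log(p) / (log(2.0) * log(2.0));
}

int
main(void)
{
    const double n = 1e6;                 /* one million elements */
    const double rates[] = {0.01, 0.10};  /* 1% and 10% false positives */
    int          i;

    for (i = 0; i < 2; i++)
    {
        double bits = bits_per_element(rates[i]);

        printf("p = %2.0f%%: %.1f bits/element, %.2f MB per million elements\n",
               rates[i] * 100.0, bits, bits * n / 8.0 / 1e6);
    }
    return 0;
}

With these inputs it prints roughly 9.6 bits/element (about 1.2MB per million
elements) for 1% and 4.8 bits/element (about 0.6MB) for 10%, which is the same
bits-vs-bytes correction Nicolas is making above.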

Tomas
