Re: scoring differences between bitmasks

From: Ben <bench(at)silentmedia(dot)com>
To: "Todd A(dot) Cook" <tcook(at)blackducksoftware(dot)com>
Cc: Postgresql-General mailing list <pgsql-general(at)postgresql(dot)org>
Subject: Re: scoring differences between bitmasks
Date: 2005-10-02 18:49:03
Message-ID: F9149CFE-05C3-40D1-A6D8-58592A796B6E@silentmedia.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Just the number of bits, not which ones. Basically, the hamming
distance.

On Oct 2, 2005, at 11:44 AM, Todd A. Cook wrote:

> Hi,
>
> It may be that I don't understand your problem. :)
>
> Are you searching the table for the closest vector? If so, is
> "closeness" defined only as the number of bits that are different?
> Or, do you need to know which bits as well?
>
> -- todd
>
>
> Ben wrote:
>
>> Hrm, I don't understand. Can you give me an example with some
>> reasonably sized vectors?
>> On Oct 2, 2005, at 10:59 AM, Todd A. Cook wrote:
>>
>>> Hi,
>>>
>>> Try breaking the vector into 4 bigint columns and building a
>>> multi- column
>>> index, with index columns going from the most evenly distributed
>>> to the
>>> least. Depending on the distribution of your data, you may only
>>> need 2
>>> or 3 columns in the index. If you can cluster the table in that
>>> order,
>>> it should be really fast. (This structure is a tabular form of
>>> a linked
>>> trie.)
>>>
>>> -- todd
>>>
>>>
>>> Ben wrote:
>>>
>>>
>>>> Yes, that's the straightforward way to do it. But given that
>>>> my vectors are 256 bits in length, and that I'm going to
>>>> eventually have about 4 million of them to search through, I
>>>> was hoping greater minds than mine had figured out how to do
>>>> it faster, or how compute some kind of indexing....... somehow.
>>>>
>
>

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Todd A. Cook 2005-10-02 19:14:12 Re: scoring differences between bitmasks
Previous Message Todd A. Cook 2005-10-02 18:44:37 Re: scoring differences between bitmasks