Re: BK-Tree Implementation on top of GiST

From: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
To: Volkan YAZICI <yazicivo(at)ttnet(dot)net(dot)tr>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: BK-Tree Implementation on top of GiST
Date: 2007-10-28 17:01:01
Message-ID: Pine.LNX.4.64.0710282000350.14368@sn.sai.msu.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Have you seen contrib/pg_trgm module ?

Oleg
On Sun, 28 Oct 2007, Volkan YAZICI wrote:

> Hi,
>
> In an address search framework of a company, we need to deal with
> queries including potential spelling errors. After some external
> address space constraints (e.g. match first letters, word length,
> etc.) we're still ending up with a huge data set to filter through
> Levenshtein like distance metrics.
>
> Sequential scanning a record set with roughly 100,000 entries through
> some sort of distance metric is not something we'd want in
> production. For this purpose, I plan to implement BK-Trees[1] on top
> of GiST, which will (technically) reduce overhead complexity from O(n)
> to O(logn). As far as I'm concerned, such a work will worth the time
> it will take when compared to overhead reduction it will bring.
>
> [1] Some approaches to best-match file searching
> http://portal.acm.org/citation.cfm?id=362003.362025
>
> Anyway, I have some experience with source code of intarray
> module. Does anybody have suggestions/warnings/comments about such a
> project? Would PostgreSQL team welcome such a patch to get integrated
> into fuzzystrmatch module, or should I create my own project at
> pgfoundry?
>
> BTW, as you'd imagine, related implementation won't be something
> specific to Levenshtein. Any distance metric on any kind of data will
> be able to benefit from BK-Trees.
>
>
> Regards.
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: if posting/reading through Usenet, please send an appropriate
> subscribe-nomail command to majordomo(at)postgresql(dot)org so that your
> message can get through to the mailing list cleanly
>

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Simon Riggs 2007-10-28 17:21:49 Re: Backend misfeasance for DEFAULT NULL
Previous Message Tom Lane 2007-10-28 16:44:04 Backend misfeasance for DEFAULT NULL