Re: Fuzzy matching?

From: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, Joe Conway <joseph(dot)conway(at)home(dot)com>, Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>, <pgsql-sql(at)postgresql(dot)org>
Subject: Re: Fuzzy matching?
Date: 2001-08-02 16:16:29
Message-ID: Pine.GSO.4.33.0108021911500.27718-100000@ra.sai.msu.su
Lists: pgsql-patches pgsql-sql

I have C code which implements Levenshtein distance algorithms.
Here is the header, for some info:

// Levenstein Distance Algorithms Module
//
// Contains functions to do "Inexact Alphanumeric Comparisons". The original
// concept for this code came from "The C Users Journal", May 1991, starting
// on page 127. The author was Hans Z. Zwakenberg.
//
// These routines have been modified so they do not need to use double
// subscripted arrays in the CUJ code. An additional function was created to
// "look up" a word from a list, returning the most likely matches.
//
// Written initially by R. Bruce Roberts, MCI Systems Engineering, 1/7/93
// Compiled with Borland C++, V3.1
//

It compiles and works well. I don't know about the license, but the code is
published, and it's possible to ask the author directly.

zen:~/app/algo/dist/ldist$ ./test simple sample
Returns 'Levenstein Distance' of two strings on command line.
Determining Distance for simple and sample.
Length is limited to 40 characters.
Words are alike, L Distance is 1, Threshold was 3.
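
For readers without the CUJ listing, here is a minimal sketch of the idea the
header describes: computing the distance without a double-subscripted array,
by keeping a single rolling row. This is not the Zwakenberg/Roberts code. The
names lev_distance and lev_alike are hypothetical, the 40-character limit is
not reproduced, and treating the threshold as inclusive is an assumption.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not the CUJ code): edit distance between a and b,
 * computed with one rolling row instead of a two-dimensional table.
 * Returns -1 if the row cannot be allocated. */
int lev_distance(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);

    size_t la = strlen(a);
    size_t lb = strlen(b);
    size_t *row = malloc((lb + 1) * sizeof *row);
    if (row == NULL)
        return -1;

    /* Row 0: turning "" into the first j characters of b costs j inserts. */
    for (size_t j = 0; j <= lb; j++)
        row[j] = j;

    for (size_t i = 1; i <= la; i++) {
        size_t diag = row[0];   /* value of cell (i-1, j-1) */
        row[0] = i;             /* first i characters of a against "" */
        for (size_t j = 1; j <= lb; j++) {
            size_t up = row[j]; /* value of cell (i-1, j) */
            size_t cost = (a[i - 1] == b[j - 1]) ? 0 : 1;
            size_t best = diag + cost;        /* match or substitution */
            if (up + 1 < best)
                best = up + 1;                /* deletion from a */
            if (row[j - 1] + 1 < best)
                best = row[j - 1] + 1;        /* insertion into a */
            row[j] = best;
            diag = up;
        }
    }

    int result = (int)row[lb];
    free(row);
    return result;
}

/* Hypothetical threshold test: nonzero when the distance is at most
 * `threshold` (inclusive comparison is an assumption). */
int lev_alike(const char *a, const char *b, int threshold)
{
    int d = lev_distance(a, b);
    return d >= 0 && d <= threshold;
}
```

With this sketch, lev_distance("simple", "sample") gives 1, which agrees with
the output of the test program above, and lev_alike("simple", "sample", 3)
is true.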

Please let me know if you need it. I have thought about implementing the
same feature that Google has in our OpenFTS search.

Regards,

Oleg

On Tue, 31 Jul 2001, Tom Lane wrote:

> "Josh Berkus" <josh(at)agliodbs(dot)com> writes:
> > I'm quite interested, myself. How difficult is it for somebody that
> > doesn't program C to attach a function from the Contrib directory?
>
> Run the install script.
>
> > If it's not very difficult, then I'd recommend putting metaphone in
> > /contrib, and levenstein in the backend. My reasoning is that
> > levenstein is useful for all roman alphabets, but metaphone is not so
> > useful for non-english versions of postgres.
>
> Our usual practice with stuff of uncertain usefulness has been to stick
> it in contrib for awhile and see if anyone uses it. If there's
> sufficient interest, we'll promote it to mainstream in a future release.
>
> regards, tom lane
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: Have you searched our list archives?
>
> http://www.postgresql.org/search.mpl
>

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83
