This is *not* a novice question. I'm not sure where else you'd post it, though.
> Ok, the basic question: does anyone have any approximate string
> algorithms coded such that PostgreSQL can use it efficiently? I would
> to handle inserts/deletes. I already have a perl and LotusScript
> for Domino) implementation but I haven't ever been able to get the
> module to install right with PostgreSQL.
Metaphone, Soundex, and Levenshtein were built for PostgreSQL by Joe
Conway. Find them in the /contrib directory.
> Wu-Manber k-differences: it's an algorithm that measures how many
> are required to turn one string into another. k is the number of
> This is also known as the Levenshtein distance. I'm getting this
> from the
> Perl Algorithm book.
Levenshtein is available in /contrib. It works well for the database
I use it on, though that only has 7000 records, so you'll have to test
it yourself on really large tables.
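
For anyone who wants to see what the distance actually computes, here is a minimal Python sketch of the textbook dynamic-programming Levenshtein distance. It is not the /contrib code, and the function name `levenshtein` is just my choice for the illustration. It counts the single-character inserts, deletes, and substitutions needed to turn one string into the other.

```python
def levenshtein(a: str, b: str) -> int:
    """Count the single-character edits needed to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # the distance is symmetric; keep the shorter string in the inner loop

    # previous[j] holds the distance between the processed prefix of a and b[:j]
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # turning a[:i] into the empty string takes i deletes
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute ca with cb (free if equal)
            ))
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))
```

The work is proportional to the product of the two string lengths for every pair compared, which is why results on a small table may not carry over to a really large one.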
If you're deduplicating, I wrote a sophisticated name-alike function
using Levenshtein and Metaphone in PL/pgSQL and posted it to Roberto
Mello's function library (accessible from TechDocs).
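
To give a rough idea of how a phonetic code and an edit distance can be combined for deduplication, here is a small Python sketch. It is not the PL/pgSQL function mentioned above. It swaps in Soundex for Metaphone because Soundex is short enough to show in full, and the names `soundex`, `levenshtein`, `looks_alike`, and the `max_distance` threshold are all assumptions made for this illustration.

```python
# Soundex digit for each consonant; vowels and H, W, Y have no digit.
CODES = {
    letter: digit
    for digit, letters in {
        "1": "BFPV", "2": "CGJKQSXZ", "3": "DT",
        "4": "L", "5": "MN", "6": "R",
    }.items()
    for letter in letters
}


def soundex(name: str) -> str:
    """Four-character American Soundex code, or "" if name has no letters."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0]
    last = CODES.get(letters[0], "")
    for c in letters[1:]:
        code = CODES.get(c, "")
        if code and code != last:
            result += code
        if c not in "HW":  # H and W do not separate repeated codes
            last = code
    return (result + "000")[:4]


def levenshtein(a: str, b: str) -> int:
    """Count the single-character edits needed to turn a into b."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # delete
                current[j - 1] + 1,            # insert
                previous[j - 1] + (ca != cb),  # substitute
            ))
        previous = current
    return previous[-1]


def looks_alike(a: str, b: str, max_distance: int = 2) -> bool:
    """Hypothetical rule: same phonetic code and a small edit distance."""
    return (soundex(a) == soundex(b)
            and levenshtein(a.lower(), b.lower()) <= max_distance)


print(looks_alike("Smith", "Smyth"), looks_alike("Smith", "Jones"))
```

Requiring both tests to pass is just one possible design. The phonetic code is cheap to compute once per row, so it can narrow the candidates before the more expensive distance is run on each pair.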