Re: String Similarity

From: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
To: Mark Woodward <pgsql(at)mohawksoft(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: String Similarity
Date: 2006-05-20 04:30:09
Message-ID: Pine.GSO.4.63.0605200829180.29144@ra.sai.msu.su
Lists: pgsql-hackers

Get pg_trgm http://www.sai.msu.su/~megera/oddmuse/index.cgi/ReadmeTrgm
It doesn't depend on language.
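
For readers unfamiliar with the idea, here is a rough Python sketch of trigram matching in the spirit of pg_trgm. It is not the module's actual code: the function names `trigrams` and `similarity` are chosen for illustration, and the word-splitting regex is an assumption. The padding rule (two spaces before each word, one after) and the shared-over-total ratio follow the behaviour described in the pg_trgm README.

```python
import re


def trigrams(text):
    """Return the set of trigrams of text, computed word by word.

    Each word is lowercased and padded with two leading spaces and one
    trailing space, as the pg_trgm README describes. Splitting on
    non-alphanumeric characters is an assumption of this sketch.
    """
    grams = set()
    for word in re.findall(r"[^\W_]+", text.lower()):
        padded = "  " + word + " "
        for i in range(len(padded) - 2):
            grams.add(padded[i:i + 3])
    return grams


def similarity(a, b):
    """Shared trigrams divided by all distinct trigrams, from 0.0 to 1.0."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


a = "pink floyd - dark side of the moon - money"
b = "dark side of the moon - pink floyd - money"
print(similarity(a, b))  # 1.0: same words, so the trigram sets are identical
```

Because trigrams are taken per word, reordering the words does not change the set, so all the reordered titles quoted below would score as identical under this sketch. In the database itself the module exposes a `similarity(text, text)` function and a `%` operator for the same comparison.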

Oleg
On Fri, 19 May 2006, Mark Woodward wrote:

> I have a side project that needs to "intelligently" know if two strings
> are contextually similar. Think about how CDDB information is collected
> and sorted. It isn't perfect, but there should be enough information to be
> usable.
>
> Think about this:
>
> "pink floyd - dark side of the moon - money"
> "dark side of the moon - pink floyd - money"
> "money - dark side of the moon - pink floyd"
> etc.
>
> To a human, these strings are almost identical. Similarly:
>
> "dark floyd of money moon pink side the"
>
> Is a puzzle to be solved by 13-year-old children before the movie starts.
>
> My post has three questions:
>
> (1) Does anyone know of an efficient and numerically quantified method of
> detecting these sorts of things? I currently have a fairly inefficient and
> numerically bogus solution that may be the only non-impossible solution
> for the problem.
>
> (2) Does anyone see a need for this feature in PostgreSQL? If so, what
> kind of interface would be best accepted as a patch? I am currently
> returning a match likelihood between 0 and 100.
>
> (3) Is there also a desire for a Levenshtein distance function for text
> and varchars? I experimented with it, and was forced to write the function
> in item #1.
>

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83
