Scoring

From: "Eric Jain" <jain(at)gmx(dot)net>
To: <pgsql-general(at)postgresql(dot)org>
Subject: Scoring
Date: 2000-07-05 21:04:13
Message-ID: NCBBJFHBEGOIAHBCBNCLAEJACIAA.jain@gmx.net
Lists: pgsql-general

Any tips on how to efficiently score fields against AltaVista-like query
strings?

I currently use the following PL/Perl function, which unfortunately is
rather slow, even though I have already simplified it quite a bit...

-- Example: SELECT id, score(description, 'a?pha -"beta gamma"') FROM table;

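CREATE FUNCTION score(TEXT, VARCHAR) RETURNS INT2 AS
'
    my ($text, $query) = @_;
    my @regex = ();
    my $score = 0;

    # Pull out quoted phrases, optionally negated with a leading "-".
    $query =~ s{(-)?"(.+?)"}{ push(@regex, ($1 || "") . $2); "" }egs;

    # The remaining terms are separated by whitespace; skip empty strings.
    push(@regex, grep { length } split(/\\s+/, $query));

    foreach my $term (@regex)
    {
        # "?" in the query matches any single character.
        $term =~ s/\\?/./g;

        if ($term =~ s/^-//)
        {
            # An excluded term that matches disqualifies the row.
            return 0 if ($text =~ /\\b$term/i);
        }
        else
        {
            # Every required term must match at least once.
            my @matches = $text =~ /\\b$term/gi;
            return 0 unless @matches;
            $score += scalar @matches;
        }
    }

    return $score;
'
LANGUAGE 'plperl';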
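
To make the query syntax concrete, here is a minimal sketch of just the
parsing step as a standalone Perl script, run outside PostgreSQL. The name
parse_query is made up for illustration, the backslashes are single because
the code is not embedded in a quoted SQL string, and splitting on runs of
whitespace (dropping empty terms) is an assumption about the intended
behaviour.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a query string into a list of terms.
# Quoted phrases come first, keeping a leading "-" as the exclusion marker.
sub parse_query {
    my ($query) = @_;
    my @terms;

    # Remove each quoted phrase from the query and remember it.
    $query =~ s{(-)?"(.+?)"}{ push(@terms, ($1 || "") . $2); "" }egs;

    # What is left is split on whitespace; empty strings are dropped.
    push(@terms, grep { length } split(/\s+/, $query));

    return @terms;
}

print join("|", parse_query('a?pha -"beta gamma"')), "\n";   # -beta gamma|a?pha
print join("|", parse_query('"foo bar"  baz')), "\n";        # foo bar|baz
```

Each term is then matched against the text: a leading "-" means the term
must not occur, and "?" stands for any single character.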

--
Eric Jain
