Re: [GENERAL] contrib/levenshtein() has a bug?

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Ben <bench(at)silentmedia(dot)com>, PostgreSQL-patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: [GENERAL] contrib/levenshtein() has a bug?
Date: 2007-02-13 18:01:14
Message-ID: 200702131801.l1DI1E127975@momjian.us
Lists: pgsql-general pgsql-patches

Tom Lane wrote:
> Ben <bench(at)silentmedia(dot)com> writes:
> > The levenshtein function from contrib/fuzzystrmatch.sql has a max arg
> > length of 255. OK, that's cool. But check this out:
>
> > mbrainz_db=> select max(length(name)) from public.track;
> >  max
> > -----
> >  255
> > (1 row)
>
> > mbrainz_db=> select levenshtein(name,'foo') from public.track;
> > ERROR: argument exceeds max length: 255
>
> > That seems odd.
>
> length() measures in characters whereas the limit in question is being
> enforced in bytes. You got any multibyte characters in there?

I have updated the error message to mention bytes, attached.
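
The character/byte distinction Tom describes can be sketched outside the database. This is only an illustration in Python, not the contrib code: the UTF-8 encoding, the sample string, and the helper name exceeds_byte_limit are assumptions; only the 255 limit comes from the error message quoted above.

```python
# Illustrative sketch only; not the fuzzystrmatch implementation.
MAX_BYTES = 255  # limit taken from the quoted error message


def exceeds_byte_limit(text, encoding="utf-8"):
    """Hypothetical helper: True if the encoded form is longer than MAX_BYTES."""
    return len(text.encode(encoding)) > MAX_BYTES


# Assumed sample value: 255 characters, one of which takes two bytes in UTF-8.
name = "\u00e9" + "a" * 254

char_length = len(name)                  # counts characters, like length()
byte_length = len(name.encode("utf-8"))  # counts bytes, which is what the limit checks

print(char_length, byte_length, exceeds_byte_limit(name))
```

So a column whose longest value is 255 characters can still trip a 255-byte limit as soon as one value contains a multibyte character.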

> (It looks to me like levenshtein() is utterly non-multibyte-aware,
> which is probably a bug in itself.)

Is this a TODO?
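
To make the multibyte concern concrete, here is a rough sketch of how an edit distance computed over bytes can differ from one computed over characters. It is a textbook dynamic-programming Levenshtein written in Python for illustration, not the contrib C code; the UTF-8 encoding and the sample strings are assumptions.

```python
def levenshtein(a, b):
    """Textbook edit distance over any two sequences (str or bytes)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


# Assumed sample strings: they differ by a single character.
s, t = "\u00e9", "e"

char_distance = levenshtein(s, t)
byte_distance = levenshtein(s.encode("utf-8"), t.encode("utf-8"))

print(char_distance, byte_distance)
```

Compared character by character, the two strings are one substitution apart. Compared byte by byte, the accented letter occupies two bytes in UTF-8, so the distance comes out larger.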

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

Attachment Content-Type Size
/rtmp/diff text/x-diff 1.9 KB
