Re: Bug with Tsearch and tsvector

From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Donald Fraser" <postgres(at)kiwi-fraser(dot)net>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "[BUGS]" <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: Bug with Tsearch and tsvector
Date: 2010-04-26 18:19:52
Message-ID: 4BD592F80200002500030DF9@gw.wicourts.gov
Lists: pgsql-bugs

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> ie the critical point seems to be that url_path is willing to soak
> up a string containing "<" and ">", so the span tags don't get
> recognized as separate lexemes. While that's "obviously" the
> wrong thing in this particular example, I'm not sure if it's the
> wrong thing in general. Can anyone comment on the frequency of
> usage of those two symbols in URLs?

http://www.ietf.org/rfc/rfc2396.txt section 2.4.3 "delims" expressly
forbids their use in URIs.
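
As a rough illustration only (this is not the tsearch parser's logic, and
the function names and sample string are hypothetical), a scanner that
followed the RFC on this point would end a URL token at the first "<" or
">" rather than soak it up:

```python
# Sketch under stated assumptions: treat "<" and ">" (two of the "delims"
# in RFC 2396 section 2.4.3) as characters that terminate a URL token.
ANGLE_DELIMS = "<>"

def first_angle_delim(text):
    """Return the index of the first "<" or ">" in text, or -1 if none."""
    positions = [text.find(ch) for ch in ANGLE_DELIMS if ch in text]
    return min(positions) if positions else -1

def strip_at_angle_delim(text):
    """Return the part of text that precedes the first "<" or ">"."""
    idx = first_angle_delim(text)
    return text if idx == -1 else text[:idx]

# Hypothetical input: a path immediately followed by a closing tag.
sample = "example.com/some/path</span>"
print(strip_at_angle_delim(sample))
```

Whether the real parser should behave this way is the open question; the
sketch only shows what the RFC's exclusion would imply for token
boundaries.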

> In any case it's weird that the URL lexeme doesn't span the same
> text as the url_path one, but I'm not sure which one we should
> consider wrong.

In spite of the above prohibition, I notice that Firefox and wget
both seem to *try* to use such characters if they're included.

-Kevin
