Re: Re: [BUGS] BUG #5021: ts_parse doesn't recognize email addresses with underscores

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, "Dan O'Hara" <danarasoftware(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
Subject: Re: Re: [BUGS] BUG #5021: ts_parse doesn't recognize email addresses with underscores
Date: 2010-03-13 01:36:55
Message-ID: 201003130136.o2D1atT04906@momjian.us
Lists: pgsql-bugs pgsql-hackers

Tom Lane wrote:
> Bruce Momjian <bruce(at)momjian(dot)us> writes:
> > Well, I think the big question is whether we need to honor RFC 5322
> > (http://www.rfc-editor.org/rfc/rfc5322.txt). Wikipedia says these are
> > all valid characters:
>
> > http://en.wikipedia.org/wiki/E-mail_address
>
> > * Uppercase and lowercase English letters (a-z, A-Z)
> > * Digits 0 to 9
> > * Characters ! # $ % & ' * + - / = ? ^ _ ` { | } ~
> > * Character . (dot, period, full stop) provided that it is not the
> > first or last character, and provided also that it does not appear two
> > or more times consecutively.
>
> That's an awful lot of special characters. For the RFC's purposes,
> it's not hard to be flexible because in an email message there is
> external context telling where to expect an address. I think if we
> tried to allow all of those in email addresses in tsearch, we'd have
> "email addresses" gobbling up a whole lot of adjacent text, to nobody's
> benefit.
>
> I can see the case for adding "+" because that's fairly common as Alvaro
> notes, but I think we should be very circumspect about going farther.

OK, I can add '+' using Teodor's patch as a guide, and document which
characters we support, and that we don't support all of them. What
about the binary upgrade issue? I am now worried that maybe we should
back out the patch and just document our restrictions.
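
To make the trade-off Tom describes concrete, here is a rough Python sketch. It is not the tsearch parser, which is a C state machine, and the regular expressions and names (RFC_ATOM, NARROW_ATOM, DOMAIN, find_emails) are hypothetical. It compares a local-part character set taken from the Wikipedia list quoted above against a much narrower set. The narrow set (letters, digits, '_', '-', '+', and interior dots) is an assumption chosen for illustration, not a statement of what the patch accepts.

```python
import re

# Local-part characters from the quoted Wikipedia list (dot handled separately).
RFC_ATOM = r"[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]+"
# Hypothetical conservative set: letters, digits, '_', '+', '-'.
NARROW_ATOM = r"[A-Za-z0-9_+-]+"
# Simplified domain: dot-separated labels, at least one dot (an assumption).
DOMAIN = r"[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+"


def find_emails(text, atom):
    """Return every substring of text that looks like an address.

    A dot may appear in the local part only between two atoms, so it is
    never first, last, or doubled.
    """
    pattern = re.compile(rf"{atom}(?:\.{atom})*@{DOMAIN}")
    return [m.group(0) for m in pattern.finditer(text)]


SAMPLE = "contact {sales|support}=joe+list@example.com today"

print(find_emails(SAMPLE, RFC_ATOM))
# ['{sales|support}=joe+list@example.com']
print(find_emails(SAMPLE, NARROW_ATOM))
# ['joe+list@example.com']
```

With no external context to say where an address starts, the permissive set swallows the preceding "{sales|support}=" text, while the narrow set still picks up the '+' form Alvaro mentioned.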

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

PG East: http://www.enterprisedb.com/community/nav-pg-east-2010.do
