Re: Select all invalid e-mail addresses

From: "Dann Corbit" <DCorbit(at)connx(dot)com>
To: "Steve Atkins" <steve(at)blighty(dot)com>, <pgsql-general(at)postgresql(dot)org>
Subject: Re: Select all invalid e-mail addresses
Date: 2005-10-20 20:19:21
Message-ID: D425483C2C5C9F49B5B7A41F8944154757D219@postal.corporate.connx.com
Lists: pgsql-general

Interesting article:
http://coveryourasp.com/ValidateEmail.asp

See also:
http://search.cpan.org/~cwest/Email-Address-1.80/lib/Email/Address.pm
http://www.faqs.org/rfcs/rfc2822.html
http://docs.python.org/lib/module-rfc822.html

> -----Original Message-----
> From: pgsql-general-owner(at)postgresql(dot)org
> [mailto:pgsql-general-owner(at)postgresql(dot)org] On Behalf Of Steve Atkins
> Sent: Thursday, October 20, 2005 12:35 PM
> To: pgsql-general(at)postgresql(dot)org
> Subject: Re: [GENERAL] Select all invalid e-mail addresses
>
> On Thu, Oct 20, 2005 at 11:52:40AM -0400, Andrew Sullivan wrote:
> > On Thu, Oct 20, 2005 at 06:10:40PM +0300, Andrus wrote:
> > > From this thread I got the regular expression
> >
> > [snipped]
> >
> > Note that that regular expression, which appears to be validating
> > TLDs as well, is incredibly fragile. John Klensin has actually
> > written an RFC about this very problem. Among other problems, what
> > do you do when a country code ceases to be? (There's a similar
> > problem that the naming bodies struggle with from time to time.)
> >
> > I suggest that if you want to validate TLDs, you pull them off when
> > you write the data in your database, and use a lookup table to make
> > sure they're valid (you can keep the table up to date regularly by
> > checking the official IANA registry for them). At least that way you
> > don't have to change a regex every time ICANN decides to add another
> > TLD.
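
A minimal Python sketch of the lookup-table check described above, under
some assumptions: the VALID_TLDS set stands in for a table that would be
refreshed from the IANA registry, and extract_tld and has_known_tld are
hypothetical helper names, not anything from this thread. It only splits
off and checks the TLD; it does not validate the rest of the address.

```python
# Hypothetical snapshot of the TLD lookup table. In practice this would be
# loaded from a database table kept in sync with the IANA registry.
VALID_TLDS = {"com", "org", "net", "uk", "de"}


def extract_tld(address):
    """Return the lowercased label after the last dot of the domain, or None."""
    local, sep, domain = address.rpartition("@")
    if not sep or not local or "." not in domain:
        return None
    tld = domain.rsplit(".", 1)[1].lower()
    return tld or None  # a trailing dot leaves an empty label


def has_known_tld(address, valid_tlds=VALID_TLDS):
    """True if the address ends in a TLD present in the lookup set."""
    tld = extract_tld(address)
    return tld is not None and tld in valid_tlds
```

Adding or retiring a TLD then means changing a row in the table (here, an
entry in the set) rather than editing a regular expression.
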
>
> You need to maintain the data, certainly. To argue that it must
> be in a table to be maintained is, well, wrong. My preference would
> be to keep it in a table and regenerate the regex periodically, and
> in the application layer I do exactly that, but to try and do that
> in a check constraint would be painful. A cleaner approach would
> be to have a regex that checks for general syntax and extracts the
> TLD, which is then compared to a lookup table, perhaps, but that
> adds a lot of complexity for no real benefit.
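
One way the "keep it in a table and regenerate the regex periodically"
idea might look in the application layer, sketched in Python.
build_email_regex is a hypothetical name, the syntax part of the pattern
is a deliberately loose approximation rather than a full RFC 2822
grammar, and the TLD list is assumed to come from whatever table is being
maintained.

```python
import re


def build_email_regex(tlds):
    """Compile a pattern that accepts only addresses ending in one of `tlds`."""
    unique = sorted(set(t.lower() for t in tlds), key=lambda t: (-len(t), t))
    if not unique:
        raise ValueError("at least one TLD is required")
    alternation = "|".join(re.escape(t) for t in unique)
    # Loose syntax check: a local part without '@' or whitespace, then one or
    # more dot-terminated host labels, then a TLD from the table.
    pattern = (
        r"[^@\s]+@"
        r"(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+"
        r"(?:" + alternation + r")"
    )
    return re.compile(pattern, re.IGNORECASE)


def is_valid(address, compiled):
    """True if the whole address matches the generated pattern."""
    return compiled.fullmatch(address) is not None
```

Rebuilding is then a matter of calling build_email_regex again with the
current contents of the table, for example after a new TLD such as .mobi
is added.
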
>
> > (The regex is wrong anyway, I think: it doesn't have .mobi,
> > which has been announced although isn't taking registrations yet, and
> > it doesn't appear to have arpa, either.)
>
> While there are valid deliverable email addresses in .arpa, you really
> don't want to be accepting them from end users...
>
> Cheers,
> Steve
