Friday, 8 February 2013

Validating an email address (bloody apostrophe)

You know what really gets me angry? Websites which don't validate email addresses properly! I have an apostrophe in my surname, O'Hanlon, and employers will often give me an email address similar to martin.o'hanlon@mycompany.com. All well and good: a perfectly respectable email address which reflects my name.

Then this happens:

Oracle Account Registration
This isn't just small companies and little custom websites; the issue is prevalent on some BIG companies' websites.

It all comes down to regular expressions, which are a standard way of defining and validating a format. For years, when you googled 'email address regular expression' (and clearly a lot of people did, because the problem is everywhere), you ended up with this:

[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}

And it's wrong! It's a simple email address regular expression which is missing all sorts of 'unusual' characters, such as / and &, and, most annoying for me, '.
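To see the failure concretely, here is a minimal Python sketch that runs the pattern above against the address from the top of the post. The `re.IGNORECASE` flag and the whole-string match are assumptions about how the pattern is usually applied (it only lists upper-case letters), and `is_accepted` is just an illustrative helper name.

```python
import re

# The widely copied pattern from above. It only lists A-Z, so it is
# assumed here to be applied with case-insensitive matching.
COMMON_PATTERN = re.compile(
    r"[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}", re.IGNORECASE
)


def is_accepted(address: str) -> bool:
    """Return True if the whole address matches the common pattern."""
    return COMMON_PATTERN.fullmatch(address) is not None


print(is_accepted("martin.ohanlon@mycompany.com"))   # True
print(is_accepted("martin.o'hanlon@mycompany.com"))  # False
```

The only difference between the two addresses is the apostrophe, which is not in the local-part character class, so the second one is rejected.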

If you want to be correct against the RFC definition of an email address, you need this(!):

(?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*|"(?:[\x01-\x08
\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])*")@(?:(?
:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\[(?:(?:25[0-
5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-
z0-9-]*[a-z0-9]:(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]|\\[\x01-\x09\x0
b\x0c\x0e-\x7f])+)\])

Although I would be happier, as I'm sure would a lot of other Irish descendants, with this:

[A-Z0-9.'_%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}
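As a rough check, the apostrophe-friendly pattern can be tried in Python. This is only a sketch: the case-insensitive flag, the whole-string match, and the helper name `is_accepted_relaxed` are assumptions for illustration, not part of the pattern itself.

```python
import re

# Same simple pattern, with an apostrophe added to the local-part
# character class. Assumed to be used case-insensitively.
APOSTROPHE_PATTERN = re.compile(
    r"[A-Z0-9.'_%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}", re.IGNORECASE
)


def is_accepted_relaxed(address: str) -> bool:
    """Return True if the whole address matches the relaxed pattern."""
    return APOSTROPHE_PATTERN.fullmatch(address) is not None


print(is_accepted_relaxed("martin.o'hanlon@mycompany.com"))  # True
# Other 'unusual' characters such as / are still rejected.
print(is_accepted_relaxed("martin/ohanlon@mycompany.com"))   # False
```

It fixes the apostrophe case and nothing else, so it is still a long way short of the full RFC expression.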

Bloody apostrophe!

For more information and a lot more background on validating email addresses, head over to http://www.regular-expressions.info/email.html

28/2/2013 - Found another: SiteCore


14/06/2013 - And another: Prometric



8 comments:

  1. This is a good tool to validate email addresses in .net:
    http://www.kellermansoftware.com/p-37-net-email-validation.aspx

  2. Just to add that _actually_ even that's not quite the whole story. This one is a closer match to the RFC but still doesn't handle nested comments (which are valid in an email address - wtf?)...

    [I tried to paste but hit the size limit - see here: http://www.ex-parrot.com/pdw/Mail-RFC822-Address.html ]

  3. Thanks Martin - just got me out of a hole at work ;-)

    Replies
    1. Excellent, presumably an apostrophe-in-email-sized hole?

  4. This is terrible advice. While it's technically correct that you can have an email address with an apostrophe, doing so is a bad idea. That's because many email servers will not deliver mail to your address because those servers think the apostrophe is invalid.

    So feel free to have an email address with an apostrophe, but don't complain when you only receive about half the emails sent to you.

    I, for one, will spare my users that headache by not allowing them to create email addresses with an apostrophe. You know, like the BIG companies do.

    Replies
    1. It seems to me that you need better email servers. If someone can have an apostrophe in their email address it should be supported.

      It's the validation on websites which is at fault here, not email providers; they are more than happy with an apostrophe and lots of other 'non character' letters.

    2. The email spec allows for an apostrophe, so websites should accommodate it, period (I mean ' ).

