My Speaking Schedule for Oracle Open World 2014

A quick post to let you know about the two presentations that I’ll be doing at Oracle Open World 2014.

Session ID:         UGF4482
Session Title:     “Getting Started with SQL Pattern Matching in Oracle Database 12c
Venue / Room:  Moscone South – 301
Date and Time:  9/28/14, 13:30 – 14:15

Session ID:          CON4493
Session Title:      “Regular Expressions in Oracle Database 101”
Venue / Room:   Moscone South – 303
Date and Time:   10/2/14, 13:15 – 14:00

As usual, you might have to check before the session to make sure the room has not changed.

I hope to see you there.

An Attempt to Validate Email Addresses with a Regular Expression

After my Kscope13 presentation, one of the attendees approached me and asked if I had used regular expressions to try to validate email addresses. He didn’t know about regular expressions before the presentation and he was writing a PL/SQL function (which was getting very complex…) to accomplish this particular task. I offered to give it a try and blog about it.

After some testing, this is the regular expression that I came up with: ^[a-z0-9._-]+@[a-z0-9.-]+\.[a-z]{2,3}$

So basically, I consider an email address to be valid (that’s my definition for this example) if it begins with a sequence of characters that may include any letters, any digits, the dot, the underscore or the dash followed by the @ and then followed by a sequence of characters that may include any letters, any digits, the dot or the dash followed by a dot and ending with a sequence of characters that contains a minimum of 2 and a maximum of 3 letters.

I know there’re more valid characters that can be used to define an email address but I chose the ones that are used most of the time. Another thing to keep in mind is that this regular expression is case sensitive, so in order to match email addresses that contain capital letters I’ll rely on the parameter that can be passed to the regexp SQL functions to make the match case insensitive.

So lets take a look at some sample data stored in the clients table:

SQL> select * from clients;

NAME_CLIENT               PHONE                EMAIL
------------------------- -------------------- ------------------------------
FOUNTAIN,VICKIE L                    
ORTIZ,GENOVEVA M                     
BALDA GALO                (512) 254-1694

6 rows selected.

As you can see, there’re email addresses that match the pattern (1 and 6) and other that don’t match the pattern (2,3).

Lets find the records that match the pattern first:

SQL> select *
  2 from clients
  3 where regexp_like (email,'^[a-z0-9._-]+@[a-z0-9.-]+\.[a-z]{2,3}$','i');

NAME_CLIENT               PHONE                EMAIL
------------------------- -------------------- ------------------------------
FOUNTAIN,VICKIE L                    
BALDA GALO               (512) 254-1694

2 rows selected.

Note how the first email matches the pattern even thought I’m not defining a range for capital letters. This is happening because I used the ‘i’ parameter that tells the function to perform a case insensitive search.

Now lets find the records that don’t match the pattern:

SQL> select *
  2 from clients
  3 where not regexp_like (email,'^[a-z0-9._-]+@[a-z0-9.-]+\.[a-z]{2,3}$','i');

NAME_CLIENT               PHONE                EMAIL
------------------------- -------------------- ------------------------------
ORTIZ,GENOVEVA M                     

2 rows selected.

It’s obvious that the first email doesn’t match the pattern because the @ is missing. The second email is incorrect because I specified that the top level domain is only valid if it’s a sequence of 2 or 3 letters at the most and store has four.

So, this is an example showing very basic validations on email addresses. There’s room for improvement but it will always depend on your use case and how complex you want to make it.

I hope you find it useful.