Use RegEx to extract text from an Outlook email message

Last reviewed on December 30, 2013

Getting text out of a message body isn't as hard as it sounds, provided there is something in the body to search on. For example, if you are looking for a tracking code in an email and it's always identified as "Tracking code:" you can use InStr, Len, Left or Right functions to find and extract the tracking code.

Extract codes from email messagesOr you can use Regular Expressions.

For example, to extract the UPS tracking numbers for packages sent by Amazon.com and formatted as shown in the screenshot, I need to look for the words "Carrier Tracking ID", followed by possible white space and a colon (:).

.Pattern = "(Carrier Tracking ID\s*[:]+\s*(\w*)\s*)"

This returns the next alphanumeric string, or in my example, 1Z2V37F8YW51233715. (There are two tracking numbers in the email message and both are returned.)

Use \s* to match an unknown number of white spaces (spaces, tabs, line feeds, etc)
Use \d* to match only digits
Use \w* to match alphanumeric characters, such as are used in UPS tracking codes.

To use this code sample, open the VBA Editor using Alt+F11. Right-click on Project1 and choose Insert > Module. Paste the following code into the module.

Set a reference to VBScript ExpressionsYou'll need to set a reference to the Microsoft VBScript Regular Expressions 5.5 library in Tools, References.

Note: if VBScript Expressions 1 is selected, deselect it. You can't use both v1 and v5.5.

Don't forget, macro security needs to be set to low during testing.

Sample macros using regex are at the following links. Use Copy to Excel code sample to copy a row of text to a row of cells in Excel and Select a name in a Word document then create a Contact to create a contact from a resume or similar file.

Sub GetValueUsingRegEx()
 ' Set reference to VB Script library
 ' Microsoft VBScript Regular Expressions 5.5
 
    Dim olMail As Outlook.MailItem
    Dim Reg1 As RegExp
    Dim M1 As MatchCollection
    Dim M As Match
        
    Set olMail = Application.ActiveExplorer().Selection(1)
   ' Debug.Print olMail.Body
    
    Set Reg1 = New RegExp
    
    ' \s* = invisible spaces
    ' \d* = match digits
    ' \w* = match alphanumeric
    
    With Reg1
        .Pattern = "Carrier Tracking ID\s*[:]+\s*(\w*)\s*"
        .Global = True
    End With
    If Reg1.test(olMail.Body) Then
    
        Set M1 = Reg1.Execute(olMail.Body)
        For Each M In M1
            ' M.SubMatches(1) is the (\w*) in the pattern
            ' use M.SubMatches(2) for the second one if you have two (\w*)
            Debug.Print M.SubMatches(1)
            
        Next
    End If
    
End Sub

If we look for just the colon (.Pattern ="([:]+\s*(\w*)\s*)" ), we get just the first word in the results:

UPS
May
Standard
1Z2V37F8YW51233715
Diane

This is because (\w*) tells the code to get the next alphanumeric string, not the entire line, and strings do not include spaces.

Get two (or more) values from a message

If you need to use 2 or more patterns, you can repeat the With Reg1 through End if for each pattern or you can use Case statements.

This sample code looks for 3 patterns, creates a string and adds it to the subject field of a message.

Each case represents a different pattern. In this sample, we want just the first occurrence of each pattern; .Global = False instructs the code to stop when it finds the first match.

The data we are looking for is formatted like this:

Order ID : VBNSA-123456
Order Date: 09 AUG 2013
Total $54.65

\n at the end of the pattern matches a line break, and strSubject = Replace(strSubject, Chr(13), "") cleans any line breaks from the string.

Sub GetValueUsingRegEx()
    Dim olMail As Outlook.MailItem
    Dim Reg1 As RegExp
    Dim M1 As MatchCollection
    Dim M As Match
    Dim strSubject As String
    Dim testSubject As String
         
    Set olMail = Application.ActiveExplorer().Selection(1)
     
    Set Reg1 = New RegExp
    
For i = 1 To 3

With Reg1
    Select Case i
    Case 1
        .Pattern = "(Order ID\s[:]([\w-\s]*)\s*)\n"
        .Global = False
        
    Case 2
       .Pattern = "(Date[:]([\w-\s]*)\s*)\n"
       .Global = False
       
    Case 3
        .Pattern = "(([\d]*\.[\d]*))\s*\n"
        .Global = False
    End Select
    
End With
    
    
    If Reg1.test(olMail.Body) Then
     
        Set M1 = Reg1.Execute(olMail.Body)
        For Each M In M1
            Debug.Print M.SubMatches(1)
            strSubject = M.SubMatches(1)
            
         strSubject = Replace(strSubject, Chr(13), "")
         testSubject = testSubject & "; " & Trim(strSubject)
         Debug.Print i & testSubject
         
         Next
    End If
          
Next i

Debug.Print olMail.Subject & testSubject
olMail.Subject = olMail.Subject & testSubject
olMail.Save

Set Reg1 = Nothing
     
End Sub

Written by

Diane Poremsky
A Microsoft Outlook Most Valuable Professional (MVP) since 1999, Diane is the author of several books, including Outlook 2013 Absolute Beginners Book. She also created video training CDs and online training classes for Microsoft Outlook. You can find her helping people online in Outlook Forums as well as in the Microsoft Answers and TechNet forums.

If the Post Coment button disappears, press your Tab key.