Use RegEx to extract text from an Outlook email message

Last reviewed on September 21, 2014   —  47 comments

Getting text out of a message body isn't as hard as it sounds, provided there is something in the body to search on. For example, if you are looking for a tracking code in an email and it's always identified as "Tracking code:" you can use InStr, Len, Left or Right functions to find and extract the tracking code.

Extract codes from email messagesOr you can use Regular Expressions.

For example, to extract the UPS tracking numbers for packages sent by Amazon.com and formatted as shown in the screenshot, I need to look for the words "Carrier Tracking ID", followed by possible white space and a colon (:).

.Pattern = "(Carrier Tracking ID\s*[:]+\s*(\w*)\s*)"

This returns the next alphanumeric string, or in my example, 1Z2V37F8YW51233715. (There are two tracking numbers in the email message and both are returned.)

Use \s* to match an unknown number of white spaces (spaces, tabs, line feeds, etc)
Use \d* to match only digits
Use \w* to match alphanumeric characters, such as are used in UPS tracking codes.

To use this code sample, open the VBA Editor using Alt+F11. Right-click on Project1 and choose Insert > Module. Paste the following code into the module.

Set a reference to VBScript ExpressionsYou'll need to set a reference to the Microsoft VBScript Regular Expressions 5.5 library in Tools, References.

Note: if VBScript Expressions 1 is selected, deselect it. You can't use both v1 and v5.5.

Don't forget, macro security needs to be set to low during testing.

Sample macros using regex are at the following links. Use Copy to Excel code sample to copy a row of text to a row of cells in Excel and Select a name in a Word document then create a Contact to create a contact from a resume or similar file.

Sub GetValueUsingRegEx()
 ' Set reference to VB Script library
 ' Microsoft VBScript Regular Expressions 5.5
 
    Dim olMail As Outlook.MailItem
    Dim Reg1 As RegExp
    Dim M1 As MatchCollection
    Dim M As Match
        
    Set olMail = Application.ActiveExplorer().Selection(1)
   ' Debug.Print olMail.Body
    
    Set Reg1 = New RegExp
    
    ' \s* = invisible spaces
    ' \d* = match digits
    ' \w* = match alphanumeric
    
    With Reg1
        .Pattern = "Carrier Tracking ID\s*[:]+\s*(\w*)\s*"
        .Global = True
    End With
    If Reg1.test(olMail.Body) Then
    
        Set M1 = Reg1.Execute(olMail.Body)
        For Each M In M1
            ' M.SubMatches(1) is the (\w*) in the pattern
            ' use M.SubMatches(2) for the second one if you have two (\w*)
            Debug.Print M.SubMatches(1)
            
        Next
    End If
    
End Sub

If we look for just the colon (.Pattern ="([:]+\s*(\w*)\s*)" ), we get just the first word in the results:

UPS
May
Standard
1Z2V37F8YW51233715
Diane

This is because (\w*) tells the code to get the next alphanumeric string, not the entire line, and strings do not include spaces.

Get two (or more) values from a message

If you need to use 2 or more patterns, you can repeat the With Reg1 through End if for each pattern or you can use Case statements.

This sample code looks for 3 patterns, creates a string and adds it to the subject field of a message.

Each case represents a different pattern. In this sample, we want just the first occurrence of each pattern; .Global = False instructs the code to stop when it finds the first match.

The data we are looking for is formatted like this:

Order ID : VBNSA-123456
Order Date: 09 AUG 2013
Total $54.65

\n at the end of the pattern matches a line break, and strSubject = Replace(strSubject, Chr(13), "") cleans any line breaks from the string.

Sub GetValueUsingRegEx()
    Dim olMail As Outlook.MailItem
    Dim Reg1 As RegExp
    Dim M1 As MatchCollection
    Dim M As Match
    Dim strSubject As String
    Dim testSubject As String
         
    Set olMail = Application.ActiveExplorer().Selection(1)
     
    Set Reg1 = New RegExp
    
For i = 1 To 3

With Reg1
    Select Case i
    Case 1
        .Pattern = "(Order ID\s[:]([\w-\s]*)\s*)\n"
        .Global = False
        
    Case 2
       .Pattern = "(Date[:]([\w-\s]*)\s*)\n"
       .Global = False
       
    Case 3
        .Pattern = "(([\d]*\.[\d]*))\s*\n"
        .Global = False
    End Select
    
End With
    
    
    If Reg1.test(olMail.Body) Then
     
        Set M1 = Reg1.Execute(olMail.Body)
        For Each M In M1
            Debug.Print M.SubMatches(1)
            strSubject = M.SubMatches(1)
            
         strSubject = Replace(strSubject, Chr(13), "")
         testSubject = testSubject & "; " & Trim(strSubject)
         Debug.Print i & testSubject
         
         Next
    End If
          
Next i

Debug.Print olMail.Subject & testSubject
olMail.Subject = olMail.Subject & testSubject
olMail.Save

Set Reg1 = Nothing
     
End Sub

Written by

Diane Poremsky
A Microsoft Outlook Most Valuable Professional (MVP) since 1999, Diane is the author of several books, including Outlook 2013 Absolute Beginners Book. She also created video training CDs and online training classes for Microsoft Outlook. You can find her helping people online in Outlook Forums as well as in the Microsoft Answers and TechNet forums.

Please post long or more complicated questions at Outlookforums.

47 responses to “Use RegEx to extract text from an Outlook email message”

  1. carrolltd47

    Diane, this is a great tip, however, can you tell me how to point to a specific folder in Outlook and where the output file is created?

    Thanks,

    Tim

  2. carrolltd47

    Wanted to output to a text file.

  3. Corby Hicks

    Hi. Thanks for your articles! They have been very helpful. I am wondering if there is a way to capture multiple pieces of information and copy them to the subject. For example, on receipts I receive regularly, I need to extract three pieces of info:
    Order ID : VBNSA-XXXXXX
    Order Date: 09 AUG 2013
    Total $XX.XX

    Is it possible to find these and copy into the subject field?

  4. Corby Hicks

    Thanks, Diane! Really helpful! I played around with it this morning a bit but haven't had much success. I'm getting the following result "Order VBNSA-025230; .; ." I just realized the Date string is longer than I wrote previously. It actually appears as "09 AUG 2013 14:00:05"... I don't need / want the time. I'm assuming that would be the reason for the date not appearing in the results?

    THANKS AGAIN!

  5. verdantly

    Hi Diane -

    Thanks again for the good info. I wonder how you would use something similar to the above to find a number in an email when it's label is on the line before or even two lines before. For example, something like this would appear in the body of the email along with items and a total:

    PO#
    --------
    123

  6. Richard Sutter

    I have a similar situation, where I have large number of form responses from which I need to extract the same types of info and rearrange them into a set pattern. Typically I receive a message like this:
    ---

    SLA:
    Date Entered: 03-OCT-2013 13:43 (GMT -7) Mountain Time.
    External response time (SLA): 03-OCT-2013 22:43.(GMT -7) Mountain Time.

    SR Description: I would like to start troubleshooting to find out why the analyzer reports etc. etc. etc.
    Company: CUSTOMER COMPANY NAME.
    Top Account: Non GAP.
    Customer Sub Region: West Coast.

    Site:
    Site Name: CUSTOMERSITEALPHA#280501.
    Address: 1234 ACADEMY BLVD,POD A, 1ST FLR, LOC# 1007,.
    City: COLORADO SPRINGS.
    State: Colorado.
    Country: United States.
    Division Name: S&R.

    Opened Via: WEB.
    Contact: John Smith.
    Contact's EMAIL: john.smith@customername.com.

    Contract Number: 12345.
    Service Level: Gold Lite
    Serial #: ABC-12345.
    Service Provider: .

    Priority: .
    Severity: Low.

    SR Status: Open.
    Group Assigned: T1 Americas.
    SR Owner: .
    Owner Mail: .
    Owner Phone: .
    SR Type: Malfunction.

    SR Categorization:
    Solution division: S&R.
    Solution name: ABCD Product.
    Solution version: ABCD Product 4.1.
    Solution UP: 25.
    ---

    And I need to extract info and rearrange it into this:

    External response time (SLA): 03-OCT-2013 22:43.(GMT -7) Mountain Time.
    Serial #: ABC-12345.
    Solution name: ABCD Product.
    Solution version: ABCD Product 4.1.
    Solution UP: 25.
    SR Description: I would like to start troubleshooting to find out why the analyzer reports etc. etc. etc.
    Contact: John Smith.
    john.smith@customername.com.

    I've tried several ways to get this info in this order into a text file or a msg box and just keep failing at the find step.... Any suggestions ?

  7. Nathan King

    Hi
    I receive regular emails from a specific sender with an attachment with a generic name. I would like to parse the string in the message and use the string to save my attachment. For example, the attachment is named generic.bmp, and the string I would like to extract from the message is
    Dear user-abc@domain:
    I would like to extract user-abc@domain (without the colon at the end) to save the attachment as user-abc@domain.bmp
    How do I incorporate this script with the save attachment script?
    BTW, this is for Outlook 2010
    Thanks

  8. terence

    Input text at email :

    BAML (BANK OF XXX)
    Total Amount: 1000 Total Price: 22

    CSFB (CREDIT XXX HK)
    Total Amount: 11000 Total Price: 2.2

    How can i achieve

    BAML - 1000
    CSFB - 11000

  9. anired

    Hi,

    How to extract the latest mail body from a chain of mails,(i.e the topmost mail body txt from a mail chain) .thanks much

  10. anired

    Thanks Diane for the reply,actually what am trying to implement is to write a pattern which takes in single mails/chain of mails as input and return the latest mail body txt as output.Yeah even i have tried similar patterns like you have mentioned From separator or signatures like
    thanks,regards but i found these patterns are not fully solving my purpose ,i was wondering is thr any way like outlook separates the mails with a line when we look at mails in outlook client ,so that i can just return the part before the first line separator.any pointers will be a great help .Thanks much.

    PS:the mails are retrieved from exchange server using EWS services with "bodytype.text" as thr format .

  11. zack

    Can you please help. I have very limited knowledge of VB and I have 2007 Outlook running on windows 7 PC. I want to acheive the following:

    1 - I get the generic email from a marketing source. and i need to extract the following information in the message body.
    - Item number:
    - Item title
    - Name
    - email
    - phone number

    2 - I need to use outlook to auto reply to to the email from the message with BCC to another person(fixed) the message and use a template for that item number. I normally use the signature to reply to message. I have created several signatures with a predefined message to each item number.
    3 - i would like to use the -item title as the subject.

    that will make my life very easy and instead of sitting and replying email with a predefined signatures, everything will automated. Your help will be greatly appreciated.

  12. Kevin

    Hi Diane,

    I have a similar situation where I need to create a task from an newmail. Here is the code used for that use:

    Sub ConvertMailToTask(Item As Outlook.MailItem)
    Sub ConvertMailToTask(Item As Outlook.MailItem)
    Dim objTask As Outlook.TaskItem
    Set objTask = Application.CreateItem(olTaskItem)
    With objTask
    .Subject = Item.Subject
    .StartDate = Item.ReceivedTime
    .Body = Item.Body
    .DueDate = (This is the information I am looking for)
    .Save
    End With
    Set objTask = Nothing
    End Sub

    This is activated through a rule running the script. How can I extract the duedate from the body of that mail and apply the value to the previous script?

    Here is the format of the text in the body: Due Date : 2013-12-06

    Thank you!

  13. Frans

    Hi Diane,

    I have a situation where i receive an email from an internal system with pdf document attached. I then need to forward it to the email mentioned after Email Address: I do this by using your suggested code and it works fine:

    With Reg1
    .Pattern = "(Email Address:\s*(\d*)\s*)"
    .Global = True
    End With

    If Reg1.test(Item.Body) Then
    Set M1 = Reg1.Execute(Item.Body)
    For Each M In M1
    strCode = M.SubMatches(1)
    'MsgBox "reg1: " & strCode
    Next
    End If

    With Reg2
    .Pattern = "([a-z0-9.]*)@([a-z0-9.]*)"
    .Global = True
    End With

    See below what the body of the email looks like.

    Please Find attached report for patient: MRS L HOWSON.
    Email Address: fttest@emaildomain.com
    Reference number: D0630422

    I then need to extract the patient name and the reference number and use that in the new Subject but i dont get it to work. Can you maybe lend a helping hand please.

    Many thanks

  14. Alex

    Hi Diane,

    Run into a problem with capturing a date and a decimal:

    At: 4/04 12:20:17
    Last: 105.9805

    Cannot figure out how to do the slash. Dicimal does not work either.

    I tried

    .Pattern = "(At\s*[:]\s*(\w*)\s*)"

    but it captures only the 4, not the rest. Played with several ways but nothing works.

    With the decimal tried

    .Pattern = "(Last\s*[:]\s*(\d*)\s*)"

    but again it only captures 105, not the rest.

    Could you probably give me some tips how to make it work?

    Many thanks in advance,
    Alex

  15. Alex

    Thanks a lot, Diane. It did not work for some reason but at least pointed me in the right direction. This is what worked for me
    .Pattern = "(Last\s*[:]\s*(\d+\.\d+)\s*)"

    .Pattern = "(At\s*[:]\s*(\d*[-/.]\d*)\s*)"

  16. David

    I was wondering if there is a way to save emails as names from data you parse from inside the email. We have a very generic email that is sent from the same address for user management. I would like to be able to parse each email for the users nt id and in turn save each email as a text file as the nt id. Is this possible? I have scripts that are currently working that save each one with a generic name from a shared folder.

  17. sun

    HI Diane,

    I have a requirement like identifying a particular tag in xml and replace it with default text

    ex- 123

    i need to replace 123 with xxx.

    Can we achieve it using regex? Can you please advice?

    Thanks
    Sun

  18. Ramaswamy Ramalingam

    Hi Diane,

    Thanks for the wonderful article.
    Is it possible that we extract the details from the message and automatically create and save a contact in outlook.

    Thanks,
    Ramaswamy

  19. Dave Upston

    Hi Diane, this article has helped me learn a lot, thanks for putting it together.

    I'm trying to play around with the "Get two (or more) values from a message".

    I've got emails coming in with content like:

    Fax received from TSID: 0297346529
    Line CallerID: 297346529
    Date and Time Received: Mon, 29 Sep 2014 12:42:09 EST
    Number of Pages: 3
    Send to: T1100555
    Connect Time: 1:12
    Date and Time Processed:
    Telstra Job Number: 1718432594

    My perfect end result, would be changing the subject on arrival to:

    Fax from CustomerName - 3 Pages

    Where CustomerName is determined by the number in Line CallerID

    I have a full list of CallerID = CustomerName that can be referenced (somewhere)

    how possible is this?

  20. Andrew

    Hello Diane,

    I'm trying to use this code to pull from a selection of Excel cells, and move the acquired data into cells on a new sheet.

    For example, all of the data I want to parse is located in column C on Sheet 1. I want to search for two things, "Name:" and "Position:" and put the values that come after the colon into Cells D2 and D3 respectively on Sheet2.

    How would you advise I do this?

    Thanks!

  21. Ryan

    Diane,
    Your post is amazing and if I can get this to work for a project I'm involved with, it can save me many hours of manual entry.

    I have never used VBA before, but I'm not afraid to try. I receive +200 emails that are formatted exactly the same as below example. I want to parse the info that follows Vehicle, Event, Time and Coordinates. I would assume the same logic would be used as in your code using the colon, but what would this do for the date/time field which contains a colon.

    I really don't know where to begin, but I would appreciate any help! Thank you.

    Actual Email:

    GEOSERVICES EVENT NOTIFICATION

    EQUIPMENT
    Vehicle: 11848

    EVENT
    Event: Departure
    Time: 10/22/2014 16:00 EDT
    Landmark Type: Port
    Landmark: Georgia Ports Authority
    Address: 2 Main St
    Garden City, GA 31408

    EQUIPMENT POSITION @ EVENT
    Coordinates: 32.126296,-81.158426
    Nearest City: 5.70 miles NW of Savannah, GA Nearest Town: 1.32 miles N of Garden City, GA

Leave a Reply

If the Post Coment button disappears, press your Tab key.