There are many reasons why you may want to search through mailboxes that
are present in your Exchange database. Most of those reasons tend to
cluster around a single major area: litigation (or fear of same). That
is, you (or your company) are involved in a lawsuit and need to mine
your Exchange mailboxes as if they were a database; or your company is
afraid that an action of one or more employees may lead to a lawsuit and
needs to find proof, one way or the other.
In Exchange Server 2003, search was a very expensive process in terms of
server resources. Generating a full-text index to search against could
consume a very large amount of processor time and disk resources.
Therefore, it was turned off by default and had to be explicitly
enabled.
In Exchange Server 2007, the search engine was completely rearchitected
and reimplemented. Now, the indexing engine consumes relatively little
processor or I/O resources, and it is enabled by default. The question
is: does it help you?
If your users are using Outlook 2007 in online mode, using Outlook Web
Access, or using a device that supports Exchange ActiveSync, the answer
is yes! When executing a search against a user’s mailbox, the
server-based search is used, and this can provide absolutely stellar
performance.
If your users are using Outlook 2007 in cached mode (or any earlier
version of Outlook), the answer is no. When executing a search against a
user’s mailbox, the built-in capabilities of Outlook are used to search
the mailbox. In Outlook 2007 Service Pack 2, this can be quite speedy
and provide acceptable performance. Outlook 2007 Service Pack 1 wasn’t
horrible (as long as you updated your workstation to Windows Search
4.0). However, earlier versions of Outlook had very poor search
performance.
That led to the development of a number of desktop search engines; for
example, Google Desktop Search, Lookout (which was purchased by
Microsoft), and Copernic Desktop Search. These tools (and many others)
created a separate search capability for Outlook.
However, searching mailboxes one at a time is very slow and prone to
error.
In releases of Exchange prior to Exchange 2007, Microsoft provided a
tool named ExMerge that had the capability of searching through multiple
mailboxes for (somewhat) arbitrary content. As of Exchange 2007,
Microsoft deprecated ExMerge and replaced it with a PowerShell cmdlet
named Export-Mailbox. Note that there is nothing which prevents ExMerge
from running against an Exchange 2007 database, and many companies still
use the tool. However, it isn’t supported.
Export-Mailbox (and its partner, Import-Mailbox) has had many updates
during the various service packs and Update Rollups of Exchange 2007.
As of Service Pack 1 Update Rollup 5, it is a fairly usable tool. Prior
to that, it had a number of, shall we say, idiosyncrasies.
A primary difference between ExMerge and Export-Mailbox is how they each
handle the dumpster (that is, items which have been deleted from a
mailbox and are held in “deleted item recovery” awaiting their final
purge). Those differences rate an entire article all on their own; just
be aware that they are different.
Both provide mechanisms for exporting single or multiple mailboxes,
exporting only items which have specific content in their subject header
or message body, deleting attachments which have a certain name, etc.
However, their searching capability is very weak. Other than searching
on the Subject, the recipients (To, CC, and BCC headers), the sender
(Sender and From headers), and the Date field, you have no additional
capability for examining message metadata. Both are very
resource-intensive and will scan the entire mailbox that you specify for
any criteria (that is, the searches aren’t very smart).
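To make those search fields concrete, here is a rough sketch of how a
multi-mailbox search with Export-Mailbox might look. The database name,
the target mailbox, the folder name, and the keyword are all
placeholders that I made up for illustration, and the date strings are
interpreted according to the locale of the machine you run this on.
Treat it as a starting point rather than a tested recipe.

```powershell
# Assumption: "MBX01\Mailbox Database" and discovery@example.com are
# hypothetical names; substitute your own database and target mailbox.
# Every mailbox in the database is scanned in full for items that match.
Get-Mailbox -Database "MBX01\Mailbox Database" |
    Export-Mailbox -TargetMailbox "discovery@example.com" `
                   -TargetFolder "Case-Review" `
                   -SubjectKeywords "merger" `
                   -StartDate "01/01/2009" `
                   -EndDate "06/30/2009"
```

Matching items are copied into the named folder of the target mailbox,
where they can be reviewed in one place. The account running the
command needs full access to both the source and the target mailboxes.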
For better search and discovery tools, you should investigate
third-party providers.
ExMerge has another major disadvantage: it can only export to ANSI
PSTs, which are limited to about 2 GB in size. Export-Mailbox can use
Unicode PSTs, for effectively unlimited PST sizes.
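As a minimal sketch of the PST route, the command below exports one
mailbox to a PST file. The mailbox identity and the path are
hypothetical. Note also that in Exchange 2007, exporting to a PST is
expected to be run from a 32-bit machine that has the Exchange
management tools and Outlook installed, so check the requirements for
your service pack level before relying on it.

```powershell
# Assumption: jdoe@example.com and C:\PSTFiles are placeholder values.
# -PSTFolderPath selects PST output instead of a target mailbox.
Export-Mailbox -Identity "jdoe@example.com" -PSTFolderPath "C:\PSTFiles\jdoe.pst"
```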