Add a new Domain into Microsoft Exchange Server 2007

Microsoft Exchange Server is the server side of a client-server infrastructure. It provides the database and management tools for enterprise-level email, instant messaging, video conferencing, calendaring, appointment setting and contact management.

It also offers support for mobile email as well as partner companies utilizing company email services. Microsoft Exchange Server can also handle multiple domains; adding new ones requires configuring the settings.

Here are the eight steps to add a new domain:

  1. Launch the “Exchange Management Console.”
  2. Click “Organization Configuration” and “Hub Transport.”
  3. Click “Accepted Domains,” “Actions” and “New Accepted Domain.”
  4. Type a “Name” and “Accepted Domain” for the server. For example “newdomain.com.”
  5. Check the radio button next to “Authoritative.” Click “New” and “Finish.”
  6. Click “New E-Mail Address Policy.” Type in a name for the policy. Check the radio button next to “All Recipient Types.” Click “Next.”
  7. Click “Add.” Check the box next to “E-Mail Address Local Part.” Click “Use Alias.”
  8. Check the radio button next to “Select Accepted Domain.” Click “Browse.” Click the domain you just created and click “OK” two times. Click “Next.” Enter a time frame when you want the policy to be started. Click “Next” when done.

 

Deleting mails/meetings in a mailbox

When an application mailbox automatically sends meeting requests to people, the items that pile up can make the mailbox really heavy.

Here is a little tip to delete all previous meetings automatically (for example, from a scheduled task):

# Calculate yesterday's date (D-1) as a DateTime object
$ddate = (Get-Date).AddDays(-1)
# Add rights on the mailbox (do it once); $localu must already contain the account that runs the task
Get-Mailbox "application.mailbox" | Add-MailboxPermission -User $localu -AccessRights FullAccess -InheritanceType All
# Delete old meetings/mails
Get-Mailbox "application.mailbox" | Export-Mailbox -EndDate $ddate -DeleteContent

That could be really, really useful!

Erase all previous processes

When you create a scheduled task, sometimes an old instance is still running or has already been launched even though the task is stopped. To stop all previous processes, I created a little function 🙂:

function stopprocess($processname){
    # Collect the PIDs of all matching processes owned by the service account
    $pids = @(Get-WmiObject -Class Win32_Process |
        Where-Object { $_.ProcessName -eq $processname } |
        Where-Object { $_.GetOwner().User -eq "svc-one-mpmadmin" } |
        ForEach-Object { $_.ProcessId })

    # Nothing to do if no matching process was found
    if ($pids.Count -eq 0) { return }

    # Keep the highest PID and stop all the others
    $latest = ($pids | Measure-Object -Maximum).Maximum
    $pids | Where-Object { $_ -ne $latest } | ForEach-Object { Stop-Process -Id $_ }
}

To use that function, just write:

stopprocess "ping.exe"

This function stops processes based on their ProcessID: every matching process owned by the service account is stopped except the one with the highest PID, which is treated as the latest.

Good Permission for Microsoft Exchange 2007

After installing and configuring the Good Server application you will need to run the following Exchange 2007 PowerShell commands to grant the Good service account (goodadmin) access to the Exchange mailboxes.

Granting access

  1. Launch Exchange Management Shell. (Do not use Windows PowerShell.)
  2. At the [MSH] prompt, enter:
    [MSH] C:\>Get-OrganizationConfig | Add-AdPermission -user GoodAdminName -AccessRights GenericRead -ExtendedRights "Read metabase properties","Create named properties in the information store","View information store status","Administer information store","Receive as","Send as"
  3. Make sure that the GoodAdmin account is a member of Domain Users only.
  4. To display the permissions you have set, you can enter the following command:
    C:\>Get-OrganizationConfig | Get-AdPermission -user GoodAdminName | fl

(source : http://www.good.com/documentation6/QuickInstall_exchange.pdf)

Blackberry permissions for Exchange 2007

After installing and configuring the Blackberry Enterprise Server application you will need to run the following Exchange 2007 PowerShell commands to grant the Blackberry service account (besadmin) access to the Exchange mailboxes.

Granting access for a single mailbox

In this example we are giving the besadmin account the access required to handle the email for the john.doe@domain.com mailbox:

[PS] C:\>Add-MailboxPermission john.doe@domain.com -User domain\besadmin -AccessRights FullAccess
[PS] C:\>Add-ADPermission john.doe@domain.com -User domain\besadmin -ExtendedRights Send-As, Receive-As

Granting access to all mailboxes

This command will allow the besadmin account to access ALL mailboxes on the Exchange server.

[PS] C:\>Get-MailboxServer | Add-ADPermission -User domain\besadmin -AccessRights GenericRead, GenericWrite -ExtendedRights Send-As, Receive-As, ms-Exch-Store-Admin

 

Migrating to Microsoft Exchange 2010

Since the launch of Microsoft Exchange 2010, organizations looking to update their infrastructure with more energy and cost-efficient servers are rapidly adopting the messaging platform, making it the dominant selection in the messaging and collaboration marketplace.

There are many benefits that come with adopting the latest version of Microsoft Exchange 2010, including a streamlined installation process, excellent online resources, enhanced information security and improved compliance features. What’s more, the new servers are providing organizations with more efficient hardware in terms of cost, energy and process.

Upgrading from legacy systems is presenting a number of challenges for IT personnel, though. The continued growth of email and email-associated items can be a major migration hurdle, particularly for those that operate globally. The situation may be further hampered in the current economic climate where downsizing or consolidation present complicated data integration challenges, set against the backdrop of significant budget constraints and an overworked, under-resourced IT staff.

 

Whether upgrading from Microsoft Exchange 2007, the older 2003 version, or moving over from a non-Exchange platform, IT personnel managing the migration have a number of security and compatibility issues to address if they are to ensure a seamless and efficient transition:

Migration Scenarios

The simplest scenario involves a transition from Exchange 2007 to Exchange 2010 where there is no consolidation and all mailboxes are within a single Active Directory forest. In the vast majority of cases, this can be handled by the tools Microsoft provides as part of Exchange.

At the more complex end of the scale, migration from a non-Microsoft Exchange platform will almost certainly require third-party tools. This is also likely to be the case if you are migrating from Exchange 5.5, Exchange 2000 or Exchange 2003 — none of which have a direct migration path to Exchange 2010. In these instances, Microsoft’s recommended method would be to first transition to Exchange 2007 and then transition again to Exchange 2010.

The final scenario is one of consolidation or total renewal. Data often needs to be moved across WAN links and mailboxes are potentially in multiple Active Directory forests. This situation can occur for a multitude of reasons: downsizing due to economic hardship; mergers and acquisitions; centralization into regional data centers; moving to a new, clean Active Directory forest either as part of a hosted or managed service; or as the basis for a rollout of a new wave of technology. In these cases, it is hard to generalize about the tools needed for migration; however, it is often true that third-party tools can save both time and money.

For each scenario, there are a series of best practices that organizations can follow to ensure a seamless transition:

Using Microsoft Tools

Intra-organizational transitions from Exchange 2007 follow a well-understood pattern. First, Exchange 2010 is installed on either Microsoft Windows Server 2008 or 2008 R2 64-bit edition, and often on new hardware or perhaps a virtualization platform. As part of the Exchange 2010 installation, the existing Active Directory is prepared for Exchange 2010. Once installed, Exchange 2010 is configured to take over the external Web access clients such as Outlook Web App (OWA) and mobile devices and to reroute any users still on Exchange 2007 to the relevant backend Exchange 2007 server. Mail routing is established between the old and new systems, and services such as address book generation are moved to Exchange 2010. At this point co-existence is established and data migration begins.

Data migration during the transition from Exchange 2007 is controlled from the Exchange 2010 management console (or management shell for scripting aficionados). A new feature, “Move Requests,” makes use of the new architecture in Exchange 2010, specifically the Exchange Mailbox Replication Service. This service carries out all mailbox moves from Exchange 2007, and when moving from Exchange 2007 SP2 to Exchange 2010, can also move mailboxes online. For the majority of the migration, workers can continue using Outlook as normal with only a short period of disruption right at the end of the move. However, the online mailbox move is not available for the transition from Exchange 2003.

Using Third-party Tools

Third-party tools can often provide alternative migration routes to those available with Exchange 2010 native tools. In a migration from a non-Exchange platform, an initial step is to build the Microsoft infrastructure. This will inevitably involve directory services work, to ensure that all mail users are represented in an existing or new Active Directory. At that point a new installation of Exchange 2010 can be carried out. Generally, co-existence (for example sharing calendars between systems) other than simple mail flow is a painful process and should be avoided. Microsoft no longer provides tools to migrate data from non-Exchange mail systems, so in this case reliance on third-party tools is essential.

When migrating from old Exchange versions or consolidating systems, perhaps as a result of a restructuring process or merger, there are several considerations. With an Exchange 5.5 or 2000 migration scenario, given that there is no online mailbox move facility and no direct path to Exchange 2010, it is considerably easier and less time-consuming to use certain tools on the market today than to transition first to Exchange 2007 and then to Exchange 2010. Certain tools can extract data directly from the legacy Exchange system database (EDB) files and import it directly into the new Exchange 2010 system, while also creating new mailboxes on the fly if required. This would also be an appropriate method if, as a result of a merger, data needed to be moved from an acquired Exchange system.

In Exchange 2010 the Move Request feature supports moving mailboxes from Active Directory forests other than the local one where Exchange 2010 is installed. However, in any consolidation scenario, WAN links may be an issue. If, for example, you were consolidating several remote servers into a central location and needed to move terabytes of data over a WAN link, the process would be extremely time-consuming. In such scenarios, it would therefore be simpler to ship extracted copies of the EDBs on hard drives to the central location, and use a technology to import the data into the new centralized Exchange system.

Finally, with any of the migration scenarios described above, there is potentially a need to migrate only a select percentage of the data. This could be because you are trying to avoid the migration of redundant or end-of-life data, or because new data has been generated while performing an offline migration from a point-in-time snapshot of the source system. Having a technology solution to perform complex searches based on criteria such as date range and to migrate only the selected data is a significant advantage.

Powershell – Modules

Modules are sets of commands that can be added to PowerShell, in the same way as snap-ins.

For a list of available modules, use the command:

PS C:\> Get-Module -ListAvailable

ModuleType Name                ExportedCommands
---------- ----                ----------------
Manifest   ActiveDirectory     {}
Manifest   ADRMS               {}
Manifest   AppLocker           {}
Manifest   BestPractices       {}
Manifest   BitsTransfer        {}
Manifest   GroupPolicy         {}
Manifest   PSDiagnostics       {}
Manifest   ServerManager       {}
Manifest   TroubleshootingPack {}
Manifest   WebAdministration   {}

Some modules are installed automatically during installation of features on the server, as is the case with Active Directory and GroupPolicy that are installed with a new Domain Controller.
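
As a minimal sketch, a script could check that a module is installed before importing it. The helper name Import-ModuleIfAvailable is hypothetical and not part of PowerShell; only Get-Module and Import-Module are built-in cmdlets (PowerShell 2.0 or later):

```powershell
# Hypothetical helper: import a module only if it is installed on this machine.
function Import-ModuleIfAvailable {
    param([string]$Name)

    # Get-Module -ListAvailable returns nothing when the module is not installed
    if (Get-Module -ListAvailable -Name $Name) {
        Import-Module -Name $Name
        return $true
    }
    return $false
}

# Returns $true only where the ActiveDirectory module is installed
Import-ModuleIfAvailable -Name "ActiveDirectory"
```

Returning a Boolean lets the calling script decide whether to continue or to stop with its own message when the module is missing.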

Regular Expressions

Use regular expressions for more accurate pattern recognition if you require it. Regular expressions offer many more wildcard characters; for this reason, they can describe patterns in much greater detail. For the very same reason, however, regular expressions are also much more complicated.

Describing Patterns

Using the regular expression elements listed in Table 13.11, you can describe patterns with much greater precision. These elements are grouped into three categories:

  • Char: The Char represents a single character and a collection of Char objects represents a string.
  • Quantifier: Allows you to determine how often a character or a string occurs in a pattern.
  • Anchor: Allows you to determine whether a pattern is a separate word or must be at the beginning or end of a sentence.

The pattern represented by a regular expression may consist of four different character types:

  • Literal characters like “abc” that exactly match the “abc” string.
  • Masked or “escaped” characters with special meanings in regular expressions; when preceded by “\”, they are understood as literal characters: “\[test\]” looks for the “[test]” string. The following characters have special meanings and for this reason must be masked if used literally: “. ^ $ * + ? { [ ] \ | ( )”.
  • Predefined wildcard characters that represent a particular character category and work like placeholders. For example, “\d” represents any digit from 0 to 9.
  • Custom wildcard characters: They consist of square brackets, within which the characters are specified that the wildcard represents. If you want to use any character except for the specified characters, use “^” as the first character in the square brackets. For example, the placeholder “[^f-h]” stands for all characters except for “f”, “g”, and “h”.

Element   Description
.         Exactly one character of any kind except for a line break (equivalent to [^\n])
[^abc]    All characters except for those specified in brackets
[^a-z]    All characters except for those in the range specified in the brackets
[abc]     One of the characters specified in brackets
[a-z]     Any character in the range indicated in brackets
\a        Bell alarm (ASCII 7)
\c        Any character allowed in an XML name (XML Schema regular expressions; .NET accepts \c only as part of \cA-\cZ)
\cA-\cZ   Control+A to Control+Z, equivalent to ASCII 1 to ASCII 26
\d        A digit (equivalent to [0-9])
\D        Any character except for digits
\e        Escape (ASCII 27)
\f        Form feed (ASCII 12)
\n        New line
\r        Carriage return
\s        Any whitespace character like a blank character, tab, or line break
\S        Any character except for a blank character, tab, or line break
\t        Tab character
\uFFFF    Unicode character with the hexadecimal code FFFF. For example, the Euro symbol has the code 20AC
\v        Vertical tab (ASCII 11)
\w        Letter, digit, or underline
\W        Any character except for letters, digits, or underlines
\xnn      Particular character, where nn specifies the hexadecimal ASCII code
.*        Any number of any character (including no characters at all)

Table 13.8: Placeholders for characters
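
A few of these placeholders can be tried directly at the prompt. The following lines are only an illustrative sketch with made-up sample strings; [regex]::Escape() is the .NET method that masks special characters in a string for you:

```powershell
# \d stands for a single digit
"Room 7" -match "\d"                    # True

# [^f-h] stands for any character except f, g, and h
"g" -match "[^f-h]"                     # False

# Unescaped brackets define a character class: one of t, e, or s
"xyz" -match "[test]"                   # False
"s" -match "[test]"                     # True

# Escaped brackets are literal characters
"[test]" -match "\[test\]"              # True
"test" -match "\[test\]"                # False

# [regex]::Escape() masks the special characters in a string
"a.b" -match [regex]::Escape("a.b")     # True
"axb" -match [regex]::Escape("a.b")     # False
```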

Quantifiers

Every wildcard listed in Table 13.8 is represented by exactly one character. Using quantifiers, you can more precisely determine how many characters are respectively represented. For example, “\d{1,3}” stands for a digit occurring one to three times, that is, a one-to-three digit number.

Element   Description
*         Preceding expression is not matched or matched once or several times (matches as much as possible)
*?        Preceding expression is not matched or matched once or several times (matches as little as possible)
.*        Any number of any character (including no characters at all)
?         Preceding expression is not matched or matched once (matches as much as possible)
??        Preceding expression is not matched or matched once (matches as little as possible)
{n,}      n or more matches
{n,m}     Between n and m matches (inclusive)
{n}       Exactly n matches
+         Preceding expression is matched once or several times (matches as much as possible)

Table 13.9: Quantifiers for patterns
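
The difference between “as much as possible” and “as little as possible” is easiest to see side by side. This is a small sketch with an invented sample string; the variable names are arbitrary:

```powershell
$html = "<b>one</b> and <b>two</b>"

# Greedy: .* grabs as much as possible, up to the last </b>
if ($html -match "<b>.*</b>")  { $greedy = $matches[0] }

# Lazy: .*? grabs as little as possible, up to the first </b>
if ($html -match "<b>.*?</b>") { $lazy = $matches[0] }

# {1,3}: at most three digits are taken, even if more follow
if ("4242" -match "\d{1,3}")   { $digits = $matches[0] }

$greedy    # <b>one</b> and <b>two</b>
$lazy      # <b>one</b>
$digits    # 424
```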

Anchors

Anchors determine whether a pattern has to be at the beginning or ending of a string or word. For example, the regular expression “\b\d{1,3}” finds numbers of up to three digits only if they start at a word boundary. The number “123” in the string “Bart123” would not be found.

Elements  Description
$         Matches at end of a string (\Z is less ambiguous for multi-line texts)
\A        Matches at beginning of a string, including multi-line texts
\b        Matches on word boundary (first or last characters in words)
\B        Must not match on word boundary
\Z        Must match at end of string, including multi-line texts
^         Must match at beginning of a string (\A is less ambiguous for multi-line texts)

Table 13.10: Anchor boundaries
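
The effect of the anchors can be checked with a few one-liners. The sample strings are made up for illustration:

```powershell
# \b: the digits must start at a word boundary
"Bart123"  -match "\b\d{1,3}"     # False: "t" and "1" are both word characters
"Bart 123" -match "\b\d{1,3}"     # True

# ^ and $: the pattern must cover the whole string
"123"      -match "^\d{1,3}$"     # True
"Bart 123" -match "^\d{1,3}$"     # False
```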

Recognizing IP Addresses

Patterns such as an IP address can be described much more precisely by regular expressions than by simple wildcard characters. Usually, you would use a combination of characters and quantifiers to specify which characters may occur in a string and how often:

$ip = "10.10.10.10"
$ip -match "\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b"

True
$ip = "a.10.10.10"
$ip -match "\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b"

False
$ip = "1000.10.10.10"
$ip -match "\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b"

False

The pattern is described here as four numbers (char: \d) between one and three digits (using the quantifier {1,3}), separated by literal dots (\.), and anchored on word boundaries (using the anchor \b), meaning that it must not directly adjoin other letters, digits, or underscores. Checking is far from perfect since it is not verified whether the numbers really do lie in the permitted number range from 0 to 255.

# There still are entries incorrectly identified as valid IP addresses:
$ip = "300.400.500.999"
$ip -match "\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b"

True
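
One way to close that gap, sketched here under the assumption that the whole string should be an address, is to let the regular expression check only the shape and compare the numbers afterwards. The function name Test-IPv4Address is hypothetical; the parentheses capture each number so that it can be read back from $matches:

```powershell
# Hypothetical helper: regex for the shape, numeric comparison for the range.
function Test-IPv4Address {
    param([string]$Address)

    # Four groups of one to three digits, covering the whole string
    if (-not ($Address -match "^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$")) {
        return $false
    }

    # Each captured number must not exceed 255
    foreach ($i in 1..4) {
        if ([int]$matches[$i] -gt 255) { return $false }
    }
    return $true
}

Test-IPv4Address "10.10.10.10"        # True
Test-IPv4Address "300.400.500.999"    # False
```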

Validating E-Mail Addresses

If you’d like to verify whether a user has given a valid e-mail address, use the following regular expression:

$email = "test@somewhere.com"
$email -match "\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b"

True
$email = ".@."
$email -match "\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b"

False

Whenever you look for an expression that occurs as a single “word” in text, delimit your regular expression by word boundaries (anchor: \b). The regular expression will then know you’re interested only in those passages that are demarcated from the rest of the text by non-word characters such as blank characters, tabs, line breaks, or punctuation.

The regular expression subsequently specifies which characters may be included in an e-mail address. Permissible characters are in square brackets and consist of “ranges” (for example, “A-Z0-9”) and single characters (such as “._%+-“). The “+” behind the square brackets is a quantifier and means that at least one of the given characters must be present. However, you can also stipulate as many more characters as you wish.

Following this is “@” and, if you like, after it a text again having the same characters as those in front of “@”. A dot (.) in the e-mail address follows. This dot is introduced with a “\” character because the dot actually has a different meaning in regular expressions if it isn’t within square brackets. The backslash ensures that the regular expression understands the dot behind it literally.

After the dot is the domain identifier, which may consist solely of letters ([A-Z]). A quantifier ({2,4}) again follows the square brackets. It specifies that the domain identifier may consist of at least two and at most four of the given characters.

However, this regular expression still has one flaw. While it does verify whether a valid e-mail address is in the text somewhere, there could be another text before or after it:

$email = "Email please to test@somewhere.com and reply!"
$email -match "\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b"

True

Because of “\b”, when your regular expression searches for a pattern somewhere in the text, it only takes into account word boundaries. If you prefer to check whether the entire text corresponds to an authentic e-mail address, use the elements for string beginnings (anchor: “^”) and endings (anchor: “$”) instead of word boundaries:

$email -match "^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}$"
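
To see the anchored pattern reject surrounding text, it can be wrapped in a small function. This is only a sketch; the function name Test-EmailAddress is hypothetical:

```powershell
# Hypothetical helper around the anchored pattern
function Test-EmailAddress {
    param([string]$Text)
    return ($Text -match "^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}$")
}

Test-EmailAddress "test@somewhere.com"                              # True
Test-EmailAddress "Email please to test@somewhere.com and reply!"   # False
```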

Simultaneous Searches for Different Terms

Sometimes, search terms are ambiguous because there may be several ways to write them. You can use the “?” quantifier to mark parts of the search term as optional. In simple cases, put a “?” after an optional character. Then the character in front of “?” may, but doesn’t have to, turn up in the search term:

"color" -match "colou?r"
True
"colour" -match "colou?r"
True

The “?” character here doesn’t represent any character at all, as you might expect after using simple wildcards. For regular expressions, “?” is a quantifier and always specifies how often a character or expression in front of it may occur. In the example, therefore, “u?” ensures that the letter “u” may, but not necessarily, be in the specified location in the pattern. Other quantifiers are “*” (the preceding expression may occur any number of times, including not at all) and “+” (the preceding expression must occur at least once).

If you prefer to mark more than one character as optional, put the characters in a sub-expression, which is placed in parentheses. The following example recognizes both the month designator “Nov” and “November”:

"Nov" -match "\bNov(ember)?\b"

True

"November" -match "\bNov(ember)?\b"

True

If you’d rather use several alternative search terms, use the OR character “|”:

"Bob and Ted" -match "Alice|Bob"

True

And if you want to mix alternative search terms with fixed text, use sub-expressions again:

# finds "and Bob":
"Peter and Bob" -match "and (Bob|Willy)"

True

# does not find "and Bob":
"Bob and Peter" -match "and (Bob|Willy)"

False

Case Sensitivity

In keeping with customary PowerShell practice, the -match operator is case insensitive. Use the operator -cmatch as an alternative if you’d prefer case sensitivity:

# -match is case insensitive:
"hello" -match "heLLO"

True
# -cmatch is case sensitive:
"hello" -cmatch "heLLO"

False

If you want case sensitivity in only some pattern segments, use -match. Also, specify in your regular expression which text segments are case sensitive and which are insensitive. Anything following the “(?i)” construct is case insensitive. Conversely, anything following “(?-i)” is case sensitive. This explains why the word “test” in the below example is recognized only if its last two characters are lowercase, while case sensitivity has no importance for the first two characters:

"TEst" -match "(?i)te(?-i)st"

True
"TEST" -match "(?i)te(?-i)st"

False

If you use a .NET framework RegEx object instead of -match, the RegEx object is case sensitive by default, behaving like -cmatch. If you prefer case insensitivity, either use the above construct to specify an option in your regular expression or avail yourself of “IgnoreCase” to tell the RegEx object your preference:

[regex]::matches("test", "TEST", "IgnoreCase")
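
The three variants can be compared with the static IsMatch() method of the .NET Regex class, which simply returns True or False. A short sketch:

```powershell
[regex]::IsMatch("test", "TEST")                   # False: case sensitive by default
[regex]::IsMatch("test", "TEST", "IgnoreCase")     # True: option passed as an argument
[regex]::IsMatch("test", "(?i)TEST")               # True: option set inside the pattern
```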

Element   Category   Description
(xyz)                Sub-expression
|         Selection  Alternation construct
\         Escape     When followed by a character, the character is not recognized as a formatting character but as a literal character
x?        Option     Changes the x quantifier into a “lazy” quantifier
(?xyz)    Option     Activates or deactivates special modes, among others, case sensitivity
x+        Option     Turns the x quantifier into a “greedy” quantifier
?:        Reference  Groups a sub-expression without storing its result
?<name>   Reference  Specifies name for back references

Table 13.11: Regular expression elements

Of course, a regular expression can perform any number of detailed checks, such as verifying whether numbers in an IP address lie within the permissible range from 0 to 255. The problem is that this makes regular expressions long and hard to understand. Fortunately, you generally won’t need to invest much time in learning complex regular expressions like the ones coming up. It’s enough to know which regular expression to use for a particular pattern. Regular expressions for nearly all standard patterns can be downloaded from the Internet. In the following example, we’ll look more closely at a complex regular expression that evidently is entirely made up of the conventional elements listed in Table 13.11:

$ip = "300.400.500.999"
$ip -match "\b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.)" + `
"{3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b"

False

The expression validates only expressions running into word boundaries (the anchor is \b). The following sub-expression defines every single number:

(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)

The construct ?: is optional and enhances speed. After it come three alternatively permitted number formats separated by the alternation construct “|”. 25[0-5] is a number from 250 through 255. 2[0-4][0-9] is a number from 200 through 249. Finally, [01]?[0-9][0-9]? is a number from 0-9 or 00-99 or 100-199. The quantifier “?” ensures that the preceding pattern may, but need not, be present. The result is that the sub-expression describes numbers from 0 through 255. An IP address consists of four such numbers. A dot always follows the first three numbers. For this reason, the following expression includes a definition of the number followed by a dot:

(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}

A dot, (\.), is appended to the number. This construct is supposed to be present three times ({3}). When the fourth number is also appended, the regular expression is complete. You have learned to create sub-expressions (by using parentheses) and how to iterate sub-expressions (by indicating the number of iterations in braces after the sub-expression), so you should now be able to shorten the first used IP address regular expression:

$ip = "10.10.10.10"
$ip -match "\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b"

True

$ip -match "\b(?:\d{1,3}\.){3}\d{1,3}\b"

True
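
The long range-checking expression becomes easier to read when it is assembled from parts. The following sketch stores the sub-expression for one number in a variable (the names $octet and $ipPattern are arbitrary) and lets PowerShell insert it into the full pattern:

```powershell
# One number from 0 through 255
$octet = "(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)"

# Three times "number plus dot", then a fourth number, on word boundaries
$ipPattern = "\b(?:$octet\.){3}$octet\b"

"10.10.10.10"     -match $ipPattern    # True
"300.400.500.999" -match $ipPattern    # False
```

Because the pattern is in double quotes, PowerShell expands $octet before the regular expression is evaluated, so the result is the same expression as the one written out in full.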

Finding Information in Text

Regular expressions can recognize patterns. They can also filter out data corresponding to certain patterns from text. As such, regular expressions are excellent tools for parsing raw data. For example, use the same regular expression as the one above to identify e-mail addresses if you want to extract an e-mail address from a letter. Afterwards, look in the $matches variable to see which results were returned. The $matches variable is created automatically when you use the -match operator (or one of its siblings, like -cmatch).

$matches is a hash table (Chapter 4), so you can either output the entire hash table or access single elements in it by using their names, which you must specify in square brackets:

$rawtext = "If it interests you, my e-mail address is tobias@powershell.com."

# Simple pattern recognition:
$rawtext -match "\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b"

True
# Reading data matching the pattern from raw text:
$matches

Name                           Value
----                           -----
0                              tobias@powershell.com

$matches[0]

tobias@powershell.com

Does that also work for more than one e-mail address in text? Unfortunately, it doesn’t do so right away. The -match operator looks only for the first matching expression. So, if you want to find more than one occurrence of a pattern in raw text, you have to switch over to the RegEx object underlying the -match operator and use it directly.

In one essential respect, the RegEx object behaves unlike the -match operator. Case sensitivity is the default for the RegEx object, but not for -match. For this reason, you must put the “(?i)” option in front of the regular expression to eliminate confusion, making sure the expression is evaluated without taking case sensitivity into account.

# A raw text contains several e-mail addresses. -match finds the first one only:
$rawtext = "test@test.com sent an e-mail that was forwarded to spam@muell.de."
$rawtext -match "\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b"

True
$matches

Name                           Value
----                           -----
0                              test@test.com

# A RegEx object can find any pattern but is case sensitive by default:
$regex = [regex]"(?i)\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b"
$regex.Matches($rawtext)

Groups   : {test@test.com}
Success  : True
Captures : {test@test.com}
Index    : 0
Length   : 13
Value    : test@test.com

Groups   : {spam@muell.de}
Success  : True
Captures : {spam@muell.de}
Index    : 51
Length   : 13
Value    : spam@muell.de

# Limit result to e-mail addresses:
$regex.Matches($rawtext) | Select-Object -Property Value

Value
-----
test@test.com
spam@muell.de

# Continue processing e-mail addresses:
$regex.Matches($rawtext) | ForEach-Object { “found: $($_.Value)” }

found: test@test.com
found: spam@muell.de
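
As an alternative sketch, assuming PowerShell 2.0 or later, the Select-String cmdlet can return every occurrence with its -AllMatches switch. Unlike the RegEx object, Select-String is case insensitive by default, so the “(?i)” option is not needed here:

```powershell
$rawtext = "test@test.com sent an e-mail that was forwarded to spam@muell.de."
$pattern = "\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b"

# -AllMatches collects every hit in the Matches property of the result
$found = $rawtext | Select-String -Pattern $pattern -AllMatches |
    ForEach-Object { $_.Matches } |
    ForEach-Object { $_.Value }

$found
```

$found then holds both addresses, test@test.com and spam@muell.de, as plain strings.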

Searching for Several Keywords

You can use the alternation construct “|” to search for a group of keywords, and then find out which keyword was actually found in the string:

"Set a=1" -match "Get|GetValue|Set|SetValue"

True

$matches

Name                           Value
----                           -----
0                              Set

$matches tells you which keyword actually occurs in the string. But note the order of keywords in your regular expression—it’s crucial because the first matching keyword is the one selected. In this example, the result would be incorrect:

"SetValue a=1" -match "Get|GetValue|Set|SetValue"

True

$matches[0]

Set

Either change the order of keywords so that longer keywords are checked before shorter ones …:

"SetValue a=1" -match "GetValue|Get|SetValue|Set"

True

$matches[0]

SetValue

… or make sure that your regular expression is precisely formulated, and remember that you’re actually searching for single words. Insert word boundaries into your regular expression so that sequential order no longer plays a role:

"SetValue a=1" -match "\b(Get|GetValue|Set|SetValue)\b"

True

$matches[0]

SetValue

It’s true here, too, that -match finds only the first match. If your raw text has several occurrences of the keyword, use a RegEx object again:

$regex = [regex]"\b(Get|GetValue|Set|SetValue)\b"
$regex.Matches("Set a=1; GetValue a; SetValue b=12")

Groups   : {Set, Set}
Success  : True
Captures : {Set}
Index    : 0
Length   : 3
Value    : Set

Groups   : {GetValue, GetValue}
Success  : True
Captures : {GetValue}
Index    : 9
Length   : 8
Value    : GetValue

Groups   : {SetValue, SetValue}
Success  : True
Captures : {SetValue}
Index    : 21
Length   : 8
Value    : SetValue

Forming Groups

A raw text line is often a heaping trove of useful data. You can use parentheses to collect this data in sub-expressions so that it can be evaluated separately later. The basic principle is that all the data that you want to find in a pattern should be wrapped in parentheses because $matches will return the results of these sub-expressions as independent elements. For example, if a text line contains a date first, then text, and if both are separated by tabs, you could describe the pattern like this:

# Defining pattern: two strings of characters separated by a tab
$pattern = "(.*)\t(.*)"

# Generate example line with tab character
$line = "12/01/2009`tDescription"

# Use regular expression to parse line:
$line -match $pattern

True
# Show result:
$matches

Name                           Value
----                           -----
2                              Description
1                              12/01/2009
0                              12/01/2009    Description

$matches[1]

12/01/2009
$matches[2]

Description

When you use sub-expressions, $matches will contain the entire searched pattern in the first array element named “0”. Sub-expressions defined in parentheses follow in additional elements. To make them easier to read and understand, you can assign sub-expressions their own names and later use the names to call results. To assign a name to a sub-expression, type ?<Name> inside the parentheses as the first statement:

# Assign subexpressions their own names:
$pattern = "(?<Date>.*)\t(?<Text>.*)"

# Generate example line with tab character:
$line = "12/01/2009`tDescription"

# Use a regular expression to parse line:
$line -match $pattern

True
# Show result:
$matches

Name                    Value
----                    -----
Text                    Description
Date                    12/01/2009
0                       12/01/2009    Description

$matches.Date

12/01/2009
$matches.Text

Description
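Named sub-expressions become especially handy when several lines need to be parsed. The following is only a sketch under a few assumptions: the second sample line is invented for illustration, the variable names $lines and $records are hypothetical, and New-Object with the -Property parameter requires PowerShell 2.0 or later.

```powershell
# Pattern with two named sub-expressions separated by a tab
$pattern = "(?<Date>.*)\t(?<Text>.*)"

# Two sample lines; the second one is made up for this sketch
$lines = "12/01/2009`tDescription", "12/02/2009`tAnother entry"

# Turn every matching line into an object with Date and Text properties
$records = @(foreach ($line in $lines) {
    if ($line -match $pattern) {
        New-Object PSObject -Property @{
            Date = $matches.Date
            Text = $matches.Text
        }
    }
})

# Show the parsed records
$records
```

Each line that does not match the pattern is simply skipped, so malformed lines do not end up in $records.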

Each result retrieved by $matches for each sub-expression naturally requires storage space. If you don’t need the results, discard them to increase the speed of your regular expression. To do so, type “?:” as the first statement in your sub-expression:

# Don’t return a result for the second subexpression:
$pattern = "(?<Date>.*)\t(?:.*)"

# Generate example line with tab character:
$line = "12/01/2009`tDescription"

# Use a regular expression to parse line:
$line -match $pattern

True
# No more results will be returned for the second subexpression:
$matches

Name                   Value
----                   -----
Date                   12/01/2009
0                      12/01/2009    Description

Further Use of Sub-Expressions

With the help of results from each sub-expression, you can create surprisingly flexible regular expressions. For example, how could you define a Web site HTML tag as a pattern? A tag always has the same structure: <tagname [parameters]>contents</tagname>. This means that a pattern for one particular strictly predefined HTML tag can be found quickly:

"<body>Contents</body>" -match "<body\b[^>]*>(.*?)</body>"

True

$matches[1]

Contents

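The pattern begins with the fixed text “<body”, followed by a word boundary (\b). Any additional characters except “>” may follow ([^>]*), and then the concluding “>”. Next come the contents of the body tag, which may consist of any number of any characters (.*?). The expression, enclosed in parentheses, is a sub-expression and will be returned later as a result in $matches so that you’ll know what is inside the body tag. The concluding part of the tag follows in the form of fixed text (“</body>”).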

This regular expression works fine for body tags, but not for other tags. Does this mean that a regular expression has to be defined for every HTML tag? Naturally not. There’s a simpler solution. The problem is that the name of the tag in the regular expression occurs twice, once initially (“<body>”) and once terminally (“</body>”). If the regular expression is supposed to be able to process any tags, then it would have to be able to find out the name of the tag automatically and use it in both locations. How to accomplish that? Like this:

"<body>Contents</body>" -match "<([A-Z][A-Z0-9]*)[^>]*>(.*?)</\1>"

True

$matches

Name                           Value
----                           -----
2                              Contents
1                              body
0                              <body>Contents</body>

This regular expression no longer contains a strictly predefined tag name and works for any tags matching the pattern. How does that work? The initial tag in parentheses is defined as a sub-expression, more specifically as a word that begins with a letter and that can consist of any additional alphanumeric characters.

([A-Z][A-Z0-9]*)

The name of the tag revealed here must subsequently be repeated in the terminal part. Here you’ll find “</\1>”. “\1” refers to the result of the first sub-expression. The first sub-expression evaluated the tag name and so this name is used automatically for the terminal part.

The following RegEx object could directly return the contents of any HTML tag:

$regexTag = [regex]"(?i)<([A-Z][A-Z0-9]*)[^>]*>(.*?)</\1>"
$result = $regexTag.Matches("<button>Press here</button>")
$result[0].Groups[2].Value + " is in tag " + $result[0].Groups[1].Value

Press here is in tag button
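Since Matches() returns every occurrence, the same kind of RegEx object can presumably report several tags in one pass. The sketch below assumes a simple, non-nested sample string ($html is a made-up example) and is not meant as a general HTML parser:

```powershell
# Case-insensitive pattern: tag name (group 1), contents (group 2),
# and a closing tag that must repeat the name via the back reference \1
$regexTag = [regex]"(?i)<([A-Z][A-Z0-9]*)[^>]*>(.*?)</\1>"

# Hypothetical sample text with two different tags
$html = "<b>bold</b> and <i>italic</i>"

# Build one description per tag that was found
$found = foreach ($match in $regexTag.Matches($html)) {
    "{0} is in tag {1}" -f $match.Groups[2].Value, $match.Groups[1].Value
}
$found
```

Nested tags of the same name would need more care, because the lazy quantifier stops at the first closing tag it meets.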

Greedy or Lazy? Detailed or Concise Results…

Readers who have paid careful attention may wonder why the contents of the HTML tag were defined by “.*?” and not simply by “.*”. After all, “.*” should suffice so that an arbitrary character (char: “.”) can turn up any number of times (quantifier: “*”). At first glance, the difference between “.*” and “.*?” is not easy to recognize, but a short example should make it clear.

Assume that you would like to evaluate month specifications in a logging file, but the months are not all specified in the same way. Sometimes you use the short form, other times the long form of the month name is used. As you’ve seen, that’s no problem for regular expressions, because sub-expressions allow parts of a keyword to be declared optional:

"Feb" -match "Feb(ruary)?"

True
$matches[0]

Feb
"February" -match "Feb(ruary)?"

True
$matches[0]

February

In both cases, the regular expression recognizes the month, but returns different results in $matches. By default, the regular expression is “greedy” and wants to achieve a match in as much detail as possible. If the text is “February,” then the expression will search for a match starting with “Feb” and then continue searching “greedily” to check whether even more characters match the pattern. If they do, the entire (detailed) text is reported back.

However, if your main concern is just standardizing the names of months, you would probably prefer getting back the shortest common text. That’s exactly what the “??” quantifier does, which in contrast to the regular “?” quantifier is “lazy.” As soon as it recognizes a pattern, it returns it without checking whether additional optional characters might also match the pattern.

"Feb" -match "Feb(ruary)??"

True
$matches[0]

Feb
"February" -match "Feb(ruary)??"

True
$matches[0]

Feb

Just what is the connection between the “??” quantifier of this example and the “*?” of the preceding example? In reality, “*?” is not a self-contained quantifier. The trailing “?” simply turns a normally “greedy” quantifier into a “lazy” one. This means you can use “?” to force the quantifier “*” to be “lazy” and to return the shortest possible result. That’s exactly what happened with our regular expressions for HTML tags. You can see how important this is when you use the greedy quantifier “*” instead of “*?”: it will attempt to retrieve a result in as much detail as possible. That can go wrong:

# The greedy quantifier * returns results in as much detail as possible:
"<body>Contents</body></body>" -match "<body\b[^>]*>(.*)</body>"

True
$matches[1]

Contents</body>
# The right quantifier is *?, the lazy one, which returns results that
# are as short as possible
"<body>Contents</body></body>" -match "<body\b[^>]*>(.*?)</body>"

True
$matches[1]

Contents

According to the definition of the regular expression, any characters are allowed inside the tag. Moreover, the entire expression must end with “</body>”. If “</body>” is also inside the tag, the following will happen: the greedy quantifier (“*”), coming across the first “</body>”, will at first assume that the pattern is already completely matched. But because it is greedy, it will continue to look and will discover the second “</body>” that also fits the pattern. The result is that it will take both “</body>” specifications into account, allocate one to the contents of the tag, and use the other as the conclusion of the tag.

In this example, it would be better to use the lazy quantifier (“*?”), which notices when it encounters the first “</body>” that the pattern is already correctly matched and consequently doesn’t go to the trouble of continuing to search. It will ignore the second “</body>” and use the first to conclude the tag.
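The trailing “?” is not limited to “*”. As a small sketch (the sample text and the variable names are made up for illustration), the same suffix can be appended to “+” or to a range quantifier such as “{2,4}”:

```powershell
# Hypothetical sample text containing a run of digits
$text = "Order 12345"

# Greedy: + takes as many digits as possible
$text -match "\d+" | Out-Null
$greedy = $matches[0]

# Lazy: +? is satisfied with a single digit
$text -match "\d+?" | Out-Null
$lazy = $matches[0]

# Lazy range: {2,4}? stops at the minimum of two digits
$text -match "\d{2,4}?" | Out-Null
$lazyRange = $matches[0]

"$greedy / $lazy / $lazyRange"
```

In each case the lazy variant returns the shortest text that still satisfies the pattern.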

Finding String Segments

Entire books have been written about the uses of regular expressions. That’s why it would go beyond the scope of this book to discuss more details. However, our last example, which locates text segments, shows how you can use the elements listed in Table 13.11 to easily harvest surprising search results. Given two words, the regular expression retrieves the text segment that spans them, provided that at least one and no more than six other words lie between them:

"Find word segments from start to end" -match "\bstart\W+(?:\w+\W+){1,6}?end\b"
True
$matches

Name                           Value
----                           -----
0                              start to end
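If the two words are not known in advance, the pattern can be assembled from variables. This is a sketch only: $first and $second are hypothetical variable names, and [regex]::Escape() is used so that any special characters in the words are treated literally.

```powershell
# The two words that delimit the segment (illustrative values)
$first  = "start"
$second = "end"

# Build the pattern: first word, one to six words in between, second word
$pattern = "\b" + [regex]::Escape($first) +
           "\W+(?:\w+\W+){1,6}?" +
           [regex]::Escape($second) + "\b"

# Apply the pattern and keep the located segment
$found = "Find word segments from start to end" -match $pattern
$segment = $matches[0]
$segment
```

Changing the numbers in “{1,6}” adjusts how many words may lie between the two delimiters.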

Replacing a String

You already know how to replace a string because you were already introduced to the -replace operator. Simply tell the operator what term you want to replace in a string and the task is done:

"Hello, Ralph" -replace "Ralph", "Martina"

Hello, Martina

But simple replacement isn’t always sufficient, so you need to use regular expressions for replacements. Some of the following interesting examples show how that could be useful.

Perhaps you’d like to replace several different terms in a string with one other term. Without regular expressions, you’d have to replace each term separately. With regular expressions, you can use the alternation operator, “|”, instead:

"Mr. Miller and Mrs. Meyer" -replace "(Mr\.|Mrs\.)", "Our client"

Our client Miller and Our client Meyer

You can type any term in parentheses and use the “|” symbol to separate them. All the terms will be replaced with the replacement string you specify.

Using Back References

This last example replaces specified keywords anywhere in a string. Often, that’s sufficient, but sometimes you don’t want to replace a keyword everywhere it occurs but only when it occurs in a certain context. In such cases, the context must be defined in some way in the pattern. How could you change the regular expression so that it replaces only the names Miller and Meyer? Like this:

"Mr. Miller, Mrs. Meyer and Mr. Werner" `
-replace "(Mr\.|Mrs\.)\s*(Miller|Meyer)", "Our client"

Our client, Our client and Mr. Werner

The result looks a little peculiar, but the pattern you’re looking for was correctly identified. The only replacements were Mr. or Mrs. Miller and Mr. or Mrs. Meyer. The term “Mr. Werner” wasn’t replaced. Unfortunately, the result also shows that it doesn’t make any sense here to replace the entire pattern. At least the name of the person should be retained. Is that possible?

This is where the back referencing you’ve already seen comes into play. Whenever you use parentheses in your regular expression, the result inside the parentheses is evaluated separately, and you can use these separate results in your replacement string. The first sub-expression always reports whether a “Mr.” or a “Mrs.” was found in the string. The second sub-expression returns the name of the person. The terms “$1” and “$2” provide you the sub-expressions in the replacement string (the number is consequently a sequential number; you could also use “$3” and so on for additional sub-expressions).

"Mr. Miller, Mrs. Meyer and Mr. Werner" `
-replace "(Mr\.|Mrs\.)\s*(Miller|Meyer)", "Our client $2"

Our client , Our client  and Mr. Werner

Strangely enough, at first the back references don’t seem to work. The cause can be found quickly: “$1” and “$2” look like PowerShell variables, but in reality they are placeholders interpreted by the -replace operator. As a result, if you put the replacement string inside double quotation marks, PowerShell will replace “$2” with the PowerShell variable $2, which is normally empty. For replacement with back references to work, you must either put the replacement string inside single quotation marks or add a backtick to the “$” special character so that PowerShell won’t recognize it as its own variable and replace it:

# Replacement text must be inside single quotation marks
# so that PowerShell does not expand $2 as a variable:
"Mr. Miller, Mrs. Meyer and Mr. Werner" -replace `
"(Mr\.|Mrs\.)\s*(Miller|Meyer)", 'Our client $2'

Our client Miller, Our client Meyer and Mr. Werner
# Alternatively, $ can also be masked by `$:
"Mr. Miller, Mrs. Meyer and Mr. Werner" -replace `
"(Mr\.|Mrs\.)\s*(Miller|Meyer)", "Our client `$2"

Our client Miller, Our client Meyer and Mr. Werner
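Back references in the replacement string can also refer to named sub-expressions by writing ${Name}, and numbered references can be used to rearrange parts of a match. The following sketch assumes the group names Title and Name, which are chosen here for illustration, and adds a date example that is not part of the original text:

```powershell
# Named sub-expressions: refer to them as ${Name} in the replacement string
$text = "Mr. Miller, Mrs. Meyer and Mr. Werner"
$pattern = "(?<Title>Mr\.|Mrs\.)\s*(?<Name>Miller|Meyer)"
$byName = $text -replace $pattern, 'Our client ${Name}'
$byName

# Numbered back references can reorder the parts of a match,
# here turning month/day/year into year-month-day
$isoDate = "12/01/2009" -replace "(\d+)/(\d+)/(\d+)", '$3-$1-$2'
$isoDate
```

As with “$2”, the replacement strings are kept in single quotation marks so that PowerShell leaves the “$” expressions untouched.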

Putting Characters First at Line Beginnings

Replacements can also be made in multiple instances in text of several lines. For example, when you respond to an e-mail, usually the text of the old e-mail is quoted in your new e-mail and marked with “>” at the beginning of each line. Regular expressions can do the marking.

However, to accomplish this, you need to know a little more about “multi-line” mode. Normally, this mode is turned off, and the “^” anchor represents the text beginning and the “$” anchor the text ending. So that these two anchors refer to the beginning and ending of each line in a text of several lines, multi-line mode must be turned on with the “(?m)” statement. Only then will -replace substitute the pattern in every single line. Once multi-line mode is turned on, the anchors “^” and “\A”, as well as “$” and “\Z”, no longer behave identically. “\A” will continue to indicate the text beginning, while “^” will mark the line beginning; “\Z” will indicate the text ending, while “$” will mark the line ending.

# Using Here-String to create a text of several lines:
$text = @"
Here is a little text.
I want to attach this text to an e-mail as a quote.
That’s why I would put a “>” before every line.
"@
$text

Here is a little text.
I want to attach this text to an e-mail as a quote.
That’s why I would put a “>” before every line.

# Normally, -replace doesn’t work in multiline mode.
# For this reason, only the first line is replaced:
$text -replace "^", "> "

> Here is a little text.
I want to attach this text to an e-mail as a quote.
That’s why I would put a “>” before every line.

# If you turn on multiline mode, replacement will work in every line:
$text -replace "(?m)^", "> "

> Here is a little text.
> I want to attach this text to an e-mail as a quote.
> That’s why I would put a “>” before every line.

# The same can also be accomplished by using a RegEx object,
# where the multiline option must be specified:
[regex]::Replace($text, "^", "> ", `
[Text.RegularExpressions.RegexOptions]::Multiline)

> Here is a little text.
> I want to attach this text to an e-mail as a quote.
> That’s why I would put a “>” before every line.

# In multiline mode, \A stands for the text beginning
#  and ^ for the line beginning:
[regex]::Replace($text, "\A", "> ", `
[Text.RegularExpressions.RegexOptions]::Multiline)

> Here is a little text.
I want to attach this text to an e-mail as a quote.
That’s why I would put a “>” before every line.
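The same multiline anchor can be used in the opposite direction to strip quote markers again. This is a minimal sketch with a made-up three-line text; the variable names $quoted and $unquoted are hypothetical:

```powershell
# A small text of three lines, separated by line feeds
$text = "line one`nline two`nline three"

# Add "> " at the beginning of every line ...
$quoted = $text -replace "(?m)^", "> "

# ... and remove it again from the beginning of every line
$unquoted = $quoted -replace "(?m)^> ", ""

$quoted
$unquoted
```

Without “(?m)”, the second replacement would only remove the marker from the very first line.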

Removing Superfluous White Space

Regular expressions can perform routine tasks as well, such as removing superfluous white space. The pattern describes a blank character (char: “\s”) that occurs at least twice (quantifier: “{2,}”). That is replaced with a normal blank character.

"Too   many   blank   characters" -replace "\s{2,}", " "

Too many blank characters
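A possible variation, shown here only as a sketch with an invented sample string, also normalizes single tab characters and removes white space at the beginning and end of the text by combining “\s+” with the Trim() method:

```powershell
# Hypothetical text with leading, trailing, and mixed white space (including a tab)
$messy = "  Too`t many   blank  characters  "

# Collapse every run of white space into one blank, then trim both ends
$clean = ($messy -replace "\s+", " ").Trim()
$clean
```

Using “\s+” instead of “\s{2,}” means that a lone tab is converted into a normal blank character as well.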

Finding and Removing Doubled Words

How is it possible to find and remove doubled words in text? Here, you can use back referencing again. The pattern could be described as follows:

"\b(\w+)(\s+\1){1,}\b"

The pattern begins at a word boundary (anchor: “\b”). It consists of one word (the character “\w” and quantifier “+”), which is captured as the first sub-expression. White space follows (the character “\s” and quantifier “+”), then the same word again (back reference “\1”). This group, the white space and the repeated word, must occur at least once (quantifier “{1,}”). The entire pattern is then replaced with the first back reference, that is, the first located word.

# Find and remove doubled words in a text:
"This this this is a test" -replace "\b(\w+)(\s+\1){1,}\b", '$1'

This is a test
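If you would rather report doubled words than remove them, a RegEx object with the same pattern could list them. In this sketch the sample sentence is invented, and “(?i)” is added because a RegEx object, unlike -replace, is case-sensitive by default:

```powershell
# Same pattern as above, made case-insensitive for the RegEx object
$regex = [regex]"(?i)\b(\w+)(\s+\1)+\b"

# Hypothetical sample sentence with three doubled words
$sentence = "This this is is a test test test"

# The first sub-expression holds the word that was repeated
$doubled = $regex.Matches($sentence) | ForEach-Object { $_.Groups[1].Value }
$doubled
```

The Index property of each match could additionally be used to show where in the text the repetition occurs.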

(source : http://powershell.com/cs/blogs/ebook/archive/2009/03/30/chapter-13-text-and-regular-expressions.aspx#regular-expressions)