(This post is a more in-depth follow-up to my original blog entry on the subject: Enabling Custom Phone Number Normalization with the Address Book Service.)
I recently took a short hiatus from the OCS TechNet discussion forums while concentrating on an Exchange project for the past few months. While catching up on threads, I’ve noticed that there is still quite a lot of confusion about both how OCS handles phone numbers in the directories and how to read the normalization rules. This is not really surprising: the product documentation is a little light in these areas, and the sample Address Book Normalization file probably couldn’t be any more confusing. Reverse-engineering some of those sample rules is difficult, and the patterns used don’t even match how PSS recommended they be written. (In my personal experience, at least.)
So I’ll start off with a quick outline of how Office Communications Server 2007 and the Office Communicator 2007 (2.0) client handle phone numbers, and then dive into some basic and complex translation rules.
Behavior
The OC 2007 Enhanced Presence Model White Paper has a very important paragraph on page 7, which basically states that many of the Active Directory attributes (Title, Work Phone, Mobile Phone, Home Phone, Other Phone, Company, Office, Work Address, and SharePoint Site) are visible to all contacts in the company regardless of what Access Level is granted to an individual. For Federated contacts, Access Levels still control the sharing of this AD-populated data, but PIC contacts will never see any of these fields. I believe the idea is that if all that information is already populated in AD, and thus viewable by all users in the Global Address List via Outlook, then there is no reason to hide it. But I think this should still be customizable behavior, as there is a big difference between looking at someone’s phone number in the Address Book and accidentally one-click dialing your VP’s mobile phone when using the Office Communicator client. This default behavior has caused a scramble to remove non-work phone numbers from AD during some client deployments.
That said, let’s look at how OCS handles and processes a specific telephone number, such as a user’s Home Telephone number:
The AD attribute homePhone is already populated:
- The Address Book Service processes the value and attempts to normalize it.
  - If correctly normalized then the number is inserted into the OCS address book.
    - Communicator will only display this normalized number if it is properly formatted in E.164 (+13125551234).
      - All OC clients will see this number regardless of their access level.
      - Users can not disable the publishing of these numbers with their client Phones options.
    - If not in E.164 format then the number will not appear in OC.
  - If the number fails to normalize then it will not be written into the OCS address book.
The AD attribute homePhone is not populated:
- The Address Book Service ignores this attribute.
- A user can enter their Home Phone number and choose to publish it.
  - Note: this does not publish the number into Active Directory, only the OCS address book.
  - Only contacts added to the user’s Personal Access Level will see this number in Communicator.
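The same decision path can be sketched in a few lines of Python. This is only my reading of the observed behavior, not how OCS is implemented: the function name, the simplified E.164 check (a plus sign followed by digits), and the final fallback string are assumptions made for illustration.

```python
import re

# Simplified E.164 check (assumption): a plus sign followed only by digits.
E164 = re.compile(r"\+\d+")

def home_phone_outcome(ad_value, normalized_value, user_published_value=None):
    """Describe what Communicator shows for a user's Home Phone.

    ad_value             -- raw homePhone value from AD, or None if empty
    normalized_value     -- result of ABS normalization, or None if no rule matched
    user_published_value -- number the user typed into the OC Phones options
    """
    if ad_value:
        # Attribute is populated, so the Address Book Service processes it.
        if normalized_value is None:
            return "not written to the OCS address book"
        if not E164.fullmatch(normalized_value):
            return "in the address book, but not displayed in Communicator"
        return "visible to all OC clients regardless of access level"
    # Attribute is empty: the ABS ignores it and only the client can publish.
    if user_published_value:
        return "visible only to contacts at the Personal access level"
    return "no number shown"  # fallback added for completeness (assumption)

print(home_phone_outcome("(312) 555-1234", "+13125551234"))
print(home_phone_outcome("555-2299", None))
print(home_phone_outcome(None, None, "+13125551234"))
```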
I’ve created this flowchart to better illustrate the observed behavior:
Understanding Normalization Rules
It is important to note that OCS has two places where phone number normalization rules can be utilized, and it depends on the type of deployment: Enterprise Voice (EV) or Remote Call Control (RCC).
- When utilizing RCC, any rules added to an EV dial plan are completely ignored by OCS; the Address Book Service must be used for normalization. The opposite holds true for EV deployments, but only to a certain extent: dialing behavior isn’t affected in the same way, yet the appearance of numbers in internal directories still needs to be considered.
- Rules created in the dial plan for EV need to be encapsulated with ^ and $ characters, but these are not required in the Address Book Service’s configuration file (Company_Phone_Number_Normalization_Rules.txt) as the ABS automatically inserts them when it processes the file.
- Rules created for the Address Book Service are applied to all numbers in Active Directory by the ABS, while numbers entered into the OC client and pulled from a user’s Outlook Contacts are processed by these same rules, but by the OC client itself. (The rules are downloaded to the client and stored in the registry during sign-in.)
The single best benefit of utilizing the Address Book Service for normalizing numbers comes when, for some reason, the data cannot be corrected at the source: Active Directory. For the best OCS experience it is recommended to reformat AD phone number attributes to the standard E.164 format, but if this is not an option then the ABS can be used to ‘fix’ the numbers so that OCS can at least display and dial them as needed in a specific implementation. (Keep in mind that reverse-number lookup may be adversely affected.) Due to the E.164 requirement in OCS you may need to create additional rules on the connected PBX system to drop leading characters that OCS sends, in case the PBX is expecting numbers in a different format. Typically the PBX is expecting numbers in either a 10-digit + prefix format for external calls (918005551212) or a short format denoting an internal extension (2454). If OCS is normalizing and sending numbers as +18005551212 and +2454 then the PBX would need rules to strip the +1 and replace it with 91 for external calls (assuming 9 is needed to dial out in this scenario) and just strip the + from the internal call.
The foundation for understanding and creating custom normalization patterns is Regular Expressions, which are special text strings describing search patterns. Each normalization rule is comprised of two strings, the Phone Pattern and the Translation Pattern. The Phone Pattern is written to match the incoming phone number, depending on its source (AD, Outlook Contact, manually entered into the OC Find bar, etc.), while the Translation Pattern is how we want the outgoing number to be formatted.
If this idea is completely foreign then I suggest following the link above to read up on Regular Expressions, as well as reviewing the OCS Deployment Documentation and other online resources. Assuming that you understand how expressions like [0-9] and \d{4} work, then let’s move on.
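To make the PBX-side translation concrete, here is a minimal Python sketch of the two rewrites described above. It is not PBX configuration syntax; the function name, the `outside_prefix` parameter, and the assumption that internal extensions are exactly four digits are mine and would need to be adapted to the actual dial plan.

```python
import re

def to_pbx_format(ocs_number, outside_prefix="9"):
    """Rewrite a number sent by OCS into the form the PBX expects."""
    # External call: +1 followed by 10 digits -> outside prefix + 1 + 10 digits.
    external = re.fullmatch(r"\+1(\d{10})", ocs_number)
    if external:
        return outside_prefix + "1" + external.group(1)
    # Internal call: + followed by a 4-digit extension -> extension only.
    internal = re.fullmatch(r"\+(\d{4})", ocs_number)
    if internal:
        return internal.group(1)
    # Anything else is passed through unchanged in this sketch.
    return ocs_number

print(to_pbx_format("+18005551212"))  # 918005551212
print(to_pbx_format("+2454"))         # 2454
```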
Creating Address Book Rules
Most of the OCS documentation covers simple expression patterns that handle incoming number strings as dialed numbers, so something like ^(312)(\d{7})$ will correctly handle the string 3125551234. But it will not handle a number pulled from Active Directory in the format (312) 555-1212. Since the Address Book Service needs to read in potentially thousands and thousands of phone numbers from Active Directory, the format of those attributes yet again becomes important. So again, if you can get E.164 format forced throughout AD then you are already ahead of the game, but if that is not possible then things could get a bit messy. Some companies allow non-Administrators to maintain phone numbers by creating a custom web portal that allows the control of certain attributes. These solutions typically force a standard format throughout all values of a specific attribute. This is beneficial in that a simple phone pattern can be used since the format is known.
Let’s say your particular AD infrastructure is not so standardized and there are any number of privileged individuals entering phone number attributes in whatever format their hearts desire: some with parentheses, some without, some with dashes, some with excessive spaces, etc. This could create a mountain of carefully-ordered normalization rules for all the possible formats that the ABS would be required to deal with. Luckily a single expression string can be written to handle almost any possible combination of characters, assuming at least the correct number of digits and order is used.
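A quick way to see the difference is to try the simple pattern against both styles of input. The snippet below uses Python's `re` module purely as a stand-in for the regular expression engine OCS uses; the helper name is hypothetical.

```python
import re

# Simple dial-plan style pattern: 312 followed by exactly seven digits.
SIMPLE_PATTERN = r"^(312)(\d{7})$"

def matches_simple(number):
    """Return True if the number matches the simple dialed-digits pattern."""
    return re.search(SIMPLE_PATTERN, number) is not None

print(matches_simple("3125551234"))      # True: digits only, as dialed
print(matches_simple("(312) 555-1212"))  # False: AD-style formatting
```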
##
## Normalize all AD phone numbers to E.164
##
\+?[\s()\-\./]*1?[\s()\-\./]*\(?\s*(\d\d\d)\s*\)?[\s()\-\./]*(\d\d\d)[\s()\-\./]*(\d\d\d\d)[\s]*
+1$1$2$3
Now if we dissect the entire rule, it’s much easier to understand exactly what it is doing each step of the way:
EXPRESSION | ACTION |
\+? | Ignore the first character if it is a + |
[\s()\-\./]* | Match any immediately following characters if they are a space, parenthesis, dash, period, or slash |
1? | Ignore the next character if it is a 1 |
[\s()\-\./]* | Match any immediately following characters if they are a space, parenthesis, dash, period, or slash |
\(? | Ignore the next character if it is an open parenthesis |
\s* | Ignore any number of repeated spaces |
(\d\d\d) | Capture the first 3 digits and store as the first variable. |
\s* | Ignore any number of repeated spaces |
\)? | Ignore the next character if it is a closed parenthesis |
[\s()\-\./]* | Match any immediately following characters if they are a space, parenthesis, dash, period, or slash |
(\d\d\d) | Capture the next 3 digits and store as the second variable. |
[\s()\-\./]* | Match any immediately following characters if they are a space, parenthesis, dash, period, or slash |
(\d\d\d\d) | Capture the last 4 digits and store as the third variable. |
[\s]* | Ignore any number of repeated spaces |
+1 | Insert +1 into the translation pattern |
$1 | Insert the value of the first captured variable |
$2 | Insert the value of the second captured variable |
$3 | Insert the value of the third captured variable |
Using this very flexible rule, strings like (312) 555-1234 or +1 (312)555 - 1234 or even something wacky like +--1( ( ))312 . 555)--1234-)(.- . .) would all be normalized into +13125551234. Now that we have a very general rule designed to normalize AD phone numbers into a format that will both correctly populate the OCS address book, and display correctly in the OC client, let’s look at creating some more specific rules to handle proper routing of internal numbers and maybe some local exchange or local area code numbers.
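For anyone who wants to experiment with the general rule outside of OCS, here is a rough Python sketch. It assumes the pattern is written with its backslash escapes (\+, \s, \d and so on), uses `re.fullmatch` to imitate the ^ and $ anchors the ABS adds, and converts the `$1`-style translation pattern into Python's `\g<1>` syntax. The `normalize` helper is hypothetical and Python's regex engine is not the one OCS uses, so treat the results as indicative only.

```python
import re

# Phone Pattern and Translation Pattern from the general rule.
PHONE_PATTERN = (
    r"\+?[\s()\-\./]*1?[\s()\-\./]*\(?\s*(\d\d\d)\s*\)?"
    r"[\s()\-\./]*(\d\d\d)[\s()\-\./]*(\d\d\d\d)[\s]*"
)
TRANSLATION_PATTERN = "+1$1$2$3"

def normalize(number, pattern=PHONE_PATTERN, translation=TRANSLATION_PATTERN):
    """Apply one rule; return the translated number or None if no match."""
    match = re.fullmatch(pattern, number)  # fullmatch stands in for ^...$
    if match is None:
        return None
    # Convert $1, $2, ... into Python's \g<1>, \g<2>, ... group references.
    template = re.sub(r"\$(\d)", r"\\g<\1>", translation)
    return match.expand(template)

for raw in ["(312) 555-1234", "+1 (312)555 - 1234", "312.555.1234", "555-1234"]:
    print(repr(raw), "->", normalize(raw))
```

The first three inputs come back as +13125551234, while the 7-digit value returns None because the rule requires ten digits.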
Assume your company has a disconnected number space, which is quite typical given future expansion or changes in local or incumbent exchange carriers. Here are some examples for different contiguous number blocks which are translated into 4-digit extensions for internal PBX routing.
#
# 312-555-9500…9599
#
\(?\s*(312)\s*\)?[\s()\-\./]*(555)[\s()\-\./]*(95\d\d)[\s]*
$3
#
# 312-555-3540…3569
#
\(?\s*(312)\s*\)?[\s()\-\./]*(555)[\s()\-\./]*(35[4-6]\d)[\s]*
$3
#
# 312-555-8120…8127
#
\(?\s*(312)\s*\)?[\s()\-\./]*(555)[\s()\-\./]*(812[0-7])[\s]*
$3
These rules will allow OCS to send only the 4-digit extensions to the PBX when dialing numbers within those ranges, keeping the call routing internal.
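It is worth checking the edges of each block, since an off-by-one in a character class is easy to miss. The Python sketch below assumes the three patterns are written with their backslash escapes and applies them with `re.fullmatch` to imitate the anchors the ABS adds; the `to_extension` helper and the shared `SEP` and `PREFIX` fragments are just conveniences for this example.

```python
import re

SEP = r"[\s()\-\./]*"                                 # separator characters
PREFIX = r"\(?\s*(312)\s*\)?" + SEP + r"(555)" + SEP  # area code and exchange

EXTENSION_RULES = [
    PREFIX + r"(95\d\d)[\s]*",     # 312-555-9500 through 9599
    PREFIX + r"(35[4-6]\d)[\s]*",  # 312-555-3540 through 3569
    PREFIX + r"(812[0-7])[\s]*",   # 312-555-8120 through 8127
]

def to_extension(number):
    """Return the 4-digit extension ($3) if any range rule matches."""
    for pattern in EXTENSION_RULES:
        match = re.fullmatch(pattern, number)
        if match:
            return match.group(3)
    return None

for raw in ["(312) 555-3540", "(312) 555-3569", "(312) 555-3570", "312-555-8128"]:
    print(raw, "->", to_extension(raw))
```

The first two values fall inside the 3540 to 3569 block and return their extensions; the last two sit just outside a block and return None. As written, these patterns begin at the area code, so a value stored with a leading +1 would not match them in this sketch.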
Testing the Rules
The configuration file allows for simple testing of the rules, as can be seen at the end of the ABS sample configuration file installed by default in OCS. In order for the test rules to function they must be commented out. Simply enter the TestInput value to match exactly how a number would be stored in AD, and then enter the expected result as the TestResult value.
#
# Test strings used with the "abserver.exe -testPhoneNorm" command to verify each rule
#
# (All Test strings below should be commented-out for proper operation, do not remove the initial ‘#’)
#TestInput: (312) 555-9500 TestResult: 9500
#TestInput: (312) 555-3551 TestResult: 3551
#TestInput: (312) 555-8126 TestResult: 8126
By executing the abserver.exe -testPhoneNorm command, each rule included in the configuration file will be processed, top-down, to look for the best matching normalization rule and then return the results:
If the returned results match the expected TestResult parameter, then Test PASSED would be the result. If the test fails, then look at the actual result to see whether there is a problem with the normalization rule itself or with the order of the rules in the configuration file. The first rule from the top that fits the pattern will be used, so make sure to put the most specific rules toward the top and the most generic toward the bottom.
Each time the Address Book Service regenerates (1:30AM by default) it may create a new Invalid_AD_Phone_Numbers.txt file in the same Files subdirectory where the configuration and client address book files are stored. Each attribute value for which the ABS was unable to find a matching normalization rule will be written to this file.
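The effect of rule order is easy to reproduce. The following Python sketch is a stand-in for the first-match behavior described here, not a reimplementation of abserver.exe: the `apply_rules` and `run_tests` helpers are hypothetical, the patterns are assumed to carry their backslash escapes, and the PASSED/FAILED labels only mimic the idea of the real output.

```python
import re

SEP = r"[\s()\-\./]*"
SPECIFIC = (r"\(?\s*(312)\s*\)?" + SEP + r"(555)" + SEP + r"(95\d\d)[\s]*", "$3")
GENERIC = (
    r"\+?" + SEP + r"1?" + SEP + r"\(?\s*(\d\d\d)\s*\)?"
    + SEP + r"(\d\d\d)" + SEP + r"(\d\d\d\d)[\s]*",
    "+1$1$2$3",
)

def apply_rules(number, rules):
    """Return the translation from the first rule that matches, top-down."""
    for pattern, translation in rules:
        match = re.fullmatch(pattern, number)
        if match:
            template = re.sub(r"\$(\d)", r"\\g<\1>", translation)
            return match.expand(template)
    return None

def run_tests(rules, tests):
    """Pair each (TestInput, TestResult) with the actual result."""
    return [(inp, want, apply_rules(inp, rules)) for inp, want in tests]

TESTS = [("(312) 555-9500", "9500"), ("(312) 555-1234", "+13125551234")]

for label, rules in [("specific first", [SPECIFIC, GENERIC]),
                     ("generic first", [GENERIC, SPECIFIC])]:
    for inp, want, got in run_tests(rules, TESTS):
        status = "PASSED" if got == want else "FAILED"
        print(label, "|", inp, "->", got, "|", status)
```

With the specific rule on top both tests pass. With the generic rule on top, (312) 555-9500 is caught by it first and comes back as +13125559500 instead of 9500.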
Unmatched number: User: ‘jeff’ AD Attribute: ‘homePhone’ Number: ‘555-2299’
Unmatched number: User: ‘jeff’ AD Attribute: ‘telephoneNumber’ Number: ‘4774’
These numbers are not in a 10-digit format; they either need to be fixed in AD or additional normalization rules need to be added to handle the 7- and 4-digit formats.
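When this file grows large, a quick summary by digit count shows which extra rules would cover the most entries. The Python sketch below assumes each line looks like the two examples above and accepts either straight or curly quotes around the values; the real file may differ in detail, and the helper name is hypothetical.

```python
import re
from collections import Counter

QUOTE = "['\u2018\u2019]"  # straight or curly single quote
LINE = re.compile(
    "Unmatched number: User: " + QUOTE + "(?P<user>.*?)" + QUOTE
    + " AD Attribute: " + QUOTE + "(?P<attr>.*?)" + QUOTE
    + " Number: " + QUOTE + "(?P<number>.*?)" + QUOTE
)

def digit_length_summary(lines):
    """Count unmatched numbers by how many digits each one contains."""
    counts = Counter()
    for line in lines:
        match = LINE.search(line)
        if match:
            digits = re.sub(r"\D", "", match.group("number"))
            counts[len(digits)] += 1
    return counts

sample = [
    "Unmatched number: User: \u2018jeff\u2019 AD Attribute: \u2018homePhone\u2019 Number: \u2018555-2299\u2019",
    "Unmatched number: User: \u2018jeff\u2019 AD Attribute: \u2018telephoneNumber\u2019 Number: \u20184774\u2019",
]
for length, count in sorted(digit_length_summary(sample).items()):
    print(length, "digits:", count)
```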