In the January 2013 Cumulative Update for Lync Phone Edition (a.k.a. CU7) Microsoft changed a key element of the client registration process. Any Lync Phone Edition (LPE) firmware version 4.0.7577.4363 or newer is affected by the behavior described in this article.
Starting with this release the devices adhere to Strict DNS Naming during the Lync server discovery process, a check which was not present in previous versions. This means that if the required Service Locator (SRV) record and accompanying Host (A) record for a SIP domain are not configured in the same domain namespace then the phone will not be able to sign in successfully.
Background
In previous Office Communications Server releases it was not possible to configure the Automatic Configuration records in different domain namespaces, as the Office Communicator clients would fail to sign in. Although it was possible to disable Strict DNS Naming, doing so would not actually allow the clients to work in this scenario because this capability was limited to supporting domain suffixes. In other words, if strict naming was disabled then the clients would sign in only if the names were confined to the same root namespace. For example, domain.com and child.domain.com would be acceptable, but mismatched names in entirely different namespaces like contoso.com and nwtraders.com would still not be supported.
Traditionally when the term ‘Strict DNS Naming’ is used in Lync the following configuration would be applicable. Both the SRV and Host records are configured in the same domain namespace. This configuration model has been a best practice for a long time now and was a requirement for OCS.
_sipinternaltls._tcp.mslync.net > sip.mslync.net
But Lync 2010 introduced the flexibility to handle mismatched records in completely different namespaces and the Lync clients would still connect to the server, albeit followed by a prompt asking the user whether or not to trust the server. Normally this configuration would not be ideal, but it can be useful in service provider scenarios or when certificate Subject Alternative Name space is limited and additional entries for numerous SIP domain names are undesirable.
_sipinternaltls._tcp.mslync.net > sip.schertz.name
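The difference between the two record pairings above comes down to whether the SRV target sits inside the SIP domain's namespace. A minimal Python sketch of such a suffix comparison follows. The function name matches_sip_domain is hypothetical and this is not the actual client or firmware logic, only an illustration of the naming rule as described in this article.

```python
def matches_sip_domain(sip_domain: str, server_fqdn: str) -> bool:
    """Return True when server_fqdn is the SIP domain or a host beneath it.

    Illustrative only: a plain, case-insensitive suffix comparison.
    """
    domain = sip_domain.strip().rstrip(".").lower()
    host = server_fqdn.strip().rstrip(".").lower()
    # The leading dot prevents "notmslync.net" from matching "mslync.net".
    return host == domain or host.endswith("." + domain)


if __name__ == "__main__":
    # Matched pairing: SRV for mslync.net points to sip.mslync.net
    print(matches_sip_domain("mslync.net", "sip.mslync.net"))
    # Mismatched pairing: SRV for mslync.net points to sip.schertz.name
    print(matches_sip_domain("mslync.net", "sip.schertz.name"))
```

Under this sketch the first pairing passes and the second does not, which mirrors the strict versus mismatched configurations shown above.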
Lync Phone Edition Behavior
As mentioned, the mismatched configuration is no longer allowed as of the CU7 release for the various Lync Phone Edition devices. This limitation applies not only to SIP registration to the Lync server but also to Exchange integration. If the same cross-domain name configuration is in place when the phone performs Exchange Server Autodiscovery then the device will be unable to authenticate to the Exchange server and will display the dreaded exclamation mark icon indicating an Exchange integration failure.
This behavior is as-designed and will not be reversed, according to the Lync product team. Lync Phone Edition devices are only supported for on-premises and Office 365 Lync Online deployments; Lync Hosting Pack and other multitenancy environments are not officially supported, hence the requirement to adhere to strict DNS naming practices.
Update [7/2/2014]
After receiving some conflicting reports about whether or not Microsoft has changed this behavior in later releases, some further testing was performed and an additional, interesting behavior was discovered.
Firstly, the behavior has not changed since CU7 (4.0.7577.4372) through the latest release of CU12 in April 2014 (4.0.7577.4444). Microsoft has documented this in Knowledge Base Article ID 2933146 and their guidance is the same as above: ‘fix’ the SRV/A record pairing so that the names match, as is best practice.
But if this is not addressed there is still another method that Lync Phone Edition can use to connect successfully: one of the Host (A) fallback records like sip.domain.com or sipinternal.domain.com. Lync Phone Edition will attempt to resolve all DNS SRV and A records, and in the event that the provided SRV record does not pass the strict naming check the device will fall back to using the hardcoded host records. Thus in the example above, if the records are left mismatched, simply create an additional host record of sip.mslync.net and make sure that this FQDN is included in the Lync Server certificate. This is yet another reason to continue to include the recommended (but not mandatory) sip.<sipdomain> record in Lync deployments, even if it’s not leveraged directly by the SRV record.
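The fallback behavior described above can be sketched roughly in Python. This is an assumption-laden illustration, not the firmware's implementation: the function names are hypothetical, DNS resolution is replaced by a caller-supplied set of resolvable host names, and the order in which the two fallback names are tried is a guess.

```python
from typing import Iterable, Optional


def _norm(name: str) -> str:
    """Normalize a DNS name for comparison."""
    return name.strip().rstrip(".").lower()


def in_namespace(sip_domain: str, host: str) -> bool:
    """Strict naming check: host must be the SIP domain or sit beneath it."""
    domain, host = _norm(sip_domain), _norm(host)
    return host == domain or host.endswith("." + domain)


def pick_registrar(sip_domain: str,
                   srv_target: Optional[str],
                   resolvable: Iterable[str]) -> Optional[str]:
    """Return the first candidate that resolves and passes the naming check."""
    domain = _norm(sip_domain)
    known = {_norm(h) for h in resolvable}
    candidates = []
    if srv_target:
        candidates.append(_norm(srv_target))
    # Fallback host names mentioned in the article; the order is an assumption.
    candidates += [f"sipinternal.{domain}", f"sip.{domain}"]
    for host in candidates:
        if host in known and in_namespace(domain, host):
            return host
    return None


if __name__ == "__main__":
    # Mismatched SRV target, but a sip.mslync.net A record exists.
    print(pick_registrar("mslync.net", "sip.schertz.name",
                         ["sip.schertz.name", "sip.mslync.net"]))
```

With the mismatched SRV record from the earlier example, the sketch skips sip.schertz.name and settles on sip.mslync.net once that host record exists. Without the extra record it returns None, which corresponds to the failed sign-in.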
There is an exception to this behavior of strict domain matching:
As seen in the logs, when Lync Phone Edition gets a server name from the autodiscover records and the received server name is in the outlook.com or lync.com domain, the phone firmware bypasses the mismatch check and trusts this server.
This is how Lync Phone Edition can work with Office 365.
All other domain names are required to match the SIP address.
Yes, thanks for adding this info. As of CU7 Microsoft updated LPE to support Office 365 connectivity, but this doesn't help with any other hosting providers or other multi-tenancy deployments where the DNS Service Locator records for every individual SIP domain point to the same single Front End, Director, or Access Edge FQDN.
This is the login process from the phone log file:
WARN :: NModel::CTrustModelManager::LoadFromString: Read 2 domains from string data: outlook.com, lync.com
WARN :: NModel::CTrustModelManager::LoadFromStringArray: Trust model loaded: 00ADE500 domain=outlook.com, trustState=1, persist=0
WARN :: NModel::CTrustModelManager::LoadFromStringArray: Trust model loaded: 00ADE560 domain=lync.com, trustState=1, persist=0
WARN :: NModel::CTrustModelManager::LoadFromString: Reconstructed 2 trust models from the string data
….
INFO :: DoesDomainMatchServer: DoesDomainMatchServer-no match(USERSIPDOMAIN.COM, edge.HOSTER.COM)
INFO :: DoesDomainMatchServer: DoesDomainMatchServer, ret=0, (outlook.com, edge.HOSTER.COM)
INFO :: DoesDomainMatchServer: DoesDomainMatchServer-no match(lync.com, edge.HOSTER.COM)
WARN :: NModel::CTrustModelManager::LookupTrustModel: Trust model for server edge.HOSTER.COM not found. hr=0x80ee0058
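For anyone digging through similar phone logs, the DoesDomainMatchServer entries can be pulled out with a short script. Below is a minimal Python sketch that assumes only the two line formats seen in the excerpt above ("-no match(...)" and ", ret=N, (...)"). The names MatchCheck and find_match_checks are hypothetical, and the ret value is reported as-is rather than interpreted, since its meaning is not documented here.

```python
import re
from typing import Iterable, Iterator, NamedTuple, Optional


class MatchCheck(NamedTuple):
    domain: str          # domain the phone compared against
    server: str          # server FQDN being checked
    ret: Optional[int]   # numeric ret= value when logged, otherwise None
    no_match: bool       # True when the line uses the "-no match" form


# Covers "DoesDomainMatchServer-no match(a, b)" and
# "DoesDomainMatchServer, ret=0, (a, b)" as seen in the log excerpt.
_PATTERN = re.compile(
    r"DoesDomainMatchServer(?:(-no match)|, ret=(\d+), )\(([^,]+), ([^)]+)\)"
)


def find_match_checks(lines: Iterable[str]) -> Iterator[MatchCheck]:
    """Yield one MatchCheck per log line that records a domain comparison."""
    for line in lines:
        m = _PATTERN.search(line)
        if m:
            no_match, ret, domain, server = m.groups()
            yield MatchCheck(domain.strip(), server.strip(),
                             int(ret) if ret is not None else None,
                             no_match is not None)


if __name__ == "__main__":
    sample = [
        "INFO :: DoesDomainMatchServer: DoesDomainMatchServer-no match(USERSIPDOMAIN.COM, edge.HOSTER.COM)",
        "INFO :: DoesDomainMatchServer: DoesDomainMatchServer, ret=0, (outlook.com, edge.HOSTER.COM)",
        "WARN :: NModel::CTrustModelManager::LookupTrustModel: Trust model for server edge.HOSTER.COM not found. hr=0x80ee0058",
    ]
    for check in find_match_checks(sample):
        print(check)
```

Lines that do not record a comparison, such as the LookupTrustModel warning, are simply skipped.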
Useful post!
Thank you! I spent the last 3 weeks trying to figure out why brand new phones would not register. God Bless Microsoft for undocumented changes. I thought I had a certificate issue.
Does this update impact Lync 2010 deployments with resilient SRV records pointing to a primary and backup pool? I've had a setup functioning just fine and recently phone registrations have failed. It's telling that we updated to CU8 firmware around the time of the issue. In a lab test I tried, if I point the SRV to sip.company.com instead of pool.company.com then CU8 works, but when I used pool.company.com login failed. I am wondering if the CU7+ release is running an even more stringent test on the SRV pointer.
Have you tried a deployment with resilient SRV records?
Jed
As far as I know it is only applicable when the records do not match domain names; this may be another issue altogether.
Other than posting this comment, is there a place we can register our support for a "rollback"? I appreciate Microsoft's focus on security but unannounced breaking changes to the product make it very hard to drive Lync adoption. If they must do something, the behavior in the phone should be the same as the traditional clients, a prompt to acknowledge.
I have several users who need the MWI updates in the April 2013 CU but I can't release it because it breaks Exchange integration. Better together indeed!
As far as I know Microsoft should be fixing this issue in a future cumulative update but I have no confirmation of this. I suggest opening a support ticket with Microsoft to register your request with them directly. The more open cases they have for this issue the sooner it can be addressed.
In the meantime, are all new Polycom phones shipping with this latest firmware installed?
No, there will still be an older 4.x software release on the phone for some time yet.
Has anyone tested the October CU to see if this issue is addressed?
http://support.microsoft.com/kb/2889246
Microsoft has thus far opted to leave the configuration as is. Since LPE is not supported in LHP (Lync Hosting Pack) multi-tenancy environments the need to support this is not really imperative. At this point I do not see this behavior changing and LPE will probably continue to force a strict DNS naming configuration, which is always best practice.
I have named my Exchange autodiscover hostname so that it matches the AD domain of both my Lync FE and Exchange server. Based on the verbose logging, I had expected that to work. Unfortunately, it still doesn't seem to be trusted.
NAutoDiscover::DnsAutodiscoverTask::PopulateAutodiscoverUrlsFromDnsSrv: SRV record found for record, _autodiscover._tcp.xyztr.com, value, autodiscover.xyzre.net
DoesDomainMatchServer: DoesDomainMatchServer, ret=1, (xyzre.net, autodiscover.xyzre.net)
NAutoDiscover::DnsAutodiscoverTask::TryAutodiscoverUrls: Server is autodiscover.xyzre.net not trusted, hr=0x0.
NAutoDiscover::DnsAutodiscoverTask::PerformAutodiscovery: DNS autodiscover failed
NAutoDiscover::AutodiscoverTaskBase::OnExecution: Autodiscovery failed. hr=0x80004005.
Does anybody know what the return codes for DoesDomainMatchServer() mean? Should I be getting a 1 for this?
Hi, Great Info by the way, Jeff Schertz is AMAZING!!
Anyway, I am looking for a workaround for the issue above as I am experiencing the same issue when deploying Common Area Phones: the FE DNS namespace (AD DNS namespace) is different from the users' SIP domain. Is there any way we can disable DNS Strict Name Matching using GPO for the Contact Objects for each CAP that is on CU8?
Thanks
Mem
Thanks. There is no way to change this behavior but you should be using matched SRV/A records even internally in your SIP domain to support automatic sign-in regardless. The Pool name can still be in your AD namespace but you should still have a sip.<sipdomain> record to point the SRV record to.
That's what I thought. The customer has their SRV record _sipinternaltls._tcp.abc.com pointing to the A records of all Director servers (10 in total), dir1.ad.domain.local through dir10.ad.domain.local.
The sip.abc.com A record points to only 2 host A records, dir1.ad.domain.local and dir2.ad.domain.local.
So my initial understanding was correct and basically DNS is not configured correctly?
I am seeing the following in the CU8 device logs. Is this due to the fact that the SRV record points to the Directors mentioned above and the Directors in turn return the FE pool FQDN to the client?
DoesDomainMatchServer: DoesDomainMatchServer-no match(abc.com, fepool1.ad.domain.local)
WARN :: NModel::CTrustModelManager::LookupTrustModel: Trust model for server fepool1.ad.domain.local not found. hr=0x80ee0058
Correct, the DNS records need to be fixed in order to support LPE in CU7 or CU8.
Thank you Jeff Schertz, I am honoured to have your presence. I will Lab this tonight! Thanks again.
Does anybody know if this is already fixed? I have the latest CU (build 4414) and it does not seem to be affected by this (Aastra 6731ip), but older versions on the Polycom CX600 are unable to sign in. Also, how can a Lync device be updated if the device is unable to sign in?
This behavior has not been changed as of CU10. To update the phones without registration see this article: http://blog.schertz.name/2013/05/updating-lync-ph…
And one question comes to my mind as well.. DHCP. Where do we need SRV records (for phones) if we are using DHCP?
DNS and DHCP are completely separate. The SRV records are used for locating the server to REGISTER to, while the DHCP 43 options are used initially to locate the Lync Certificate Provisioning web service.
Are you sure?
I thought the "SIP server" option is for the pool and "Web Server" is for the certificate (as web services). This could be case when you have DNS load balancing in use and HLB for the HTTPS traffic. Then you cannot have the same FQDNs for pool and web services.
And actually, when I'm looking for the client logs. In the logs the device says "pool.internal.com" is not trusted while SRV records pointing to the sip.company.com. So it must use the "SIP Server" for finding the Lync infrastructure.
And I believe this is a hidden problem. If you have older firmware builds the devices are still able to sign in because they have the certificate already downloaded. Perhaps there are no problems when the certificate expires either, because an already authenticated device can still download the new one.
This can be tested by: removing the certificates from server and then reboot the device using * & #. After that my device was not able to sign-in anymore.
DHCP Option 120 is only used during PIN Authentication to locate the SIP registrar. When signing in with user credentials via USB the phone only uses SRV/A records for SIP server discovery based on the SIP domain provided in the user's SIP address.
Ok, currently my tests look like if option 120 does not follow this new restriction rule then your phone devices are not able to sign in. The device says in the log: "…not trusted…". But I will see once I have changed the DNS, DHCP and certificates so that the new value for 120 can be pool.company.com.
If that is the case, there are some errors in Microsoft's guidelines for DHCPUtil: the SIP server value is supposed to be the SIP server's FQDN, but this seems to be incorrect, especially if you have multiple domains in use or you are using Standard Edition. Its value must be tied to the SIP domain you are using for your phones.
But let's see…
Ok, tested: 120 is not only for the PIN Authentication. Contrary to what Microsoft says about DHCPUtil, you must NOT use the SIP server's FQDN in option 120. You have to use either "sip.Domain.com" or "pool.Domain.com" (or basically: whatever.Domain.com), assuming your phone devices are using SIP addresses like "whatever@Domain.com". Otherwise the Lync device will mark the pool/server as "not trusted" and then it tries to find SRV records and other methods to get the device up.
If you are using Enterprise Pool you must remember to separate the Web Service traffic via the HLB (-webserver parameter on DHCPUtil).
Can someone say why CU07 description says nothing about this??? http://support.microsoft.com/kb/2819315
As I've always adhered to the strict naming practices I've never run into this. Good to know the additional behavior.
What if the DHCP offers pool01.internaldomain.com and SRV records points to sip.company.com?
I'm not sure; I'd have to test that as I have a few different guesses as to what might happen in that scenario.
Hello Guys,
Anyone found a solution ?
I have that scenario:
AD DOMAIN: domain.local
SIP DOMAIN: domainA.com
SIP DOMAIN: domainB.com
DHCP offer 120 my sba local mysba.domain.local
The SRV _sipinternaltls._tcp.domainA.com => sip.domainA.com
sip.domainA.com = IP of mysba.domain.local
The certificate of SBA is:
SN: mysba.domain.local
SAN: sip.domainA.com
SAN: sip.domainB.com
But the LPE doesn't work.
Checking the phone log I see the error:
ERROR :: OUTGOING_TRANSACTION::OnRequestConnectionConnectComplete – connection failed error 80ee0065
And the SIPStack log:
The connection was closed before TLS negotiation completed. Did the remote peer accept our certificate?
I guess that the phone doesn't download the certificate chain.
LPE should download the certificate even in SBA scenarios (it will receive it from the main pool’s webticket service).
Anyone found a working solution for this yet ?
I use sip.companydomain.xx in DHCP option120 and SRV records.
sip.companydomain.xx have a CNAME record to lyncserver.internal.xx
I have also tried sip.companydomain.xx as an Arecord pointing to the ip address of lyncserver.internal.xx
sip.companydomain.xx is a SAN on the Lync server cert, but the Lync server FQDN (Standard Edition) is still lyncserver.internal.xx
any help ??
In re-working my Exchange Autodiscovery URLs and re-issuing the certs with an internal CA, I think I've discovered that my Lync Phone Edition devices aren't trusting my AD-published CA.
This page describes the certificate "discovery" process, but I don't see it happening in packet traces from my CX600.
http://technet.microsoft.com/en-us/library/gg3982…
I see an LDAP query like the following, but no queries after that point. Any tips on how to dig deeper?
User Datagram Protocol, Src Port: 49161 (49161), Dst Port: ldap (389)
Connectionless Lightweight Directory Access Protocol
LDAPMessage searchRequest(1) "<ROOT>" baseObject
messageID: 1
protocolOp: searchRequest (3)
searchRequest
baseObject:
scope: baseObject (0)
derefAliases: neverDerefAliases (0)
sizeLimit: 0
timeLimit: 1
typesOnly: False
Filter: (&(&(DnsDomain=xyz.net)(Host=WindowsCE))(NtVer=0x00000006))
attributes: 1 item
AttributeDescription: Netlogon
[Response In: 897]
This post is not valid any more. Lync Phone edition works perfectly well with different DNS namespaces, i.e. sign in address: username@domain.com and SRV record pointing to frontend.domain.local. Tested with Aastra and HP phones (May 2014)
Michael, Microsoft has not changed the behavior and does not plan to. What you are most likely seeing is that if you have any DNS Host fallback records (e.g. sip.domain.com) in the user's SIP domain then the phone will use this record even if the previously resolved SRV record does not pass the strict DNS testing.
This post is still valid. We had an issue this week with brand new Polycom CX600 phones on a fully patched Lync 2013 install.
One thing that confused us was that old Polycom CX600s that we had onsite would connect ok. They had the same firmware as the new ones so we couldn't work out what the issue was. Maybe if we had factory reset them they would have also failed.
Hey Jeff,
We have recently added a second SIP domain. Users with this domain receive "sign in canceled due to internal error" using PIN authentication and "Cannot sign in….." using tethering.
Hoping you might be able to clarify a few things for me to assist my troubleshooting:
1. When using PIN authentication with option 120, are SRV records used or does the phone just connect directly to the FQDN provided in option 120? If this FQDN is in the primary SIP domain's namespace is this a problem, and do I need a DHCP scope to cater for each?
2. Can the fallback record for sip.secondarydomain.com be a CNAME to sip.primarydomain.com?
3. Can you confirm if the device tries all connection methods regardless of whether an error is received during the process?
Any suggestions would be much appreciated.
Option 120 is only used to locate the Lync pool that the PIN Authentication request is sent to in order to issue a client certificate, so the SIP domain of the user attempting to sign in is irrelevant at this step. When LPE then attempts to register to the server the phone still reverts to the hardcoded SRV/A lookup behavior to find the correct pool for that user's specific SIP URI, even if it's the same server. Only a single DHCP configuration is needed/supported. The fallback record should be a Host (A) record pointing to the pool server IP address(es). I'm not sure if a CNAME will work, although I suspect it still may (LPE does follow its own rules). And yes, LPE will move past failed connections (unlike the Lync desktop client), so if the SRV record is resolved but the server connection attempt fails (or the strict naming check fails) it will move on to the A fallback records. This is what I explained in the last paragraph of the recent update.
Hey Jeff, thanks very much for the clarification it is very helpful to better understand DHCP vs DNS. I have been toying with the logs from the device and it seems to confirm that there is a DNS issue or mismatch with valid fallback as a SAN. Those phone logs are damn hard to read!! I will report back, once I find the solution.
Hey Jeff, so turns out it was a DNS issue relating to strict domain matching! I ended up writing a tool to read the logs from LPE devices. There is actually some really useful information in the log once you filter out all the rubbish. I'd love to get your feedback on the tool when you have a chance – http://www.lync.geek.nz/2014/09/lync-phone-editio…
Andrew, I'll definitely take a look at the tool when I get a chance. Anything is better than using readlog 🙂
Hey Jeff, I spotted a possible conflict and wanted to get some clarity. Under the Resiliency section here – http://blog.schertz.name/2012/03/troubleshooting-lync-phone-edition-issues/, you mention that option 120 should “be directed towards the SBA registrar via DHCP Option 120”. However, the way I read the above comment, option 120 is related only to PIN auth requests and thus should be pointed to the FE pool, as the SBA doesn't have the PIN Auth service?
Hi Jeff
Thank you for a fantastic blog! And all the help you offer.
I have a big problem with my first implementation of LPE phones.
I'm required to install both CX500 and CX600.
The domain is .local and the Lync SIP domain I use is .com
The installation is a Standard Edition server.
I have used the DHCPUtil tool on the DHCP server with the parameters DHCPUtil –WebServer Standard Lync server FQDN, so options 120 and 43 point to the FQDN of the Standard Edition server (.local). Is this correct? All the tests with emulate and Test-CsPhoneBootStrap work!
I have gone through all the settings according to your blog and can see in the IIS log that the web ticket is published for authentication. I even succeeded in downloading the latest update to the phone after I put the ucupdate SAN name in the internal Lync server certificate.
Still I have a sign-in issue on the USB-connected CX600 and with the Common Area Phone PIN code.
In the OCS logs I see: The connection was closed before TLS negotiation completed. Did the remote peer accept our certificate?
The time on the phone is correct in minutes but not hours. Should this have any effect?
But then I found Microsoft KB 2933146 and your blog regarding strict DNS naming.
I have today the following DNS
Lync server name .local -> IP address
Lync server name .com – > IP address
sip.domain.com -> IP address
_sipinternaltls._tcp.domain.com -> Lync server FQDN.com
So should I change the _sipinternaltls._tcp.domain.com record to point to sip.domain.com?
But is option 120 used for PIN login on Common Area Phones? Have I specified the SIP server and web server incorrectly when running DHCPUtil?
BR Daniel
Most likely the sign-in fails due to your domain name mismatch. You cannot point the SRV record (domain.com) to an A record in another domain (domain.local). The easiest way to resolve this is to publish the “sip.domain.com” record in DNS and on your FE server’s certificate and the phone should fall back to connecting with that record after the SRV record fails the domain match.