Windows Fabric Gone Wild!

Amanda Debler:

Does anyone know WHY Windows Fabric would generate logs like mad? Our Lync Front End servers have about 10 of these per DAY right now. I’ve not turned off the logging or changed it to circular as I want to have the logs on hand to send Microsoft if needed – instead, I’m moving them over to an empty volume with this script running as a scheduled task each night.

$source = "C:\ProgramData\Windows Fabric\Fabric\log\Traces"
$destination = "E:\Windows Fabric Traces"
# create the folder on the empty volume the first time the task runs
if (-not (Test-Path $destination)) {
	New-Item -ItemType Directory -Path $destination | Out-Null
}
foreach ($pattern in "fabric_traces_*", "lease_traces_*") {
	# skip the first two - one of them is the trace file in use, the other is the most recently full one
	Get-ChildItem $source | Where-Object { $_.Name -like $pattern } | Sort-Object LastWriteTime -Descending |
		Select-Object -Skip 2 | Move-Item -Destination $destination
}

However, this is only treating the symptoms, not the cause, so the search continues…
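
While the search continues, it helps to know how quickly the traces actually pile up. The following is a minimal sketch, not part of the scheduled task above: `Get-TraceVolumePerDay` is a made-up helper name, and all it does is group the trace files by the day they were last written and add up their sizes.

```powershell
# Hypothetical helper: count trace files and total size per day.
function Get-TraceVolumePerDay {
    param(
        [string] $Path = "C:\ProgramData\Windows Fabric\Fabric\log\Traces",
        [string] $Pattern = "fabric_traces_*"
    )
    Get-ChildItem -Path $Path |
        Where-Object { -not $_.PSIsContainer -and $_.Name -like $Pattern } |
        Group-Object { $_.LastWriteTime.ToString("yyyy-MM-dd") } |
        Sort-Object Name |
        ForEach-Object {
            # add up the file sizes of everything written on this day
            $bytes = ($_.Group | Measure-Object -Property Length -Sum).Sum
            New-Object PSObject -Property @{
                Day    = $_.Name
                Files  = $_.Count
                SizeMB = [math]::Round($bytes / 1MB, 1)
            }
        }
}
```

Pointing it at the archive folder, for example `Get-TraceVolumePerDay -Path "E:\Windows Fabric Traces" | Format-Table Day, Files, SizeMB`, gives a day-by-day picture that could be handed to Microsoft along with the logs.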

Originally posted on Thoughts From a Bot Named Flinch:

How’s that for the title of a blog article! Apparently I’ve been reading too much Huffington Post or something. For the record, I never read that website. I have standards, as low as they may be.

So back to the title and the point of this post. Are there actually hidden log files that could cause some unintended problems with your Lync 2013 environment? Absolutely. I am assuming you are already aware that IIS logs could fill up your local hard drive. It is also a good idea to keep an eye on the trace files created by OCS Logger and Snooper.

However, there are some hidden logfiles that are created by Windows Fabric that could very much fill up your hard drive and it would be a decent challenge to find them. If you are unaware, Lync 2013 sits on top of a technology called Windows Fabric. For a…

Lync Phone Edition PIN Authentication and Cisco ACE Load Balancer – It’s About the Certificate Chain Group

This is truly the article I would have LOVED to have found when we first got the DHCP settings in place for our Lync Phone Edition (LPE) devices and other Lync phones, and were going crazy trying to figure out why the LPE devices were fine right after being tethered to a PC, but failed once someone had logged out of them while untethered and rebooted them. It is also the article I sort of promised to write when I was raving about a certain switch.

The symptoms: Test-CsPhoneBootstrap works flawlessly. Other Lync phones can log on with extension and PIN. Your Lync Phone Edition device (in our case, the Polycom CX3000) will cheerfully log on with the extension and PIN if you’ve logged it in tethered to a PC via the PC’s Lync client first, but gives you “An account matching this phone number cannot be found. Please contact your support team” after a very quick flash of another error, “Account used is not authorized, please contact your support team” for the very same extension and PIN if you’ve logged out of the device and powered it down. I did what another admin did, taking a video on my phone, then replaying it really slowly – the time from entering the PIN to getting the final failure message was less than 4 seconds, and that was necessary to see the first failure message that briefly flashed on the screen.

Some WireSharking showed that when the device was connected to the PC, or had just been, it received two certificates during the handshake before certificate provisioning: the server certificate for the Lync pool and the certificate of the intermediate CA that issued it. When it was disconnected, logged out and restarted, it received only the server certificate, and the failure message appeared on the device’s screen seconds later. (Hat tip and a virtual case of tasty Bavarian beer to Drago Totev of Unify Square for pointing this out!)

Apparently, LPE is not quite clever enough to find the intermediate CA certificate itself, though it gets the root CA certificate from Active Directory. By contrast, the AudioCodes and snom phones we were testing at the same time do fine once they get DHCP Option 43 and 120 when they expect to: during the “DHCP Inform” round, after they’ve gotten their IP addresses. Either they are a bit more resourceful than the LPE devices, or a bit more (too?) trusting… will let you know if I ever find out.

The LPE devices got server+intermediate cert during the handshake when we changed the pool’s DNS entry to point directly at the IP of one of the Front End servers instead of the IP on the Cisco ACE HLB (thanks again for that suggestion, Drago), and LPE phones were happily logging in with extension and PIN even after a hard factory reset (4 and 6 held down while rebooting – this WILL scrub your latest firmware update!). Once we set the DNS entry for our Lync pool back to the IP address on the HLB, we got the same login failures as before. Note: just changing the DHCP template option 43 to use one of the Lync Front End server names instead of the pool name will NOT work – the certificate provisioning service uses the pool name (I tried that).

So now it was clear: somehow, we had to get the Cisco HLB to give out the intermediate certificate along with the server certificate when making the handshake with the LPE devices. I tried exporting the whole chain as P7B (no private key) and PFX (with private key), then using OpenSSL to make the PEM files the Cisco HLB knew how to consume. The Cisco HLB recognized the certificates, but was still only giving out the server certificate. For reasons that are obvious, Cisco is not falling all over themselves to tell you how to make Lync work with one of their older HLBs ;)

The answer: Certificate Chain Group associated with the interface instead of just the server certificate.

Since I’m not going to pretend that I know anything about Cisco stuff, here’s a link to the manual, and you can get your network guy (or gal) to figure out what it means for you and your certs: http://www.cisco.com/c/en/us/td/docs/interfaces_modules/services_modules/ace/vA5_1_0/command/reference/ACE_cr/chaingrp.html

With that now in place, WireShark showed that the LPE devices are receiving two copies of the server certificate, along with one each of the intermediate CA cert and the root CA cert. I have a feeling that the Cisco ACE is passing along both the chain group and the server certificate, and that we probably could have gotten away with just having the server cert and the intermediate CA cert in the chain group, but hey, it works, and I don’t really feel like hassling my network guy any more about this!
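
If you would rather not mirror a port just to see what the load balancer hands out, the same check can be sketched from a PowerShell prompt. This assumes `openssl.exe` is on the PATH, and `lyncpool.example.com` is a placeholder for your own pool name. With `-showcerts`, `openssl s_client` prints one PEM block per certificate the server actually sent, so counting those blocks tells you whether the chain is being passed along. `Get-PemCertificateCount` is a hypothetical helper name.

```powershell
# Hypothetical helper: count the PEM certificate blocks in a piece of text,
# e.g. the output of "openssl s_client -showcerts".
function Get-PemCertificateCount {
    param([string] $Text)
    return [regex]::Matches($Text, "-----BEGIN CERTIFICATE-----").Count
}

# Usage sketch (assumptions: openssl.exe on the PATH, placeholder pool name):
# $output = $null | & openssl s_client -connect "lyncpool.example.com:443" -showcerts 2>$null | Out-String
# Get-PemCertificateCount -Text $output
```

A count of 1 would match the broken state described above (server certificate only); anything higher means at least part of the chain is being sent along with it.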

Even though we have DNS load balancing in place for SIP traffic, the configuration of the HLB is still critical for phone authentication, because the certificate provisioning service is a web service. If we did not have the intermediate CA (totally NOT recommended – your root CA should be turned off, disconnected, locked up somewhere and only taken out to make new intermediate CAs that actually do your issuing), we most likely would not have had this issue.

If you are having issues like this and are using a Cisco ACE for load-balancing, first, check your configuration against the details in this post by Andrew Travis. Then, see about a certificate chain group instead of just the server certificate.

“WireShark for Clever Lync Admins” (not Dummies) will follow some other time, when I’m in a screenshots mood.

Comparing Lync Policies – or How to Flip Just About Any Array of Hashtables in PowerShell

If you are reading this blog and can read German, I don’t need to tell you about msxfaq.de, former Exchange and now Lync MVP Frank Carius’ online (but not very alphabetical) encyclopedia of Exchange and Lync – it probably gets more page views in a day than this blog ever has. Even if you cannot read German, you have still probably run into it when searching for Exchange or Lync topics and then seriously wished you could read German – machine translation only goes so far.

Anyhow, one of the most helpful things he’s put out there and that I use all the time is a Swap-Table script. I wasn’t able to turn it up with “flip table in PowerShell” or “pivot PowerShell table” or any of several variations, so this is a little attempt to make that wonderful file findable for the English-speaking world. Scroll to the bottom and look for the “Code” section. You can make it a function in your PowerShell profile by putting the contents of that text file inside the curly braces {} of the following (code not posted here because plagiarism is evil):

function Swap-Table {
# contents of swap-table.1.0.ps1 go here

}

It has been particularly useful for comparing ClientPolicies and ConferencingPolicies in Lync, as ClientPolicy has over 70 attributes! Once you have the function in your session and you’re connected to Lync Management Shell, it works like this:

Get-CsClientPolicy | Swap-Table | Out-GridView
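
To show what “flipping” means without reproducing Frank Carius’ script, here is a minimal sketch of the idea. It is NOT the Swap-Table code: `ConvertTo-FlippedTable` is a made-up name, it expects objects (not raw hashtables), and it assumes every object has a unique, non-empty value in the key property, which defaults to `Identity` because that is what Lync policies carry.

```powershell
# Hypothetical helper (not the Swap-Table script): one output row per property,
# one column per input object.
function ConvertTo-FlippedTable {
    param(
        [Parameter(Mandatory = $true)] [object[]] $InputObject,
        # property whose value becomes the column heading for each object
        [string] $KeyProperty = "Identity"
    )
    # collect all property names in first-seen order
    $names = New-Object System.Collections.ArrayList
    foreach ($item in $InputObject) {
        foreach ($p in $item.PSObject.Properties) {
            if ($p.Name -ne $KeyProperty -and -not $names.Contains($p.Name)) {
                [void]$names.Add($p.Name)
            }
        }
    }
    foreach ($name in $names) {
        $row = New-Object PSObject
        $row | Add-Member -MemberType NoteProperty -Name "Property" -Value $name
        foreach ($item in $InputObject) {
            $prop = $item.PSObject.Properties[$name]
            $value = if ($prop) { $prop.Value } else { $null }
            # duplicate or empty key values would make Add-Member fail here
            $row | Add-Member -MemberType NoteProperty -Name ([string]$item.$KeyProperty) -Value $value
        }
        $row
    }
}
```

Used as `ConvertTo-FlippedTable -InputObject (Get-CsClientPolicy) | Out-GridView`, each policy becomes a column and each attribute a row, which makes differences between policies much easier to spot.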

Deploying Deskphones for Lync? You Want This Switch!

Test-CsPhoneBootstrap said that we were doing the right thing. Jeff Schertz’s guide to configuring Lync for Lync Phone Edition devices said that we were doing the right thing. Elan Shudnow’s post about Cisco switches and PIN authentication said we were doing the right thing.

But our Lync Phone Edition devices just were NOT authenticating.

One of my network guys mirrored one of the wall ports for me, and I alternated between the happily-authenticating AudioCodes 420HD and the stubborn Polycom CX3000, capturing WireShark traces I could barely read (I’ve since gotten to know the handshake process those LPE devices need way better than I ever wanted to). But what a pain.

msxfaq.de, long-time Exchange and now Lync MVP Frank Carius’ mostly-German variety shop of Lync and Exchange experience to the rescue – most specifically, his page on port mirroring. He recommends the NetGear ProSafe Plus series, the least expensive of which is the 5-port, non-PoE (Power over Ethernet) version, the GS105E. If you can read German, he explains several other options, along with exploring how you might connect to it without using Adobe AIR (and Windows) and some possible security implications of it having a default, hardcoded password to a web interface (theoretically, someone could break in and set up mirroring). If you can’t read German, it’s still good for screenshots of how to set up the port mirroring on the ProSafe Plus switches.

I, on the other hand, found the GS108PE, with 4 PoE ports and 4 regular ones to be the right balance between cost and convenience. This means up to four phones plugged in at once, and without power adapters. The non-PoE (and less expensive) versions will require you to use the phones’ power adapters. Later, if you want VLANs, VLANs you can have.

Port Mirroring made easy – too bad about the Adobe AIR interface…

As for the secret web interface that is a vulnerability on the GS105E, it does not appear to be present on the GS108PEv2 I have. Also, the management software does not spot the switch when I search from a computer on another subnet. I repeated Frank Carius’ experiment with the firmware for the GS108PEv2, version 1.00.12, downloaded from NetGear support, and did not turn up any hard-coded credentials. Out of curiosity, I did a quick skim of the GS105PE’s latest firmware (1.3.0.1, dated June 2014), and did not spot any obvious hard-coded credentials, but did see a lot of HTML and JavaScript, hinting at a web interface…

Because of this possible security issue with the hard-coded usernames and passwords, I recommend the GS108PE or GS105PE instead of the GS105E.

Make sure that you don’t get one of the non-“Plus” versions – they’re somewhat less expensive, but don’t have the mirroring available. I made this mistake, initially getting the GS108P.

So, if the model number ends in P, it does not have the smarts required to configure mirroring; if it ends in just E, it doesn’t have PoE (and might contain some risky firmware), and if it ends in PE, it does it all and will make your phone evaluation and troubleshooting easier.
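
The naming rule is simple enough to write down as a sketch. `Get-ProSafeCapability` is a hypothetical helper that only encodes the suffix rule from this post for the models mentioned here; it is not based on any NetGear documentation, so check the data sheet before ordering.

```powershell
# Hypothetical helper: apply the "P / E / PE" suffix rule from this post.
function Get-ProSafeCapability {
    param([Parameter(Mandatory = $true)] [string] $Model)
    # drop a hardware revision such as "v2" before looking at the suffix
    $base = $Model.ToUpper() -replace "V\d+$", ""
    New-Object PSObject -Property @{
        Model     = $Model
        PoE       = ($base -match "PE?$")   # suffix P or PE: Power over Ethernet
        Mirroring = $base.EndsWith("E")     # suffix E or PE: "Plus" features, incl. mirroring
    }
}
```

For example, `"GS105E", "GS108P", "GS108PEv2" | ForEach-Object { Get-ProSafeCapability -Model $_ }` lists which of the three would work for phone troubleshooting without power adapters.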

As for WireShark, DHCP, certificates, certificate chains, comparing multiple phones at once and how we finally got Lync Phone Edition to work right with our ancient Cisco ACE load balancers, that’s another post. Or three.

PowerShell Summit Europe 2014 – Registration Ends September 10!

PowerShell, in my not quite humble opinion, is the best thing Microsoft has done in the past decade, with the second best being Lync. It is mere coincidence that my professional life currently revolves around both.

So, if you are a cheap flight away from Amsterdam, you have until September 10 to join several PowerShell legends like Jeffrey Snover, Don Jones, Richard Siddaway, Tobias Weltner and Steve Murawski, as well as about 60 regular PowerShell enthusiasts like me for three intense days at a price far lower than a regular Microsoft course at some random training center.

The summit starts September 29 (Monday) and goes through October 1, with most people arriving (and hanging out) Sunday evening.

Registration and more summit info: http://powershell.org/wp/community-events/summit/

 

August 2014 Lync Server 2013 CU Solves Address Book Delta Issue… Eventually.

The Address Book delta issue that we and so many other large Lync environments have experienced is resolved by the Lync Server 2013 Cumulative update for August 2014 – about 30 days after the update is installed. So, patience, Grasshoppers.

What should have been the early July 2014 Lync Server 2013 Cumulative Update became the mid August 2014 update, but we’re not complaining – it beats the early July update followed closely by the August update to fix the damage done by the July update!

Installing the Cumulative Update – and Waiting

First, a warning passed on to me by the MS Support engineer who handled our original incident: after you connect to each of your Lync Front End servers and before you install this update, make sure that Event Viewer and Performance Monitor are not open. To be safe, close all MMC windows, including the “Server Manager” one that comes up by default when you log on to Windows Server 2008 or 2012. If you’ve read the release notes (which you should have!), you’d have seen this, but I’m stating it again, as the MS Support engineer thought it important enough to contact me directly.
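
One way to double-check before starting the installer is to look for the processes behind those consoles. This is only a sketch, and the process names are an assumption: Event Viewer and Performance Monitor normally run inside mmc.exe, and Server Manager on Windows Server 2012 runs as ServerManager.exe. `Get-OpenConsoleProcess` is a made-up helper name.

```powershell
# Hypothetical pre-flight check: list console processes that should be closed
# before installing the Cumulative Update. The default names are assumptions.
function Get-OpenConsoleProcess {
    param([string[]] $Name = @("mmc", "ServerManager", "perfmon"))
    Get-Process -Name $Name -ErrorAction SilentlyContinue
}

$open = @(Get-OpenConsoleProcess)
if ($open.Count -gt 0) {
    Write-Warning "Close these before installing the update:"
    $open | Select-Object Name, Id, SessionId
} else {
    Write-Output "No MMC, Server Manager or PerfMon processes found."
}
```

Other admins logged on to the same server may have consoles open in their own sessions, which is why the session ID is part of the output.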

Our installation of the update went smoothly, taking less than half an hour per server, including the time to stop all Lync services before and for them to start again after. We only had one server out at a time; the pool was still running.

Now, back to the whole “eventually” part. You will not see a “good” delta the first day you run an address book update, no matter how many times you enter “Update-CsAddressBook”.

Lync Server 2013 Address Book Deltas Not Being Produced – but a Fix is Coming!

UPDATE: The Cumulative Update is here, we installed it, it worked – read all about it!

We’re starting to deploy Enterprise Voice (Lync as a telephone replacement), and it’s been going reasonably well, except for one little thing that’s become a huge headache: Lync’s address book was not getting updated.

 

First attempted solution: Change Address Book full download frequency (“KeepDuration”) from 30 days to 1 day in Set-CsAddressBookConfiguration

Result: No change
Why: It would take the clients at least 30 days before they would start downloading full copies. In our case, that was probably a good thing (see “Third attempted solution”). Greig Sheridan gives a more detailed explanation of why this totally won’t work and some other things you shouldn’t bother trying.
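
For reference, this is roughly what that first attempt looks like in the Lync Server Management Shell. It is a sketch that assumes the settings live in the Global address book configuration; adjust the identity if yours are set per site.

```powershell
# Show the current address book settings
Get-CsAddressBookConfiguration |
    Select-Object Identity, KeepDuration, MaxDeltaFileSizePercentage, RunTimeOfDay

# The first attempted solution: KeepDuration from 30 days down to 1 day
# (assumption: the Global configuration is the one in use)
Set-CsAddressBookConfiguration -Identity Global -KeepDuration 1
```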

 

Second attempted solution: Switch AddressBookAvailability in Client Policy for everyone to “WebSearchOnly”

Result: Worked like a champ for Lync 2010 and Lync 2013 clients, no change for Office Communicator 2007 R2 clients (the vast majority of non-Enterprise Voice users)
Why: Office Communicator 2007 R2 doesn’t have the ability to do Address Book Web Queries (ABWQ). This is incorrectly stated in the main TechNet article about new features in Lync Server 2010, but correct in a similar table in this TechNet Magazine article. Grrr…

Additionally, Lync Server ClientPolicy settings do not have any effect on Office Communicator 2007 R2 clients. Those changes have to be made in the client’s Registry or via Group Policy. Since Office Communicator 2007 R2 cannot do Address Book Web Queries, there is no such Registry entry present or possible.
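
A sketch of the second attempt in the Lync Server Management Shell, assuming you really do want the change on every client policy rather than on one test policy first:

```powershell
# Show what each client policy currently uses
Get-CsClientPolicy | Select-Object Identity, AddressBookAvailability

# Switch every client policy to web search only
# (Lync 2010 and 2013 clients only; Office Communicator 2007 R2 ignores it)
Get-CsClientPolicy | Set-CsClientPolicy -AddressBookAvailability WebSearchOnly
```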

 

Third attempted solution (DO NOT DO THIS): Raise the MaxDeltaFileSizePercentage from the default 20 to 100.

Result: This will produce a delta file and keep it available no matter how large it is. Sites around the world that do not have superb MPLS (WAN) connections had to block all traffic from the Lync pool on port 443 until we removed the deltas.
Why: all the OCS (and Lync) clients started downloading the deltas as soon as the users logged into them.
Why we thought this might be a good idea: Listed as a solution in the TechNet Forums thread LYNC 2013 generating Full Address book files but not the delta ones
Why it might possibly be a good idea for you: None of your sites are bandwidth-constrained, or all of your users are on their own connections
Why it is most likely not a good idea for you: The delta files can be twice the size of the “full” address book. Multiply that by the number of users at each site, and decide if that is a bandwidth hit you can take in the hour after everyone gets to work.
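
That multiplication is worth doing before touching the setting. A back-of-the-envelope sketch follows; `Get-DeltaDownloadLoad` is a made-up helper, and the 40 MB delta and 500 users in the usage note are invented numbers for illustration, not measurements from our environment.

```powershell
# Hypothetical helper: rough bandwidth hit if every user at a site downloads
# the delta within the same time window.
function Get-DeltaDownloadLoad {
    param(
        [double] $DeltaSizeMB,
        [int] $Users,
        [double] $WindowMinutes = 60
    )
    $totalMB = $DeltaSizeMB * $Users
    # megabytes to megabits (x 8), spread evenly over the window in seconds
    $mbitPerSecond = ($totalMB * 8) / ($WindowMinutes * 60)
    New-Object PSObject -Property @{
        TotalGB              = [math]::Round($totalMB / 1024, 2)
        AverageMbitPerSecond = [math]::Round($mbitPerSecond, 1)
    }
}
```

`Get-DeltaDownloadLoad -DeltaSizeMB 40 -Users 500` works out to about 19.53 GB in total, or roughly 44.4 Mbit/s sustained for a full hour. Real logons are burstier than an even spread, so treat the average as a floor.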

 

Fourth attempted solution: Made use of one of the Software Assurance (SA) support incidents we got as a Volume Licensing customer

Result: It is a known error in the Lync Server software itself, and a fix is expected in the next Cumulative Update, expected later in June or in July 2014. Hooray! We even get the incident refunded, because Microsoft agrees that it was a problem with the product itself. Keep an eye on the Lync Cumulative Updates.

After loads of testing on both our side and theirs, we figured out that ABServer tries each day to create deltas between the newest full address book (F-xxxx.lsabs) and each of the previous 29 full address book files. It writes each delta to a .lsabs.tmp file, but when the delta exceeds what MaxDeltaFileSizePercentage allows, it stops writing the temporary file and deletes it. For us, the magic “keep” threshold appears to be 52; the setting makes no difference to the size of the delta file itself, though.

There are various reports around the web of this being a problem in environments where there are more than 50,000 address book entries – environments like ours. Using ABServer -DumpFile to crack open the C-xxxx-xxxx.lsabs files created (and retained) when we had the MaxDeltaFileSizePercentage set above 52, we discovered that the first part of the file was a complete copy of the address book, while the last part was a long list of “deleted” entries. The GUIDs on the “deleted” entries list matched GUIDs in the “new” entries. The Microsoft Support engineer I’ve been working with described this as a problem comparing entries for large address books.
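
A quick way to see whether any deltas are being kept at all is to count the files by prefix. This is a sketch under assumptions: `Get-AddressBookFileSummary` is a made-up helper, the folder you pass in is wherever your address book files live on the file share, and the only prefixes it knows about are the ones mentioned here (F- for full files, C- for the deltas we looked at).

```powershell
# Hypothetical helper: count .lsabs files and their total size per name prefix.
function Get-AddressBookFileSummary {
    param([Parameter(Mandatory = $true)] [string] $Path)
    Get-ChildItem -Path $Path |
        Where-Object { -not $_.PSIsContainer -and $_.Name -like "*.lsabs" } |
        Group-Object { $_.Name.Substring(0, 1).ToUpper() } |
        ForEach-Object {
            $bytes = ($_.Group | Measure-Object -Property Length -Sum).Sum
            New-Object PSObject -Property @{
                Prefix  = $_.Name
                Files   = $_.Count
                TotalMB = [math]::Round($bytes / 1MB, 2)
            }
        }
}
```

If the summary only ever shows an F row, no deltas are surviving; temporary .lsabs.tmp files are deliberately left out of the count.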

 

Morals of the story

  • The testing environment has to mirror your production one in data size and scope. This is a problem that does not exist in smaller environments, and, according to Microsoft Support, doesn’t affect Lync Server Standard Edition.
  • Whenever you change anything about Lync’s configuration, think about bandwidth.
  • Don’t believe everything you read on the Internet. Even on TechNet’s official product pages…
  • If you are a Volume Licensing customer (which you probably are if you’re running Lync Server on premises), make use of your Software Assurance incidents. If you don’t know whether you have any, get the person who signs your Volume Licensing agreement to sign into the Volume Licensing Center and grant someone in Operations the right to manage contacts for Software Assurance. Don’t let those incidents go to waste – your company has already paid for them!
  • Just because something is a “known defect,” it doesn’t mean that we have to live with it forever or until the next major version. Be persistent; Microsoft listens sometimes :)