If unified communications (UC) troubleshooting were easy, everyone would be successful at it. But with the right tools and training, you can effectively combat your unified communications issues. This blog identifies where you should start, walks through a simple step-by-step model for all UC problem solving, and finishes with a few more advanced troubleshooting tips.
Where to Start Troubleshooting Unified Communications
- Check your infrastructure.
- Check the call history logs.
- Check real-time stats on calls (if available).
- Check the network, especially QoS settings.
- Check endpoint firmware and registration.
By covering the above bases you've likely identified the haystack to look into, but now let's look at how to find the needle in that haystack.
A Model for Unified Communications Problem Solving
This simple, yet effective, model isn't rocket science. It is easy for anyone to follow to ensure they are troubleshooting in a logical and methodical fashion.
1. Inspect the problem and draft a succinct problem statement. Note symptoms and likely root causes.
2. Collect the data points and logs required to help isolate potential causes.
3. Analyze possible root causes based on the data points and facts you collected.
4. Design an action plan based on the causes. Start with the most probable cause and create a plan where you can test one variable at a time.
5. Deploy the action plan; implement each step carefully while testing to see whether the symptom goes away.
6. Scrutinize the results to determine whether the problem has been resolved. If it has, the process is finished.
7. If the problem is not resolved, devise an action plan based on the next most likely cause on your list. Return to step 4 and repeat the process until the problem is solved.
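The loop at the heart of this model (plan, deploy, scrutinize, move to the next cause) can be sketched in a few lines of Python. This is only an illustration, not part of any product: the `Hypothesis` class, its `probability` estimate and the `symptom_present` callback are hypothetical names invented for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Hypothesis:
    """One candidate root cause with a single-variable fix to try (hypothetical)."""
    cause: str
    probability: float             # rough estimate from the analysis step
    apply_fix: Callable[[], None]  # should change exactly one variable


def troubleshoot(hypotheses: List[Hypothesis],
                 symptom_present: Callable[[], bool]) -> Optional[str]:
    """Try causes from most to least probable; return the cause whose fix cleared the symptom."""
    ranked = sorted(hypotheses, key=lambda h: h.probability, reverse=True)
    for hypothesis in ranked:
        hypothesis.apply_fix()       # deploy the action plan
        if not symptom_present():    # scrutinize the results
            return hypothesis.cause
    return None                      # list exhausted: go back and collect more data
```

Because each fix changes only one variable, the cause returned is the one whose change made the symptom disappear. A return value of `None` means the list of causes needs revisiting with fresh data.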
Getting into Advanced Unified Communications Troubleshooting Territory
So, what should you be on the lookout for when troubleshooting UC? There are three things that you should always be thinking about:
Servers Running Hot
This might sound trivial, but overutilization of UC devices can cause performance issues in your application. Check infrastructure metrics such as CPU, memory and services (if applicable) for your monitored devices. Setting up alerts that notify you via email or SNMP trap (or directly in your ticketing application) when infrastructure metrics are about to reach critical levels is key to monitoring your environment proactively.
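As a rough illustration of alerting before a metric hits critical, the sketch below compares utilization readings against a warning and a critical threshold. The threshold values, the metric names and the `evaluate` function are assumptions made for the example; a real deployment would take its thresholds from your monitoring tool and hand the resulting messages to email, SNMP or a ticketing integration.

```python
from typing import Dict, List

# Hypothetical thresholds in percent; tune these for your own environment.
WARNING = {"cpu": 80.0, "memory": 85.0}
CRITICAL = {"cpu": 95.0, "memory": 95.0}


def evaluate(device: str, metrics: Dict[str, float]) -> List[str]:
    """Return one alert message per metric that crossed a threshold."""
    alerts = []
    for name, value in metrics.items():
        if name not in WARNING:
            continue  # no threshold defined for this metric
        if value >= CRITICAL[name]:
            alerts.append(f"CRITICAL {device} {name}={value:.0f}%")
        elif value >= WARNING[name]:
            alerts.append(f"WARNING {device} {name}={value:.0f}%")
    return alerts


# The warning tier is what gives you time to act before users notice.
print(evaluate("cm-01", {"cpu": 88.0, "memory": 40.0}))
```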
All major vendors, from Avaya, Cisco and Microsoft to SBC vendors like Oracle, AudioCodes and Sonus, allow you to report on voice quality metrics. Being able to search and review troublesome calls historically is critical for finding the root cause of voice quality issues. Searching by user or extension in a performance management tool like Prognosis allows you to view the calls that user or extension made and diagnose them by looking at degradation factors like packet loss, jitter and latency. If you have an SBC in the environment, you can also view the call path from end to end with VQ360, where call quality data from the UC systems and the SBC is stitched together.
Some vendors also support streaming live voice quality metrics for calls, which is useful for seeing real-time performance statistics.
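To make the idea of searching calls by extension and checking degradation factors concrete, here is a minimal sketch. It does not use the Prognosis API; the `CallRecord` fields and the `LIMITS` values are assumptions for illustration (the limits are commonly quoted planning targets, not vendor-specified thresholds).

```python
from dataclasses import dataclass
from typing import List, Tuple

# Illustrative limits only; adjust to your codec and network design.
LIMITS = {"packet_loss_pct": 1.0, "jitter_ms": 30.0, "latency_ms": 150.0}


@dataclass
class CallRecord:
    """Hypothetical per-call quality record exported from a monitoring tool."""
    extension: str
    packet_loss_pct: float
    jitter_ms: float
    latency_ms: float


def degraded_calls(calls: List[CallRecord],
                   extension: str) -> List[Tuple[CallRecord, List[str]]]:
    """Return this extension's calls that breached a limit, with the factors breached."""
    results = []
    for call in calls:
        if call.extension != extension:
            continue
        breached = [name for name, limit in LIMITS.items()
                    if getattr(call, name) > limit]
        if breached:
            results.append((call, breached))
    return results
```

Listing the breached factors per call helps separate, for example, a congestion problem (loss and jitter together) from a long network path (latency alone).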
Use a network troubleshooting feature set like IR Path Insight:
a. If you're using Avaya or Skype for Business, you can view the network hops that a call has taken. The Network Hop visual diagram in a tool like Prognosis can show the latency between each hop. You can click on the routers or switches in the diagram to get a plain-English root cause analysis of that network interface. For Cisco, even though network hop data is not reported by the vendor, you can still follow the network troubleshooting link in the Prognosis Web UI and review problematic interfaces, which are checked on an ongoing basis.
b. If you're having voice quality issues, it is always a good idea to check whether QoS (DSCP) is enabled on your network. This indicates whether voice is being prioritized correctly over other traffic. Path Insight comes with a built-in synthetic call simulator that allows you to run a test to indicate exactly which router or switch is stripping QoS markings or is not configured for QoS.
The Prognosis UC Assessor network assessment tool can also check whether your environment is ready for voice or video by running synthetic transactions in the environment. These assessments can also detect whether QoS markings are enabled on your network even before you implement your UC application.
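The logic behind locating the device that strips QoS can be sketched as follows. This is not how Path Insight is implemented; it simply assumes you already have the DSCP value observed on test packets as they leave each hop, and the hop names and the `first_remarking_hop` function are hypothetical. The one fixed fact is that voice media is conventionally marked DSCP 46 (Expedited Forwarding), and that DSCP occupies the upper six bits of the IP ToS/DS byte.

```python
from typing import List, Optional, Tuple

EF = 46  # DSCP Expedited Forwarding, the marking conventionally used for voice media


def dscp_from_tos(tos_byte: int) -> int:
    """DSCP is the upper six bits of the IP ToS/DS byte."""
    return tos_byte >> 2


def first_remarking_hop(hops: List[Tuple[str, int]],
                        expected: int = EF) -> Optional[str]:
    """hops: ordered (hop name, DSCP observed on packets leaving that hop).

    Returns the first hop whose outgoing marking differs from the expected
    value, or None if the marking survives the whole path.
    """
    for name, dscp in hops:
        if dscp != expected:
            return name
    return None


# Hypothetical path: the marking is lost at the WAN router.
path = [("access-sw1", 46), ("core-rtr1", 46), ("wan-rtr2", 0), ("branch-sw3", 0)]
print(first_remarking_hop(path))
```

Only the first hop that changes the marking is reported, because every hop after it simply forwards the already-stripped value.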
Last but certainly not least, here are a few tips that I find can be applied elsewhere too…
Cross off the basics
Firmware versions and updates are usually ignored. However, they can affect voice quality and performance. Cross off checking the firmware and updates of the phones and devices you are using. This is the easiest base to cover when you are troubleshooting.
Have you ever been reviewing analytics or pulling reports when something completely off topic caught your attention from the corner of your eye? My advice: stay curious! Just like those glasses you may have misplaced, sometimes the issue is staring right at you, sitting in the most obvious place that you haven't yet checked.
Just because a UC or Contact Center system is up does not mean it can pass calls through. For peace of mind, it is always good to run regular outside-in testing to ensure your systems can be reached by customers or external parties calling in. This could cover your IVRs or your external-facing numbers. A regular test call from the outside can confirm that your UC or Contact Center system is ready to pass the call through to the relevant individuals. IR offers a service like this called Heartbeat (or WebBeat for web traffic).
If your business has heavy call volumes during certain seasons, it is also a good idea to run stress or load testing on your environment before that event. This ensures your environment can handle the expected call volume from a busy period such as Black Friday.