Understanding Lync Video Quality Reports
In any Lync Server environment, be it in a test lab or a production deployment, the Monitoring Server role and its accompanying reports should be considered a key component. With improvements in Lync 2013 this role is even easier to deploy than in past versions, providing for a wide array of data captured by endpoints during various communications modalities. Leveraging the User Activity Reports is one of the best ways to review this detailed information to then make informed decisions on how to plan for bandwidth for video conferencing.
This article does not cover the actual deployment and configuration of these components. The Monitoring Server is one of the easier components of Lync to deploy and is covered in the Deploying Monitoring section of the TechNet documentation. What this article does cover, however, is where to locate and how to read Media Quality Reports for both peer-to-peer and multiparty conference calls, focusing specifically on the video payload information.
The concepts explained here will be important to understand for an upcoming article which will explore in greater detail the various capabilities of H.264 SVC as utilized by Lync Server 2013. This new codec provides for a much larger range of resolutions than earlier codecs like Real-Time Video so extrapolating this information from the reports can be a little more complicated than in the past.
Instead of just randomly looking at reports for various past sessions it would be better to create new calls in which the activity is deliberate and known. It can be very difficult to reverse-engineer the reports to make assumptions on what the call scenario might have been, so as a learning exercise it would be prudent to work forwards instead of backwards.
The example scenario in this article will be a Lync video call placed between two Lync 2013 clients running on different Windows 8 workstations of roughly the same capabilities. To keep this simple each system is capable of encoding and sending up to 720p HD video when using SVC and each display is set to a resolution of at least 1920 pixels wide.
Test Video Calls
To provide a few different call logs to compare, make a pair of video calls between two endpoints in the following scenarios. Leave each call up for at least 5 minutes to ensure that call quality details are recorded and that average bitrate numbers will show some measurable difference from peak numbers. Also make sure not to manipulate the window size on either client outside of the instructions, as this may pollute the data and the results will not match what is explained in this article.
Throughout these test calls it is important not to resize the video window before ending the call; otherwise the video resolution recorded in the reports will not reflect the intended result. This concept will be explained later, so for now simply end the calls unobtrusively.
- Start a video call between the two workstations and leave the video window at the default, minimum size on both clients.
The image on the left is from a Lenovo T410 connected to a 24” monitor set to 1920×1200 (16:10), while the image on the right is a screenshot of a Surface Pro set at a resolution of 1920×1080 (16:9). Notice that the Lync video window, at the smallest allowed size, is slightly larger on the Surface display due to the tablet interface in Windows. This is important to understand as the size of the window which displays the video will directly impact the resolution that the client will request the other party to send.
To measure the actual pixel resolution of these two video windows each full screenshot was cropped down to just the video and the image dimensions were captured as follows:
| Workstation | Desktop Display Dimensions | Video Window Dimensions |
| --- | --- | --- |
| T410 | 1920 x 1200 | 408 x 230 |
| Surface Pro | 1920 x 1080 | 620 x 350 |
- After 5 minutes or more simply end the call by hitting the hang-up button (or Ctrl+Enter) on either client.
Full Screen Video
- Start a second video call between the same two workstations and immediately change the video window to Full Screen View on both clients.
Make sure to use the Full Screen View button on the window rather than the Maximize button, which still leaves a window border in place and prevents the video from using 100% of the screen. Depending on the resolution of the monitor, this small difference could be enough to prevent the inbound video stream from moving up to the maximum possible resolution supported by the system.
- After 5 minutes or more simply end the call by hitting the hang-up button (or Ctrl+Enter) on either client without minimizing either video window. The call must end in full screen mode for the maximum resolution to be recorded.
Media Quality Reports
The call details will typically be available on the monitoring server within a minute or so after completion, so the first call should be ready to view at this point.
Review First Test Call
- Using a browser access the Lync Monitoring Server Reports Home Page and select User Activity Report under the Call Diagnostic Reports (per-user) section.
- Click the View Report button in the upper-right hand corner to search for database entries in the default time frame, which would be all activities and modalities in the past 24 hours.
- Look for the first test call listed under the Peer-to-Peer Sessions section by identifying the users, modality (e.g. Video) and time and click the Detail button.
If the first test call is not yet shown then refresh the page (e.g. F5) to reset the results and update the To: time, and then re-run the search.
Once the Peer-to-Peer Session Detail Report is loaded the page will be broken up into 4 main sections:
- Session Information – Basic information about the session like user addresses, client versions, session time, etc.
- Modalities – Will list which modalities were involved in this session (e.g. Audio, Video, Instant messaging).
- Media Quality Report – A detailed report of all media stream information which is the focus of this article.
- Diagnostic Reports – A list of diagnostic headers for specific SIP messages in the session.
By default the Media Quality Report section will be minimized and includes three major sections itself:
- Call Information – Some basic information about the caller and callee workstations and clients.
- Media Line (Main Audio) – Detailed statistics and information about the audio portion of the session.
- Media Line (Main Video) – Detailed statistics and information about the video portion of the session.
As explained earlier only the video portion of these records will be discussed, but before jumping into the video statistics it is important to review the call information first to understand the media direction and encoder or decoder capabilities.
- Expand the Media Quality Report section and review the Call Information section. The relevant items are listed below for the previous test call.
| Field | Caller | Callee |
| --- | --- | --- |
| P-Asserted Identity (PAI) | sip:email@example.com | sip:firstname.lastname@example.org |
| Uniform Resource Identifier (URI) | sip:email@example.com | sip:firstname.lastname@example.org |
| Endpoint | LENOVOT410 | SURFACEPRO |
| Operating System (OS) | Windows 6.2.9200 SP: 0.0 | Windows 6.2.9200 SP: 0.0 |
| CPU Cores | 2 | 2 |
| CPU Speed | 2.527 GHz | 1.696 GHz |

User Agent: UCCAPI/15.0.4481.1000
Call Start / Duration: 6/25/2013 10:55:16 AM / 00:09:36
CPU: Intel(R) Core(TM) i5 CPU M540 @ 2.53GHz
CPU Brand: Genuine Intel Family 0x6 Model 0x3a EM64T
When looking at the video stream details make sure to note which endpoint was defined as the Caller versus the Callee, as this is the only way to know which video stream is which. In this test call the user on the Lenovo laptop (email@example.com) placed the video call to the client on the Surface (firstname.lastname@example.org).
- Scroll down to the Media Line (Main Video) section and take note of the highlighted items.
The level of information in this section is sufficient to make a few assumptions about what the media path might have been. Firstly, the Caller Connectivity is reported as Direct for both endpoints, meaning that the Edge Server was not required to relay media for this call. As both subnets are the same, it is safe to assume that both endpoints were on the same network, which they were. Additionally, the Caller inside value is True for one workstation and False for the other, most likely indicating that the caller was connected to some type of third-party VPN which provided direct access to the Lync Front End pool.
Also worth noting is the Caller connection type which recorded that the caller workstation was on a wired network connection as indicated by the Ethernet string, while the other workstation is connected over Wi-Fi. This may be important when looking at bandwidth and packet loss as it is possible that the callee workstation did not have a stable connection at the time of the call.
The Transport setting is a good way to tell if the ideal option of using UDP for media was available. Often times firewalls are misconfigured and when media is established between Lync endpoints, especially when going through an Edge Server, UDP will not be available and the transport protocol will fall back to using TCP which is not preferred for real-time communications.
- Scroll further down to the Video Stream sections near the bottom of the page to find details on both of the video streams involved in the call.
| Video Stream | Caller > Callee | Callee > Caller |
| --- | --- | --- |
| Codec | H264 | H264 |
| Resolution | 424×240 | 424×240 |
| Inbound Frame Rate | 14.9416 | 14.9451 |
| Outbound Frame Rate | 14.9780 | 14.9448 |
| Frame Rate Loss | 0.01% | 0.00% |
| Average Allocated Bandwidth | 350 Kbps | 350 Kbps |
| Average Bit Rate | 166 Kbps | 159 Kbps |
| Maximum Bit Rate | 517 Kbps | 470 Kbps |
| CIF Quality Ratio | 100.00% | 100.00% |
| VGA Quality Ratio | 0.00% | 0.00% |
| HD Quality Ratio | 0.00% | 0.00% |
The details above show that both video streams utilized H.264 for video (as would be expected between Lync 2013 clients) and sent the same 424×240 resolution at 15 frames per second. The average and maximum bitrates were basically the same as well. Even though the video window on the Surface was measured at 620×350, that was not large enough to trigger a step-up to the next resolution available in the H.264 SVC codec in Lync 2013.
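As a small illustration of how these numbers relate to each other, the sketch below compares the average bit rate with the allocated bandwidth and the maximum bit rate with the average for each stream. The `VideoStream` class and its method names are hypothetical helpers written for this example and are not part of any Lync tooling; the values are typed in by hand from the report above.

```python
from dataclasses import dataclass


@dataclass
class VideoStream:
    """One column of a Video Stream table; all bit rates are in Kbps."""
    direction: str
    avg_allocated_kbps: float
    avg_bitrate_kbps: float
    max_bitrate_kbps: float

    def utilization(self) -> float:
        """Fraction of the allocated bandwidth that was used on average."""
        return self.avg_bitrate_kbps / self.avg_allocated_kbps

    def peak_to_average(self) -> float:
        """How far the maximum bit rate rose above the average bit rate."""
        return self.max_bitrate_kbps / self.avg_bitrate_kbps


# Values copied by hand from the first test call's report.
first_call = [
    VideoStream("Caller > Callee", 350, 166, 517),
    VideoStream("Callee > Caller", 350, 159, 470),
]

for stream in first_call:
    print(f"{stream.direction}: {stream.utilization():.0%} of allocation used, "
          f"peak is {stream.peak_to_average():.1f}x the average")
```

For this call both streams averaged a little under half of the 350 Kbps allocation, with peaks at roughly three times the average.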
Review Second Test Call
- Return to the User Activity Report then locate and open the record for the second test call. Expand the Media Quality Report section and then scroll down to the Video Stream sections.
| Video Stream | Caller > Callee | Callee > Caller |
| --- | --- | --- |
| Codec | H264 | H264 |
| Resolution | 1280×720 | 1280×720 |
| Inbound Frame Rate | 28.2137 | 19.4241 |
| Outbound Frame Rate | 28.3465 | 19.4218 |
| Frame Rate Loss | 0.01% | 0.00% |
| Average Allocated Bandwidth | 2324 Kbps | 2500 Kbps |
| Average Bit Rate | 1370 Kbps | 1016 Kbps |
| Maximum Bit Rate | 3221 Kbps | 2745 Kbps |
| CIF Quality Ratio | 0.00% | 0.00% |
| VGA Quality Ratio | 8.00% | 65.00% |
| HD Quality Ratio | 90.00% | 34.00% |
The details above show increases in resolution, frame rates, and bit rates across the board. What is interesting about this specific call is that although both clients sent video at 720p, they were not categorized with the same quality. The caller's outbound stream was rated as HD quality for almost the entire call, whereas the callee's outbound stream was largely classified as VGA quality. This phenomenon is explained in more detail in the next section, but the lower frame rate and lower bitrate on that stream indicate that a lower quality of video was transmitted even though the resolution was no different.
Resolution & Quality
As was alluded to earlier, reading these reports is not always as straightforward as it might seem. In the first test call the actual session and the recorded details are quite linear and there do not seem to be any surprises. But there are a couple of parameters which need to be understood when looking at more complicated call scenarios.
Firstly the resolution reported will only ever contain a single entry, so what happens if different resolutions were sent, as can happen when the video window size is changed during the call? The resolution is reported by the client at the termination of the call or when video was last stopped, so only the last resolution used will be recorded. This is important to understand as when trying to find out what resolution was actually sent during a full-screen video call the call must be ended while the video is still in full screen. If the window is decreased from full screen view and then the call is hung up then a lower resolution may appear in the records.
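The effect of recording only the last resolution can be sketched with a toy timeline. The call below is hypothetical (it is not taken from any report), and `reported_resolution` and `time_share` are illustrative names rather than anything the Monitoring Server exposes.

```python
def reported_resolution(segments):
    """Return the resolution that ends up in the report: the last one in use."""
    return segments[-1][1]


def time_share(segments):
    """Return the fraction of the call spent at each resolution."""
    total = sum(seconds for seconds, _ in segments)
    shares = {}
    for seconds, resolution in segments:
        shares[resolution] = shares.get(resolution, 0.0) + seconds / total
    return shares


# Hypothetical call: 9 minutes in full screen, then 1 minute in the
# minimum-size window before hanging up. Each entry is (seconds, resolution).
segments = [(540, "1280x720"), (60, "424x240")]

print(reported_resolution(segments))  # the single value the report would show
print(time_share(segments))           # what was actually sent over time
```

In this made-up case the report would list 424×240 even though 720p was in use for most of the call, which is why the test calls need to end without touching the window.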
Secondly, the three quality ratio fields (CIF, VGA, and HD) are not directly reflective of only the resolution. These fields are a measurement of overall video quality which takes into account resolution, frame rate, and any frame or packet loss. The resulting video quality is categorized and reported as one or more of the three. In the first example both video streams are defined as CIF quality for the entire duration of the call when encoded at 424×240 resolution at 15fps, even with 0% loss. Yet in the second example one 720p stream was labeled as HD quality for 90% of the call but the other 720p stream was primarily rated as VGA quality. The lower frame rate and bit rate on that stream point to the reasons for the decreased quality.
In most cases the following resolutions will be reported in the shown quality range, but there is currently a bug in the reporting which does not categorize 1080p video at all. This is not a comprehensive list of resolutions but just a sampling of the most commonly seen resolutions.
| Resolution | 1920×1080 | 1280×720 | 960×540 | 640×360 | 424×240 | 352×288 | 320×180 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Quality | – | HD | HD | VGA | CIF | CIF | CIF |
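One way to use the table above is as a lookup: compare the quality class a resolution would normally fall into with the class that actually dominated the call. The sketch below does this for the two streams of the second test call. The function names and the ranking of CIF below VGA below HD are assumptions made for illustration; the reports themselves perform no such check.

```python
# Nominal quality class per resolution, copied from the table above.
# None marks 1080p, which the reports do not categorize at all.
NOMINAL_QUALITY = {
    (1920, 1080): None,
    (1280, 720): "HD",
    (960, 540): "HD",
    (640, 360): "VGA",
    (424, 240): "CIF",
    (352, 288): "CIF",
    (320, 180): "CIF",
}

# Assumed ordering of the three classes, lowest to highest.
RANK = {"CIF": 0, "VGA": 1, "HD": 2}


def dominant_quality(ratios):
    """Return the class with the largest quality ratio."""
    return max(ratios, key=ratios.get)


def below_nominal(resolution, ratios):
    """True when the dominant class is lower than the resolution suggests."""
    nominal = NOMINAL_QUALITY.get(resolution)
    if nominal is None:
        return False  # unknown or uncategorized resolution: nothing to compare
    return RANK[dominant_quality(ratios)] < RANK[nominal]


# Quality ratios (percent) from the second test call; both streams were 1280x720.
caller_to_callee = {"CIF": 0.0, "VGA": 8.0, "HD": 90.0}
callee_to_caller = {"CIF": 0.0, "VGA": 65.0, "HD": 34.0}

print(below_nominal((1280, 720), caller_to_callee))
print(below_nominal((1280, 720), callee_to_caller))
```

Only the callee's outbound stream is flagged, matching the observation that it was mostly rated VGA despite the 720p resolution.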
In the event of reduced frame rate or during limited bandwidth scenarios the quality can be reported lower than the resolution would typically indicate. For example the following video stream details were captured from a test call in which the callee was on a Wi-Fi network with limited signal and bandwidth.
Notice that although the resolution was recorded as 1280×720, the quality was reported as VGA for 86% of the call and HD for 13%. Without knowing what the actual call experience was, these results could be read in one of two ways. First, it could be assumed that the video window was not in full screen, but was instead only increased to a size large enough to trigger a VGA resolution for the majority of the call (e.g. 640×360) and then was increased to full screen for the remaining 13% of the call duration, based on the last reported resolution of 720p. In most cases that would be a good assumption, but as this test call was run in full screen mode the entire time, it is false here.
Instead the explanation is that based on the limited available bandwidth the client was only able to receive a less-than HD quality stream even though the resolution was scaled up to 720p. The first clue is that the frame rate is a bit below 30, and the second is that the average bit rate is well below 1Mbps and the maximum barely topped 1Mbps. For a normal 720p video call using H.264 SVC in Lync 2013 the average bitrate should be more like 1Mbps with a maximum in the area of 1.5 to 2.5Mbps.
To further illustrate this point, look what happens when even less bandwidth is available, coupled with receiving video from a workstation encoding video at only 15fps.
The frame rate has dropped even further from 25fps to 15fps and the Average Available Bandwidth is reported as not even 300Kbps. Again, a resolution of 1280×720 was encoded, but the quality of the received video was noticeably poor compared to a normal 720p session. This is reflected by the low bit rates and CIF classification (82%) of quality for the majority of the call.
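The two clues used above, frame rate and average bit rate, can be combined into a rough check for a 720p stream. This is only a sketch: the 1 Mbps figure comes from the typical average quoted earlier, while the 26 fps cut-off is an assumption chosen here to mean "noticeably below 30". Neither is an official threshold.

```python
EXPECTED_720P_AVG_KBPS = 1000.0  # typical 720p average quoted in this article
MIN_720P_FPS = 26.0              # assumption: anything lower counts as reduced


def constraint_clues(inbound_fps, avg_bitrate_kbps):
    """Return the clues suggesting a 720p stream was bandwidth constrained."""
    clues = []
    if inbound_fps < MIN_720P_FPS:
        clues.append("low frame rate")
    if avg_bitrate_kbps < EXPECTED_720P_AVG_KBPS:
        clues.append("low average bit rate")
    return clues


# Streams from the second test call (inbound frame rate, average bit rate).
print(constraint_clues(28.2137, 1370))  # Caller > Callee
print(constraint_clues(19.4241, 1016))  # Callee > Caller

# Hypothetical heavily constrained stream, values invented for illustration.
print(constraint_clues(15.0, 250))
```

A stream that trips both clues while still reporting 1280×720 is a good candidate for the kind of low-quality 720p session described above.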
Multiparty Video Calls
The examples above capture basic peer-to-peer video calls, but what happens during Lync conference calls with many clients connected and multiple concurrent video streams being transmitted? A Conference Detail Report will be formatted roughly the same, except that multiple participants will be listed. Every participant that was in the meeting at any point in time will be listed with their own unique report, along with the modalities that client participated in during the call.
The Conference Modalities section will list a separate Media Quality report for each participant which captures the video history for each and every active stream in the conference. So the beginning of the report will look nearly identical except that the other endpoint is now a Conference URI, as all video streams are negotiated between the client and the Lync AVMCU. A single audio session is recorded as the Main Audio as the AVMCU still continues to mix all outbound participant audio streams into a single inbound stream for each client, just as in previous versions of OCS or Lync. But each Lync 2013 client is capable of receiving up to a maximum of 6 different concurrent video streams, which will be reported in individual video sections.
Listed below are the sections of a report from an example conference in which every possible capability was involved, including more than 6 active video participants and at least two Roundtable video devices.
| Header | Section | Description |
| --- | --- | --- |
| Call Information | | Caller and Callee identification and endpoint specifications |
| Media Line (Main Audio) | Audio Stream (Caller -> Callee) | Outbound audio stream |
| | Audio Stream (Callee -> Caller) | Inbound audio stream |
| Media Line (Main Video) | Video Stream (Caller -> Callee) | Outbound video stream |
| | Video Stream (Callee -> Caller) | First inbound video stream |
| Media Line (Panoramic Video) | Video Stream (Caller -> Callee) | Outbound panoramic stream |
| | Video Stream (Callee -> Caller) | Inbound panoramic stream |
| Media Line (Main Video 2) | Video Stream (Callee -> Caller) | Second inbound video gallery stream |
| Media Line (Main Video 3) | Video Stream (Callee -> Caller) | Third inbound video gallery stream |
| Media Line (Main Video 4) | Video Stream (Callee -> Caller) | Fourth inbound video gallery stream |
| Media Line (Main Video 5) | Video Stream (Callee -> Caller) | Fifth inbound video gallery stream |
| Media Line (Main Video 6) | Video Stream (Callee -> Caller) | Sixth inbound video gallery stream* |
The main difference between conference and peer reports is quite evident after advancing past the Main Video section, as 5 additional sections may be shown. Not all of them will necessarily include actual data, depending on how many participants were connected to the meeting and how many had video enabled. Also, the Panoramic Video section will only be included if at least one RoundTable/CX5000 device is present in the call, and this item can appear in either conference or peer call reports.
The Main Video 6 section is a unique case which is different from the other 5 standard video streams. As mentioned earlier the maximum number of concurrent inbound video streams supported by the Lync 2013 client is 6. Five for a fully populated gallery view plus one additional stream for the panoramic video, if applicable. So why is there a section for a seventh inbound video stream then? Apparently this stream is an ‘extra’ stream used to negotiate video for a new active speaker in the event that all the other streams are already in use, and when that new speaker’s video replaces a past speaker’s video tile in the gallery then the older stream (e.g. Main Video 2) is stopped. When yet another new speaker needs to appear on the gallery view an unused stream is still available, and so on. This approach allows a new stream to be quickly negotiated before breaking down a previous stream.
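The rotation described above can be modeled as a toy simulation: six Main Video slots, five of them in use, with the new stream started on the spare slot before the replaced one is stopped. This is a sketch of the apparent behavior only, not of how the client or the AVMCU is actually implemented, and the participant names A through G are placeholders.

```python
def switch_speaker(slots, history, new_speaker, replaced_speaker):
    """Start new_speaker on the spare slot, then stop replaced_speaker's stream."""
    spare = next(name for name, who in slots.items() if who is None)
    old = next(name for name, who in slots.items() if who == replaced_speaker)
    slots[spare] = new_speaker          # the new stream is negotiated first...
    history[spare].append(new_speaker)
    slots[old] = None                   # ...then the replaced stream is stopped
    return spare, old


# Six inbound Main Video sections; five carry gallery video, one is spare.
names = ["Main Video"] + [f"Main Video {n}" for n in range(2, 7)]
slots = dict(zip(names, ["A", "B", "C", "D", "E", None]))
history = {name: ([who] if who else []) for name, who in slots.items()}

first = switch_speaker(slots, history, "F", "B")
second = switch_speaker(slots, history, "G", "A")

print(first)    # slot used for F, slot freed by B
print(second)   # slot used for G, slot freed by A
print(history)  # every participant each slot carried during the call
```

After only two speaker changes the Main Video 2 slot has already carried two different participants, so the statistics recorded for a slot cannot be tied to any one sender.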
What this all means though is that the data from a given stream is just a total of all bandwidth used by that ‘slot’ for the duration of the call, so it’s impossible to know which participant was shown in which of the slots, and for how long. And as different participants may be sending different resolutions, frame rates, and qualities over even different codecs (H.264 SVC or RTV) then it can be very complicated to attempt to use this data to either calculate total bandwidth used or to plan for how much is needed. The realistic approach is to attempt to average out how much bandwidth might be used per video tile during 2, 3, 4, 5 and 6 person video calls across different screen sizes. Either way as more participants are added to the gallery each video tile will be reduced in size to make room for the new tile, which lowers the overall resolution of each individual stream in step. This approach helps to keep bandwidth utilization under control. These concepts will be covered in more detail in a future article, along with example bandwidth numbers for various video conferencing scenarios.