Zoom Call Quality Issue - Thresholds for Monitoring


#1

We are trying to set up alerting for bad Zoom calls.

What are the recommended threshold values for good call quality for the following metrics?

  1. Jitter
  2. Avg Packet Loss
  3. Latency
  4. Bitrate

Jitter - ~125-200ms
Latency - ~300-400ms
Packet Loss - ~20%
CPU Usage - ~90%

With the above values configured, we get an email every minute, because the meetings take place over the Internet, home Internet, and 3G/4G connections.

When jitter and latency were both set to 1000ms, the email alerts arrived every 5 minutes on average.

Given our global user base (898 active users), there will always be some calls somewhere in the world that are affected, so we will always get an alert.

Are there specific values we can set in order to capture only genuinely bad-quality calls?
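One way to picture the alert-noise problem described above: instead of alerting on any single bad sample, alert only when the same user stays above the limit for several samples in a row. The sketch below is illustrative only. The function name, the sample layout, and the choice of 3 consecutive samples are assumptions, and the 300ms limit is simply the lower latency figure quoted in this post.

```python
from collections import defaultdict


def sustained_offenders(samples, limit_ms=300.0, min_consecutive=3):
    """Return users whose latency exceeded `limit_ms` in
    `min_consecutive` samples in a row.

    `samples` is an iterable of (user, latency_ms) pairs in time order.
    Both the layout and the defaults are assumptions for illustration.
    """
    streak = defaultdict(int)
    offenders = set()
    for user, latency_ms in samples:
        # Extend the streak on a breach, reset it on a good sample.
        streak[user] = streak[user] + 1 if latency_ms > limit_ms else 0
        if streak[user] >= min_consecutive:
            offenders.add(user)
    return sorted(offenders)


samples = [
    ("ann", 450), ("bob", 500),
    ("ann", 120), ("bob", 480),
    ("ann", 410), ("bob", 520),
]
print(sustained_offenders(samples))  # ['bob']
```

Here "ann" has two bad samples, but never three in a row, so only "bob" would be reported.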


#2

Howdy @suhailpuri

I was able to find this Zoom Help Center Article, which states:

You can [go] to http://speedtest.net to check your network bandwidth. Generally, we recommend 1.2 to 1.5Mbps upload/download for an optimal experience on desktop or room systems.

However, you mentioned a variety of connection types (Internet, home Internet, and 3G/4G), and I wasn’t able to locate specific monitoring thresholds for each of them.

I asked some of our Customer Success Managers if we had “Optimal Operating Target” values for these metrics, and they provided me with the following information:

| Metrics     | Ideal Threshold | Notes                                                |
|-------------|-----------------|------------------------------------------------------|
| Jitter      | < 150ms         | Variation in time between packets arriving           |
| Latency     | < 300ms         | Delay between packets being sent/received            |
| Packet Loss | < 20%           | Number of packets failing to reach final destination |
| CPU Usage   | < 90%           | Send and Receive rate you experience during the call |
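As a rough illustration of how the table could be applied in a monitoring script, the sketch below checks one sample against those four limits. The dictionary keys and the function name are made up for this example, and the sample is assumed to hold plain numbers (milliseconds or percent) with the units already removed.

```python
# Limits copied from the table above: ms for jitter/latency, percent otherwise.
# The key names are hypothetical, not Zoom API field names.
THRESHOLDS = {
    "jitter_ms": 150.0,
    "latency_ms": 300.0,
    "packet_loss_pct": 20.0,
    "cpu_usage_pct": 90.0,
}


def breached_metrics(sample: dict) -> list:
    """Return the names of metrics in `sample` that are not below their limit."""
    return [
        name
        for name, limit in THRESHOLDS.items()
        if name in sample and sample[name] >= limit
    ]


sample = {"jitter_ms": 180.0, "latency_ms": 120.0, "packet_loss_pct": 2.5}
print(breached_metrics(sample))  # ['jitter_ms']
```

Metrics missing from a sample are skipped rather than treated as breaches, which seems the safer default when a participant has no video stream, for instance.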

Are you developing an app and using the List Meeting Participant’s QoS API to obtain these numbers for your own monitoring?

I scratched the itch with a couple other questions and wanted to share what I learned, in case it helps…

What happens when one or more attendees are on poor-quality connections? Will they trigger a threshold monitoring alert?

Answer: Users receive a monitor alert, and they are encouraged to disable their camera.

Does this help get you moving in the right direction?


#3

Hello @bdeanindy,

Yes, we are polling QoS API stats via Splunk for alerting, so that we know which user is impacted and what the possible reason for his or her meeting going wrong could be.

Below is the query we use in Splunk:

earliest=-5m@m index=connect2018 source="/opt/splunk/bin/scripts/ZoomLiveMeetingQuality.py"
| rename participants{}.user_id AS userID, participants{}.user_name AS userName, participants{}.user_qos{}.date_time AS DateTime
| stats
max(participants{}.user_qos{}.cpu_usage.zoom_avg_cpu_usage) AS AvgCPU,
max(participants{}.user_qos{}.audio_input.latency) AS AudioInLatency,
max(participants{}.user_qos{}.audio_output.latency) AS AudioOutLatency,
max(participants{}.user_qos{}.video_input.latency) AS VideoInLatency,
max(participants{}.user_qos{}.video_output.latency) AS VideoOutLatency,
max(participants{}.user_qos{}.audio_input.jitter) AS AudioInJitter,
max(participants{}.user_qos{}.audio_output.jitter) AS AudioOutJitter,
max(participants{}.user_qos{}.video_input.jitter) AS VideoInJitter,
max(participants{}.user_qos{}.video_output.jitter) AS VideoOutJitter,
max(participants{}.user_qos{}.video_input.max_loss) AS VideoInMaxLoss,
max(participants{}.user_qos{}.video_output.max_loss) AS VideoOutMaxLoss,
max(participants{}.user_qos{}.audio_input.max_loss) AS AudioInMaxLoss,
max(participants{}.user_qos{}.audio_output.max_loss) AS AudioOutMaxLoss,
max(participants{}.user_qos{}.audio_input.avg_loss) AS AudioInAvgLoss,
max(participants{}.user_qos{}.audio_output.avg_loss) AS AudioOutAvgLoss,
max(participants{}.user_qos{}.video_input.avg_loss) AS VideoInAvgLoss,
max(participants{}.user_qos{}.video_output.avg_loss) AS VideoOutAvgLoss
by meetingId, userName
| table meetingId, userName, AvgCPU, Audio*,Video*
| convert rmunit(Audio*)
| convert rmunit(Video*)
| convert rmunit(AvgCPU)
| where (VideoInLatency>1500 OR VideoOutLatency>1500) AND (AudioInLatency>1500 OR AudioOutLatency>1500) AND (AudioOutJitter>1500 OR AudioInJitter>1500) AND (VideoOutJitter>1500 OR VideoInJitter>1500) OR (AudioOutAvgLoss>20 OR AudioInAvgLoss>20) AND (VideoOutAvgLoss>20 OR VideoInAvgLoss>20)
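To make the final `where` condition easier to reason about, here is the same logic written out in Python with explicit grouping. This is only a sketch: it assumes AND binds more tightly than OR in Splunk's `where` (worth confirming in the Splunk documentation), and the function name and parameters are invented for the example. The field names are the aliases from the query.

```python
def should_alert(m: dict, delay_ms: float = 1500, loss_pct: float = 20) -> bool:
    """Mirror the `where` clause, assuming AND is evaluated before OR."""
    high_delay = (
        (m["VideoInLatency"] > delay_ms or m["VideoOutLatency"] > delay_ms)
        and (m["AudioInLatency"] > delay_ms or m["AudioOutLatency"] > delay_ms)
        and (m["AudioInJitter"] > delay_ms or m["AudioOutJitter"] > delay_ms)
        and (m["VideoInJitter"] > delay_ms or m["VideoOutJitter"] > delay_ms)
    )
    high_loss = (
        (m["AudioInAvgLoss"] > loss_pct or m["AudioOutAvgLoss"] > loss_pct)
        and (m["VideoInAvgLoss"] > loss_pct or m["VideoOutAvgLoss"] > loss_pct)
    )
    return high_delay or high_loss


# Build a row with every metric at zero, then vary a few fields.
FIELDS = [
    f"{media}{direction}{metric}"
    for media in ("Audio", "Video")
    for direction in ("In", "Out")
    for metric in ("Latency", "Jitter", "AvgLoss")
]
quiet = dict.fromkeys(FIELDS, 0.0)
video_only = {**quiet, "VideoInLatency": 2000.0, "VideoInJitter": 2000.0}
lossy = {**quiet, "AudioInAvgLoss": 25.0, "VideoOutAvgLoss": 30.0}

print(should_alert(quiet))       # False
print(should_alert(video_only))  # False: audio latency and jitter are still low
print(should_alert(lossy))       # True
```

Read this way, the delay branch fires only when audio and video both show high latency and high jitter at the same time, so a call with bad video alone would not alert.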


#4

Do you need any additional information/help on this?


#5

Yes please. We opened a case with Zoom, and Zoom Support later opened a case with the engineering team, who asked me to post a question in the developer forum.

Our issue has still not been addressed and your input would be valuable.

Thanks in advance,

Suhail


#6

Hey @suhailpuri

You asked…

I provided the table above which contains what Zoom considers the “Optimal Thresholds”.

You also stated this about your use of Splunk…

Yes, we are polling QoS API stats via Splunk for alerting, so that we know which user is impacted and what the possible reason for his or her meeting going wrong could be.

I’m not a Splunk expert, and I’m unfamiliar with its DSL, but the where clause at the bottom of your filter appears to be calculating the VideoOutAvgLoss as a value of 20 instead of 20%. Are you certain the values you’re testing against coincide with the threshold values/units I provided in the optimization table earlier?
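On the units question, a small sketch may help. It assumes the QoS values arrive as strings with a unit attached (for example "20%" or "150 ms") and that the `convert rmunit(...)` steps in the query leave only the number. If so, comparing against a bare 20 is a comparison in percent. The helper below is a hypothetical stand-in for that step, not Splunk's actual implementation.

```python
import re

# Leading integer or decimal number, with an optional sign.
_NUMBER = re.compile(r"[-+]?\d+(?:\.\d+)?")


def strip_unit(value: str) -> float:
    """Return the leading number in strings such as '20%' or '150 ms'."""
    match = _NUMBER.match(value.strip())
    if match is None:
        raise ValueError(f"no leading number in {value!r}")
    return float(match.group())


print(strip_unit("150 ms"))      # 150.0
print(strip_unit("25.5%") > 20)  # True
print(strip_unit("20%") > 20)    # False: exactly 20 does not exceed 20
```

If the raw values turn out to be fractions (0.2 rather than 20%), the threshold would need to change accordingly, so it is worth checking a raw event before trusting the comparison.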

What other specific issues/questions do you need us to address for you please?