Description
We’ve been observing an uptick in delays for receiving recording.started webhooks. This unfortunately causes a large degradation in our app’s UX, and our users have started to complain.
Here is the data I have around delays since the beginning of June:
Updated with data from over the weekend. Saturdays and Sundays show no delays, so it appears to be a load issue. Is expanding capacity to process these queues on the roadmap?
Just to substantiate this a little bit: this is also something we’ve observed with other events (meeting started & ended, participants joined & left). It’s also affecting us, as we’re counting on the webhook to take prompt action on an integration. Most of the time it works close to real time, but there are periods where events take several minutes to find their way to us.
I’m attaching graphs for the past 3 days of the number of seconds that events take to find their way to us. I’m happy to try to gather any other data that could help.
-> it looks like the forum only lets me attach one image so I’m only attaching 2020-08-05.
Sorry for my late reply, I need to tweak my notifications for this forum.
X is the time of day, Y is the delay in seconds between the time the event actually occurred and the time it showed up on the webhook (yes, it can take several minutes).
The events in question are: meeting.participant_joined, meeting.participant_left, meeting.ended, meeting.started, meeting.participant_jbh_joined, and meeting.participant_jbh_waiting.
I did not observe any difference in delay between these events. As far as I can tell, when a delay exists, it exists for all of them indiscriminately.
I can provide raw data if this would be helpful. I just thought I’d add my voice, substantiated by visual proof.
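For anyone who wants to build the same kind of graph from their own logs, here is a minimal sketch of how the points can be derived. It assumes you record, for each webhook, the time the event occurred and the time your server received it, both as timezone-aware datetimes. The function names and the sample values are hypothetical, and using the occurrence time (rather than the reception time) for the X axis is my assumption.

```python
from collections import defaultdict
from datetime import datetime, timezone


def delay_points(events):
    """Turn (occurred_at, received_at) pairs into (occurred_at, delay_seconds) points."""
    return [
        (occurred, (received - occurred).total_seconds())
        for occurred, received in events
    ]


def worst_delay_by_hour(points):
    """Largest delay in seconds seen in each UTC hour of the day."""
    worst = defaultdict(float)
    for occurred, delay in points:
        hour = occurred.astimezone(timezone.utc).hour
        worst[hour] = max(worst[hour], delay)
    return dict(worst)


# Made-up sample values for illustration only, not measurements from this thread.
sample_events = [
    (datetime(2020, 8, 5, 14, 0, 0, tzinfo=timezone.utc),
     datetime(2020, 8, 5, 14, 0, 2, tzinfo=timezone.utc)),
    (datetime(2020, 8, 5, 15, 10, 0, tzinfo=timezone.utc),
     datetime(2020, 8, 5, 15, 14, 30, tzinfo=timezone.utc)),
]

points = delay_points(sample_events)
print(worst_delay_by_hour(points))
```

Plotting `points` directly gives the per-event scatter; the hourly summary is just a quick way to spot the periods where the backlog builds up.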
It does appear the backup has now cleared. We started noticing the delays around 14:50 UTC. Started clearing up around 17:50 UTC. We’d really appreciate any info we can pass on to our rightfully unhappy users.
I’ve got meeting UUID 4bur3VZ/QRC7rq09vMBeAQ== with a participant.left event which occurred at 2020-08-13T13:29:39 but showed up at: 2020-08-13 13:43:26.
Here are 2 more UUIDs which recently were affected around that same time, I can also point to the exact event and timing of reception if that’s helpful.
1jv+bj9oQjq8eRURyGhd5Q==
tclHkENKQTS6mujFQQulxA==
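To put a number on the participant.left event for meeting 4bur3VZ/QRC7rq09vMBeAQ== above, here is the arithmetic as a short sketch. It assumes both timestamps are in UTC, which the post does not state explicitly.

```python
from datetime import datetime, timezone

# Timestamps copied from the post; treating both as UTC is an assumption.
occurred = datetime.fromisoformat("2020-08-13T13:29:39").replace(tzinfo=timezone.utc)
received = datetime.fromisoformat("2020-08-13 13:43:26").replace(tzinfo=timezone.utc)

delay_seconds = int((received - occurred).total_seconds())
minutes, seconds = divmod(delay_seconds, 60)

print(f"{delay_seconds} s ({minutes} min {seconds} s)")  # 827 s (13 min 47 s)
```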
@tommy thanks, I’ll investigate why that may be on my end. It’s definitely pretty curious: the processing I do is super lightweight, and the server handles much more than the webhook call without such response times. I’ll start keeping track of the time between the request’s arrival and the moment it finishes processing. Thank you for pointing that out.
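Roughly what I have in mind for that tracking is sketched below. It is framework-agnostic and uses only the standard library; `handle_webhook`, `timed`, and `timings_ms` are hypothetical names, and the real handler would be whatever function my web framework calls for the webhook route. Note that the clock starts when the handler is invoked, so time spent queued in the web server before that point is not captured.

```python
import logging
import time
from functools import wraps

logger = logging.getLogger("webhook_timing")

# In-memory record of handler durations in milliseconds (hypothetical; a real
# setup would more likely ship these to a metrics system).
timings_ms = []


def timed(handler):
    """Record how long the wrapped handler takes, even if it raises."""
    @wraps(handler)
    def wrapper(*args, **kwargs):
        start = time.monotonic()
        try:
            return handler(*args, **kwargs)
        finally:
            elapsed_ms = (time.monotonic() - start) * 1000
            timings_ms.append(elapsed_ms)
            logger.info("%s took %.1f ms", handler.__name__, elapsed_ms)
    return wrapper


@timed
def handle_webhook(payload):
    """Placeholder for the real processing; returns the event name it saw."""
    return payload.get("event")


handle_webhook({"event": "meeting.started"})
```

Using `time.monotonic()` rather than wall-clock time keeps the measurement immune to system clock adjustments.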
And thank you for getting to the bottom of it on your end :). Are you saying that the server capacity has been increased already? If not, do you have an idea of what the timing of this will be? I’m curious to see if I can confirm the effects in the graphs.
Thanks! I love your position and how proactive you are on the forum threads. It’s refreshing to deal with knowledgeable people who can actually enact change. Hats off to Zoom & you for this.