Though the stated resolution in that other thread is:
This is resolved for us now. The issue was in fact that we weren’t including the intermediate certificate chain. My only guess is that Zoom started checking for this last month, as it was previously communicating using the cert. This site was helpful in testing our end of the connection: https://whatsmychaincert.com/
However, our cert chain is correct so I don’t think this is the cause for our current issue.
We also checked and our certificate chain is fine.
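For anyone who wants to repeat the chain check from a script rather than a website, here is a minimal Python sketch. The function name `check_chain` and the shape of the result dict are my own choices, not anything Zoom provides. It attempts a fully verified handshake against the system trust store. As far as I know, Python's default context does not download missing intermediates, so a server that sends only its leaf certificate should fail here even if browsers accept it.

```python
import socket
import ssl


def check_chain(host: str, port: int = 443, timeout: float = 5.0) -> dict:
    """Attempt a verified TLS handshake and report whether it succeeded."""
    # The default context verifies the served chain and the hostname
    # against the system trust store.
    context = ssl.create_default_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host) as tls:
                cert = tls.getpeercert()
                return {"ok": True, "detail": f"verified, expires {cert.get('notAfter')}"}
    except ssl.SSLCertVerificationError as exc:
        # A missing intermediate usually surfaces here as an
        # "unable to get local issuer certificate" style message.
        return {"ok": False, "detail": f"verification failed: {exc.verify_message}"}
    except OSError as exc:
        # Covers refused connections, timeouts, and other TLS errors.
        return {"ok": False, "detail": f"connection problem: {exc}"}
```

Calling `check_chain("your-webhook-host.example")` (a placeholder hostname) and printing the result should show either a successful verification or the reason the handshake was rejected.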
This is hardly a certificate issue on our side.
If that were the case, all webhook events would be missing.
Also, it wouldn’t explain why it starts working again after an OAuth reconnect.
Could some of your webhook-sending worker nodes be having an issue with a certificate?
This could potentially explain the inconsistent behavior (maybe reconnecting via OAuth assigns another webhook node with a working certificate, although it would be strange for it to stop working again after a day or a week).
Update: we heard from someone in Zoom engineering that this issue isn’t happening to other customers (at least not broadly), so it is likely on our end.
I don’t know exactly how it is happening on our end, but I decided to try putting Cloudflare in front of our site so that Zoom webhook requests would be dealing with SSL certs on their edge nodes instead of certs in our system.
So far this seems to be working. It’s too early to tell for sure that the issue is fixed, but we haven’t received any errors since switching.
Seems to still be working. We don’t see any new errors in the /webhook_logs endpoint since putting Cloudflare in place.
I still don’t understand what was up with our certs, though. They were Let’s Encrypt certs, so they should have been accepted with no problem. And I have no clue why some requests would get a cert error and some wouldn’t, since every request should have been served the same cert. And I don’t think this could have been a problem with specific servers within our VPC since there was no indication in our logs that the Zoom webhook requests were getting past the SSL handshake, which should have a single point of entry, so it’s not like some subset of servers would be serving an invalid cert. And we weren’t getting these errors with end user requests or any webhook requests from multiple other third parties.
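If someone does want to keep digging into whether every request really gets the same cert, one rough approach is to connect repeatedly to each IP address the hostname resolves to and compare the leaf certificate fingerprints. This is only a sketch: `fingerprint` and `served_fingerprints` are names I made up, verification is deliberately switched off because the goal is to see what is served rather than to validate it, and it says nothing about what Zoom's own senders see.

```python
import hashlib
import socket
import ssl


def fingerprint(der_cert: bytes) -> str:
    """SHA-256 fingerprint of a DER-encoded certificate, as hex."""
    return hashlib.sha256(der_cert).hexdigest()


def served_fingerprints(host: str, port: int = 443,
                        attempts: int = 5, timeout: float = 5.0) -> set:
    """Collect leaf-certificate fingerprints from every resolved address."""
    context = ssl.create_default_context()
    # Verification is off on purpose: we only inspect what gets served.
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    seen = set()
    addresses = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    for _family, _type, _proto, _name, sockaddr in addresses:
        for _ in range(attempts):
            with socket.create_connection(sockaddr[:2], timeout=timeout) as sock:
                # server_hostname sets SNI so the right virtual host answers.
                with context.wrap_socket(sock, server_hostname=host) as tls:
                    seen.add(fingerprint(tls.getpeercert(binary_form=True)))
    return seen
```

More than one fingerprint in the returned set would suggest that different entry points are serving different certificates, for example during a renewal. A single fingerprint would be consistent with the single point of entry described above.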
But since it’s fixed now we’re going to stop digging, so hopefully the solution of putting some proxy in front of the site can work for anyone else encountering this problem, if it’s an option.
Thank you @cedric-swivvel
We did prepare everything, but we need to change the URL of the Webhook, since we can’t transfer our root domain to Cloudflare. This requires the app to be submitted again, and last time this happened, all our customers’ webhooks got invalidated, so we’re a bit scared to go through that again.
We’re also using Let’s Encrypt certificates on Heroku.
There were reports with their certificates before as well:
@elisa.zoom can you please confirm that updating the Webhook URL (to the one on Cloudflare) and submitting the app will not disconnect all connected accounts’ webhooks?
This is not what I’m asking. There was a serious bug earlier this year on Zoom’s side when we last had to submit our app for your TDD review, and it broke all our customers’ webhooks, so they all had to reconnect.
I’m asking if this was fixed so that we can try resubmitting our app (without upsetting all our users) with a new URL that is proxied through Cloudflare.
Hello, we’re also experiencing a similar issue. We don’t receive any events after first connecting with OAuth, but for some users it works fine after disconnecting and then reconnecting. When testing with newly created Zoom accounts, we do not receive events even after reconnecting.