Obtaining Zoom video in real time for object detection

austinmw · August 23, 2022, 3:29am

Hi, I’d like to build a patient monitoring interface on top of Zoom by performing object detection of people in real time. To do this I need access to the raw real-time video data, and I found this discouraging post from 2020: “Access to streaming data for object detection”

Is it still true that this is unavailable or is there a way to do this nowadays?

Thanks

elisa.zoom · August 29, 2022, 7:17pm

Hi @austinmw
Thanks for reaching out to the Zoom Developer Forum, I am happy to help here!
Unfortunately, this is still not available.
Cheers,
Elisa

amanda-recallai · September 17, 2022, 10:30pm

@austinmw, this is definitely possible. There are 3 common ways you could access the real-time raw video data from Zoom.

1. Use the Zoom live-streaming API

Pros:

Doesn’t require any 3rd party services
Lighter weight than building and running a Zoom bot

Cons:

Needs to be initiated on a per-meeting basis
You need to set up an RTMP server to receive the data, which requires engineering effort to deploy, scale, and monitor
Participants can get spooked by the “live” badge that appears in the meeting, depending on the use case
Can’t get a separate video stream per participant
No speaker diarization
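
To make the receiving side of option 1 more concrete, here is a minimal sketch (not a definitive implementation) of pulling frames off the RTMP stream for a detector. It assumes `ffmpeg` is installed, that your RTMP server re-serves the Zoom stream at a URL you control (the URL in the usage note is a placeholder), and that `detect` is a hypothetical callback wrapping whatever object detection model you use. The frame size and frame rate are arbitrary choices.

```python
import subprocess
from typing import BinaryIO, Callable, Iterator, List


def ffmpeg_decode_command(rtmp_url: str, width: int, height: int, fps: int) -> List[str]:
    """Build an ffmpeg command that decodes an RTMP stream to raw BGR frames on stdout."""
    return [
        "ffmpeg", "-loglevel", "error",
        "-i", rtmp_url,                              # assumed: URL served by your RTMP server
        "-an",                                       # drop audio, only video is needed here
        "-vf", f"fps={fps},scale={width}:{height}",  # resample and resize for the detector
        "-f", "rawvideo", "-pix_fmt", "bgr24",       # 3 bytes per pixel, no container
        "-",                                         # write to stdout
    ]


def read_frames(stream: BinaryIO, frame_size: int) -> Iterator[bytes]:
    """Yield complete frames of frame_size bytes; a trailing partial frame is dropped."""
    while True:
        buf = bytearray()
        while len(buf) < frame_size:
            chunk = stream.read(frame_size - len(buf))
            if not chunk:  # stream ended
                return
            buf.extend(chunk)
        yield bytes(buf)


def run_detection(
    rtmp_url: str,
    detect: Callable[[bytes, int, int], None],  # hypothetical detector callback
    width: int = 640,
    height: int = 360,
    fps: int = 5,
) -> None:
    """Decode the stream with ffmpeg and hand each raw frame to the detector."""
    proc = subprocess.Popen(
        ffmpeg_decode_command(rtmp_url, width, height, fps),
        stdout=subprocess.PIPE,
    )
    try:
        for frame in read_frames(proc.stdout, width * height * 3):
            detect(frame, width, height)
    finally:
        proc.terminate()
        proc.wait()
```

You would call something like `run_detection("rtmp://localhost/live/zoom", my_detector)` once the live stream has been started for the meeting. Decoding at a low frame rate keeps the detector from falling behind; since this is a single mixed stream, the detector sees whatever layout Zoom sends, not one frame per participant.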

2. Build a Zoom bot

Pros:

Can get the separate audio streams per participant for perfect diarization / speaker labels
Can get separate video streams per participant
Doesn’t spook participants

Cons:

It is extremely heavy-weight as you would need to spin up multiple servers to run the Zoom client for the bot
Running the infrastructure for a Zoom bot costs more than live streaming
You need to encode the raw video and audio yourself
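
As a rough illustration of the "encode it yourself" point, the sketch below builds an `ffmpeg` command that encodes raw frames written to its stdin into H.264. It assumes the bot hands you planar YUV 4:2:0 (I420) frames of a known, fixed size; that pixel format, the helper names, and the output path are assumptions for illustration, so check what your bot actually receives.

```python
from typing import List


def i420_frame_size(width: int, height: int) -> int:
    """Bytes in one I420 frame: a full-size Y plane plus two quarter-size chroma planes."""
    chroma = ((width + 1) // 2) * ((height + 1) // 2)
    return width * height + 2 * chroma


def ffmpeg_encode_command(width: int, height: int, fps: int, output_path: str) -> List[str]:
    """Build an ffmpeg command that reads raw I420 frames from stdin and writes H.264."""
    return [
        "ffmpeg", "-loglevel", "error",
        "-f", "rawvideo",
        "-pixel_format", "yuv420p",           # assumed input format (I420)
        "-video_size", f"{width}x{height}",
        "-framerate", str(fps),
        "-i", "-",                            # read frames from stdin
        "-c:v", "libx264",
        output_path,                          # hypothetical output file
    ]
```

You would start this with `subprocess.Popen(cmd, stdin=subprocess.PIPE)` and write each frame's bytes to `proc.stdin`. Every write should be exactly `i420_frame_size(width, height)` bytes, otherwise the encoder's frame boundaries drift.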

3. Use Recall.ai

It’s a unified API that lets you send meeting bots to video conferencing platforms to capture the audio and video in real-time.

Pros:

Handles spinning up the servers and providing the real-time raw video/audio, so all you interact with is a simple API
Can get separate video streams per participant
Can get the separate audio streams per participant for perfect diarization / speaker labels
Works regardless of meeting platform

Cons:

It’s another 3rd party service in your stack
