Hi,
I am thinking about making a Zoom robot that can process raw video and audio data and then output something on the screen in real time. I was looking at the Video SDK and the Meeting SDK, but I am still unsure which one is recommended for my requirements. My goal is not to create a new desktop application but to embed an extension/robot in existing Zoom meetings. Also, which platforms support acquiring the raw data?
Thanks!
@huanyefantasy2000, you would need to use the Meeting SDK to do this.
At a high level, you would need to:
- Spin up a server. We recommend AWS, GCP, or DigitalOcean.
- Use the Linux Zoom Meeting SDK to launch an instance of the Zoom client.
- Once you have the Zoom SDK launched, use the Raw Data functionality to extract the video and audio streams.
- This will return the video in I420 raw frames and audio in PCM 16LE raw format, so you’ll need to encode the audio and video yourself afterwards.
- Once you have one instance of this working, you’ll need to scale this across several servers if you want to run multiple bots simultaneously, which is required to have bots for multiple meetings.
Here is a resource on this: Meeting Bots Accessing Media Streams
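To make the raw formats mentioned above more concrete, here is a minimal Python sketch of what handling them could look like once the bytes are in hand. It does not call the Zoom SDK at all: `i420_frame_size`, `split_i420`, and `pcm16le_to_wav` are hypothetical helper names, the frames are assumed to be tightly packed (no row padding), and the sample rate and channel count are assumptions that must be replaced with whatever the SDK actually reports.

```python
import io
import wave


def i420_frame_size(width: int, height: int) -> int:
    """Bytes in one I420 frame: a full-resolution Y plane plus
    U and V planes subsampled by 2 in each direction."""
    chroma_w, chroma_h = (width + 1) // 2, (height + 1) // 2
    return width * height + 2 * chroma_w * chroma_h


def split_i420(frame: bytes, width: int, height: int):
    """Split a tightly packed I420 buffer into (Y, U, V) planes.

    Assumption: no stride/padding between rows. If the SDK reports
    a stride, the slicing below has to account for it.
    """
    if len(frame) != i420_frame_size(width, height):
        raise ValueError("buffer length does not match an I420 frame of this size")
    y_size = width * height
    c_size = ((width + 1) // 2) * ((height + 1) // 2)
    y = frame[:y_size]
    u = frame[y_size:y_size + c_size]
    v = frame[y_size + c_size:]
    return y, u, v


def pcm16le_to_wav(pcm: bytes, sample_rate: int, channels: int = 1) -> bytes:
    """Wrap raw 16-bit little-endian PCM samples in a WAV container.

    This only adds a header so standard tools can read the audio;
    it does not compress anything.
    """
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    return buf.getvalue()
```

For example, a 640x360 I420 frame occupies 640 * 360 * 3 / 2 = 345,600 bytes, so uncompressed video adds up quickly. That is why the frames normally get passed to a real encoder (or straight into your analysis code) rather than stored as-is.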
Another alternative is to use the Recall.ai API for your meeting bots instead. They power meeting bots for 400+ companies. It’s basically an API for meeting bots.
Let me know if you have any questions!
Thanks for your reply. I have the following questions:
- I’ve observed that you support various platforms. Does that mean I can add such robots to them without building separate software for each one?
- Is there a demo of what Recall.ai can do if I want the real-time video and audio streams to be processed by our cutting-edge AI analysis, with the results then shown in the meeting (in a panel or on screen)?
- I’ve noticed that the robot joins the meeting as an invited guest. Is that true? Will such robots raise privacy concerns among the attendees, since they appear as one more participant?
- Can I use Python for development? Also, according to your API documentation, the websocket only provides 2 fps video, which is not sufficient for my application. Can you explain that limit?
- What is the price for using your API? How can I deploy the bots while still in development mode?
Thanks!
@huanyefantasy2000, I got your email and answered there, but I’ll also answer some of your questions here so the community has visibility.
To answer your questions:
- Yes, you will be able to have the robots join Zoom, Google Meet, Microsoft Teams, and other video conferencing platforms with one API integration. Here is our full list of supported platforms: Meeting Platforms
- Here are the docs on:
  a. Getting real-time video: Receive Real-Time Video: Websockets
  b. Getting real-time audio: Receive Real-Time Audio
  c. Streaming media back into the meeting if you want the media to be displayed on the bot: Stream Audio/Video from Webpage to a Meeting
- The robots can join as guests or as logged-in users; you can configure that in the API. There should be no privacy concerns, and you can also have the robots announce a message in the meeting chat if you’d like.
- Yes!
- We have a pay-as-you-go plan. You can sign up here! Get Started
Let me know if this answers your questions.