This is for developer-specific feature requests. For other requests please contact our customer support team.
Is your feature request related to a problem? Please describe.
I want a feature that lets me transcribe Zoom call audio in real time so I can use it in my backend, with the user's consent.
Describe the solution you’d like
If I can get an API for, or access to, the audio of a Zoom call, I will be able to handle the rest.
@muskan, there are a few ways you can create a real-time transcription from Zoom. Here are the three most common:
1. Use the Zoom live-streaming API
- Doesn’t require any third-party services
- Lighter weight than building and running a Zoom bot
- Needs to be initiated by the end user in every meeting
- You need to set up an RTMP server to receive the data, which requires engineering effort to deploy, scale, and monitor
- Participants can get spooked by the “live” badge that appears in the meeting
- No speaker separation
2. Build a Zoom bot
- Gets a separate audio stream per participant, for perfect diarization / speaker labels
- Doesn’t spook participants
- Very heavyweight: you need to spin up multiple servers to run the Zoom client for the bot
- Running the infrastructure for a Zoom bot costs more than live streaming
- You need to encode the raw video and audio yourself
3. Use Recall.ai
It’s a unified API that lets you send meeting bots to video conferencing platforms to capture the audio and video in real-time.
- Handles spinning up the servers and provides the real-time raw audio, so all you interact with is a simple API
- Gets speaker diarization / speaker labels
- Works regardless of meeting platform
- It’s another third-party service in your stack
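To give a feel for the engineering effort in option 1, here is a minimal sketch of the receiving side in Python. It assumes you already run an RTMP server that Zoom's live stream is pushed to, that `ffmpeg` is installed on the machine, and that your transcription engine accepts 16 kHz mono 16-bit PCM. The function names and the 100 ms chunk length are my own choices for illustration, not part of any Zoom API. The audio you get this way is the mixed meeting audio, so there is still no speaker separation.

```python
import subprocess
from typing import BinaryIO, Iterator, List

SAMPLE_RATE = 16_000   # Hz; assumed input rate of the speech-to-text engine
BYTES_PER_SAMPLE = 2   # 16-bit signed PCM
CHUNK_MS = 100         # assumed chunk length sent to the transcriber


def build_ffmpeg_cmd(rtmp_url: str) -> List[str]:
    """Arguments that make ffmpeg pull an RTMP stream and write raw mono PCM to stdout."""
    return [
        "ffmpeg",
        "-loglevel", "error",
        "-i", rtmp_url,           # stream published on your own RTMP server
        "-vn",                    # drop the video track
        "-ac", "1",               # downmix to mono
        "-ar", str(SAMPLE_RATE),  # resample
        "-f", "s16le",            # raw 16-bit little-endian PCM, no container
        "pipe:1",                 # write to stdout
    ]


def chunk_size_bytes(chunk_ms: int = CHUNK_MS) -> int:
    """Number of bytes that hold chunk_ms milliseconds of audio."""
    return SAMPLE_RATE * BYTES_PER_SAMPLE * chunk_ms // 1000


def pcm_chunks(stream: BinaryIO, size: int) -> Iterator[bytes]:
    """Yield chunks of up to `size` bytes until the stream ends."""
    while True:
        chunk = stream.read(size)
        if not chunk:
            break
        yield chunk


def stream_audio(rtmp_url: str) -> Iterator[bytes]:
    """Run ffmpeg and yield PCM chunks for as long as the live stream lasts."""
    proc = subprocess.Popen(build_ffmpeg_cmd(rtmp_url), stdout=subprocess.PIPE)
    try:
        yield from pcm_chunks(proc.stdout, chunk_size_bytes())
    finally:
        proc.terminate()
        proc.wait()
```

You would then loop over `stream_audio("rtmp://localhost/live/my-meeting")` (a hypothetical URL) and forward each chunk to whichever streaming transcription service you use. Deploying, scaling, and monitoring the RTMP server itself is the part this sketch leaves out.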
Let me know if you have any questions!