Can we get real-time speech-to-text transcription of a meeting from the Zoom API?
Hey @mukhayyo.tashpulatov
Unfortunately, you can't right now.
Cheers,
Elisa
@mukhayyo.tashpulatov, there are 3 ways you could get real-time transcription from Zoom:
1. Use the Zoom live-streaming API and feed the audio stream to a transcription provider
Pros:
- Doesn’t require any 3rd party services
- Lighter weight than building and running a Zoom bot
Cons:
- Needs to be initiated on a per-meeting basis
- You need to set up an RTMP server to receive the data, which requires engineering effort to deploy, scale, and monitor
- Participants can get spooked by the “live” badge that appears in the meeting, depending on the use case
- No speaker separation
2. Build a Zoom bot and feed the audio stream to a transcription provider
Pros:
- Can get a separate audio stream per participant, which gives exact diarization / speaker labels
- Doesn’t spook participants
Cons:
- It is very heavy-weight, as you would need to spin up multiple servers to run the Zoom client for the bot
- Running the infrastructure for a Zoom bot costs more than live streaming
- You need to encode the raw video and audio yourself
3. Use Recall.ai
It’s a unified API that lets you send meeting bots to video conferencing platforms to capture the audio, video, and transcription in real time.
Pros:
- Handles spinning up the servers and delivering the real-time raw audio/transcript, so all you interact with is a simple API
- Gets speaker diarization / speaker labels
- Works regardless of meeting platform
Cons:
- It’s another 3rd party service in your stack
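If you go with option 1 or 2, here is a rough Python sketch of the two pieces you would likely end up writing yourself. It is only an illustration under assumptions, not Zoom's or any provider's actual API. The first part pulls audio out of an RTMP stream with ffmpeg and cuts it into fixed-size PCM chunks. The second part merges per-participant transcript segments (option 2) into one speaker-labelled transcript. The 16 kHz mono 16-bit format, the 100 ms chunk size, the `send_chunk` callback, and the `Segment` type are all hypothetical choices. Check what your transcription provider actually expects.

```python
import subprocess
from dataclasses import dataclass
from typing import Callable, Dict, Iterator, List

SAMPLE_RATE = 16_000   # Hz; assumed input rate for the transcription provider
BYTES_PER_SAMPLE = 2   # 16-bit signed PCM
CHUNK_MS = 100         # assumed chunk duration per send


def ffmpeg_audio_command(rtmp_url: str) -> List[str]:
    """Build an ffmpeg command that drops video and writes raw mono PCM to stdout."""
    return [
        "ffmpeg", "-loglevel", "error",
        "-i", rtmp_url,
        "-vn",                    # discard the video track
        "-ac", "1",               # downmix to mono
        "-ar", str(SAMPLE_RATE),  # resample
        "-f", "s16le",            # raw 16-bit little-endian PCM
        "pipe:1",
    ]


def chunk_size_bytes(chunk_ms: int = CHUNK_MS) -> int:
    """Number of bytes that hold chunk_ms milliseconds of audio."""
    return SAMPLE_RATE * BYTES_PER_SAMPLE * chunk_ms // 1000


def iter_pcm_chunks(stream, size: int) -> Iterator[bytes]:
    """Yield chunks of `size` bytes from a binary stream; the last may be shorter."""
    while True:
        chunk = stream.read(size)
        if not chunk:
            break
        yield chunk


def stream_audio(rtmp_url: str, send_chunk: Callable[[bytes], None]) -> None:
    """Run ffmpeg (must be installed) and pass each chunk to a hypothetical sender."""
    proc = subprocess.Popen(ffmpeg_audio_command(rtmp_url), stdout=subprocess.PIPE)
    try:
        for chunk in iter_pcm_chunks(proc.stdout, chunk_size_bytes()):
            send_chunk(chunk)
    finally:
        proc.kill()
        proc.wait()


@dataclass
class Segment:
    """Hypothetical transcript segment returned for one participant's audio."""
    start: float  # seconds from meeting start
    end: float
    text: str


def merge_by_speaker(per_speaker: Dict[str, List[Segment]]) -> List[str]:
    """Interleave per-participant segments into one transcript ordered by start time."""
    labelled = [
        (seg.start, speaker, seg.text)
        for speaker, segments in per_speaker.items()
        for seg in segments
    ]
    labelled.sort(key=lambda item: item[0])
    return [f"[{start:.1f}s] {speaker}: {text}" for start, speaker, text in labelled]
```

With option 1 you only get the mixed audio, so only the first half applies. The speaker merge only makes sense when the bot gives you one stream per participant.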
Let me know if you have any questions!
Hello Amanda,
I’d like to know your pricing, as I want to use Deepgram. Could you contact me at astowny(at)gmail.com?
Thanks.
Tony