Is there any transcription API that I can use to get the closed captioning for development within the Zoom app?
Hi @alexpandian7ks
Thanks for reaching out to us and welcome to our community!
I am happy to help here!
Unfortunately, we do not have a transcription API available at this time.
@alexpandian7ks, like @elisa.zoom mentioned, there isn’t a transcription API available at the moment.
There is a workaround though - you can capture the audio data, and run that through a third-party transcription provider on your end.
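To make the workaround a bit more concrete, here is a minimal sketch of the "run it through a provider" half. It assumes you already have raw 16-bit mono PCM at 16 kHz (how you capture that from Zoom depends on the option you choose). `send_chunk` is a hypothetical stand-in for whatever your transcription provider's SDK exposes, not a real API.

```python
from typing import Callable, Iterator, List, Optional


def chunk_pcm(pcm: bytes, sample_rate: int = 16000, sample_width: int = 2,
              channels: int = 1, chunk_ms: int = 100) -> Iterator[bytes]:
    """Yield fixed-duration chunks of raw PCM, aligned to whole audio frames."""
    frames_per_chunk = sample_rate * chunk_ms // 1000
    chunk_size = frames_per_chunk * sample_width * channels
    for start in range(0, len(pcm), chunk_size):
        yield pcm[start:start + chunk_size]


def transcribe_stream(pcm: bytes,
                      send_chunk: Callable[[bytes], Optional[str]]) -> List[str]:
    """Feed chunks to a caller-supplied function and collect any text it returns.

    `send_chunk` is hypothetical: wrap your provider's streaming call in it.
    """
    results = []
    for chunk in chunk_pcm(pcm):
        text = send_chunk(chunk)
        if text:
            results.append(text)
    return results
```

Most streaming providers expect small, regular chunks like this rather than one large upload, but check your provider's docs for the chunk size and audio format it wants.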
Here are 4 other ways you could explore to create a real-time transcript from a Zoom meeting.
1. Use the Zoom RTMP live-streaming API
Pros:
- Doesn’t require any 3rd party services
- Lighter weight than building and running a Zoom bot
Cons:
- Needs to be initiated on a per-meeting basis
- You need to set up an RTMP server to receive the data, which requires engineering effort to deploy, scale, and monitor
- Participants can get spooked by the “live” badge that appears in the meeting (even if it’s a private meeting)
- No speaker separation
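For option 1, the per-meeting setup could look roughly like the sketch below. It only builds the two HTTP requests and does not send them. The endpoint paths and body fields (`stream_url`, `stream_key`, `page_url`, `action`, `settings`) reflect my understanding of the Zoom meeting live stream endpoints, so treat them as assumptions and verify them against the current Zoom API reference. The function name and all argument values are made up for illustration.

```python
import json
import urllib.request

ZOOM_API_BASE = "https://api.zoom.us/v2"  # assumed base URL


def build_livestream_requests(meeting_id: str, access_token: str,
                              stream_url: str, stream_key: str, page_url: str):
    """Build (but do not send) requests to configure and start a live stream."""
    headers = {
        "Authorization": f"Bearer {access_token}",
        "Content-Type": "application/json",
    }
    # Step 1: point the meeting at your own RTMP server (assumed field names).
    configure = urllib.request.Request(
        f"{ZOOM_API_BASE}/meetings/{meeting_id}/livestream",
        data=json.dumps({
            "stream_url": stream_url,
            "stream_key": stream_key,
            "page_url": page_url,
        }).encode("utf-8"),
        headers=headers,
        method="PATCH",
    )
    # Step 2: start streaming (assumed field names; the meeting must be running).
    start = urllib.request.Request(
        f"{ZOOM_API_BASE}/meetings/{meeting_id}/livestream/status",
        data=json.dumps({
            "action": "start",
            "settings": {"active_speaker_name": True, "display_name": "Transcript"},
        }).encode("utf-8"),
        headers=headers,
        method="PATCH",
    )
    return configure, start
```

You would send each request with `urllib.request.urlopen(...)` once the meeting is live, then pull the audio out of the RTMP stream on your server and pass it to your transcription provider.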
2. Build a desktop app to capture users’ computer audio
Pros:
- One of the most cost-effective solutions
Cons:
- You need to build a separate app for Windows, Mac and Linux
- It is especially difficult to tap into computer audio on Mac
- The app runs on the user’s computer, so it can slow the machine down or make its fans spin up
- No speaker separation
3. Build a Zoom bot
Pros:
- Can get the separate audio streams per participant for perfect diarization / speaker labels
Cons:
- It is very heavyweight, as you need to spin up multiple servers to run the Zoom client for the bot
- Running the infrastructure for a Zoom bot costs more than live streaming
- You need to encode the raw video and audio yourself
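The reason separate per-participant streams give you clean speaker labels is that each stream can be transcribed on its own and then merged by timestamp, so no diarization model has to guess who spoke. A minimal sketch of that merge step, assuming you already have timestamped segments per participant (the `Segment` type and `merge_transcripts` function are illustrative names, not part of any Zoom SDK):

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Segment:
    start: float  # seconds from the start of the meeting
    end: float
    text: str


def merge_transcripts(per_speaker: Dict[str, List[Segment]]) -> List[str]:
    """Interleave per-participant segments into one speaker-labeled transcript."""
    labeled = [
        (seg.start, speaker, seg.text)
        for speaker, segments in per_speaker.items()
        for seg in segments
    ]
    # Order by start time only; ties keep their original insertion order.
    labeled.sort(key=lambda item: item[0])
    return [f"[{start:.1f}s] {speaker}: {text}" for start, speaker, text in labeled]
```

With the RTMP or desktop-capture options you only get one mixed audio track, so this kind of merge is not possible there without a separate diarization step.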
4. Use Recall.ai
It’s a unified API that lets you send meeting bots to video conferencing platforms to capture the audio, video, and transcription in real time.
Pros:
- Handles spinning up the servers and providing the real-time raw audio/transcript, so all you interact with is a simple API
- Works on any Zoom plan (including Free)
- Gets speaker diarization / speaker labels
- Works regardless of meeting platform
Cons:
- It’s another 3rd party service in your stack
Let me know if you have any questions!