Live Captioning?

dmyers · August 29, 2022, 6:42pm

Is there a way to get captions in near-real-time so my app can know what is being said during a meeting?

dmyers · September 1, 2022, 7:57pm

I’ve had a hard time getting responses to questions on this forum. It’s kind of frustrating. Why bother having one if you’re not going to monitor it and help people out?

gianni.zoom · September 6, 2022, 10:28pm

Hi @dmyers ,

Thank you for your patience and feedback! We are working to improve response times with respect to other Developer Advocate responsibilities.

You can use the caption API token to get captions in near-real-time for a 3rd party app.

Check out our Postman workspace for the endpoint.

Does this help?

dmyers · September 7, 2022, 1:57pm

Thank you for the response. I’m looking for Zoom to provide my app with text representing what is being said during the meeting. WebEx has this available. Does Zoom?

amanda-recallai · September 17, 2022, 3:02am

@dmyers There are 4 ways you can stream the real-time transcription from Zoom to a 3rd party app.

1. Use the Zoom live-streaming API

Pros:

Doesn’t require any 3rd party services
Lighter weight than building and running a Zoom bot

Cons:

Needs to be initiated on a per-meeting basis
You need to set up an RTMP server to receive the data, which requires engineering effort to deploy, scale, and monitor
Participants can get spooked by the “live” badge that appears in the meeting, depending on the use case
No speaker separation
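
As a rough illustration of the per-meeting setup mentioned above, here is a minimal sketch that builds (but does not send) the two REST calls typically involved: one to point the meeting at an RTMP server and one to start the stream. The endpoint paths and field names reflect my understanding of the Zoom Meetings API and should be checked against the current reference; the function name, RTMP URL, stream key, and page URL are placeholders. The stream carries audio/video, so the receiving server would presumably still need to run speech-to-text on it.

```python
import json
import urllib.request

ZOOM_API_BASE = "https://api.zoom.us/v2"


def build_livestream_requests(meeting_id, access_token, stream_url, stream_key, page_url):
    """Build two unsent PATCH requests: configure the live stream, then start it.

    Assumption: endpoint paths and body fields follow the Zoom Meetings API
    livestream endpoints; verify them against the official docs before use.
    """
    headers = {
        "Authorization": f"Bearer {access_token}",
        "Content-Type": "application/json",
    }
    # Step 1: tell Zoom where to push the RTMP stream for this meeting.
    configure = urllib.request.Request(
        f"{ZOOM_API_BASE}/meetings/{meeting_id}/livestream",
        data=json.dumps(
            {"stream_url": stream_url, "stream_key": stream_key, "page_url": page_url}
        ).encode("utf-8"),
        headers=headers,
        method="PATCH",
    )
    # Step 2: start the stream (this has to be repeated for every meeting).
    start = urllib.request.Request(
        f"{ZOOM_API_BASE}/meetings/{meeting_id}/livestream/status",
        data=json.dumps({"action": "start"}).encode("utf-8"),
        headers=headers,
        method="PATCH",
    )
    return configure, start
```

Each request could then be sent with `urllib.request.urlopen(request)` once a valid OAuth access token and a reachable RTMP server are in place.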

2. Build a desktop app

Pros:

Will work agnostic of meeting platform
Very simple to build

Cons:

No speaker diarization, only one audio stream
Runs on the user’s computer, so any processing slows their computer down and drains their battery
Recording video is especially resource intensive on user computers
Requires the user to install software, which some may be hesitant to do

3. Build a Zoom bot

Pros:

Can get the separate audio streams per participant for perfect diarization / speaker labels
Doesn’t spook participants

Cons:

It is very heavy-weight as you would need to spin up multiple servers to run the Zoom client for the bot.
Running infrastructure for a Zoom bot costs more than live streaming
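
To show why separate per-participant streams make speaker labels straightforward, here is a small sketch that merges already-transcribed segments from each participant into one time-ordered, speaker-labeled transcript. It assumes each participant's audio has already been run through some speech-to-text service; the `Segment` type, the `merge_streams` function, and the input shape are hypothetical and not part of any Zoom SDK.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    speaker: str
    start: float  # seconds since the meeting started
    end: float
    text: str


def merge_streams(streams):
    """Merge per-participant segments into one list ordered by start time.

    `streams` maps a speaker name to a list of (start, end, text) tuples,
    each list already sorted by start time (hypothetical input shape).
    """
    per_speaker = [
        [Segment(speaker, start, end, text) for start, end, text in segments]
        for speaker, segments in streams.items()
    ]
    # Every stream is already sorted, so a k-way merge keeps global time order.
    return list(heapq.merge(*per_speaker, key=lambda seg: seg.start))


streams = {
    "Alice": [(0.0, 2.0, "Hi everyone."), (5.0, 6.0, "Sure, go ahead.")],
    "Bob": [(2.5, 4.0, "Can I share my screen?")],
}
for seg in merge_streams(streams):
    print(f"{seg.speaker}: {seg.text}")
```

Because the speaker is known from the stream each segment came from, no diarization model is needed; the only work is ordering segments by time.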

4. Use Recall.ai

It’s a unified API that lets you send meeting bots to video conferencing platforms (like Zoom) to capture the audio and video in real-time.

Pros:

We handle spinning up the servers and piping the audio to transcription providers, so all you interact with is a simple API.
Gets near-perfect diarization / speaker labels
Supports video capture
Works agnostic of meeting platform

Cons:

It’s another service in your stack

system · September 29, 2022, 4:43am

This topic was automatically closed after 30 days. New replies are no longer allowed.
