Audio stream access from Zoom's SDK

miki · October 31, 2023, 10:38am

Hello, I’m trying to build a live transcription service with the Zoom SDK, and I’m unsure about the best approach.

My first thought was to use the Video SDK to incorporate meetings into a web application since it offers access to the audio stream. However, it seems this access is only available on native platforms.

Does this mean I must use the Video SDK within a Zoom App that runs inside the native Zoom client?

chunsiong.zoom · October 31, 2023, 1:10pm

@miki I’ll state a few assumptions here.

You are trying to create a Zoom App that provides live transcription during a Zoom Meeting?

You will probably need 2 components: a Zoom App and a Zoom Meeting SDK (Bot).

If that is the case, you will probably need a Zoom Meeting SDK (Bot) running on Linux / Windows that joins the meeting. Once it is in the meeting, it will be:

listening to the audio stream
sending the audio stream to a remote server, or processing the audio stream locally
sending the transcribed text to your Zoom App, via a web service or WebSockets
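
The three steps above can be sketched roughly as follows. This is only a minimal illustration, not Meeting SDK code: `on_audio_frame`, `transcribe`, and the `send` callable are hypothetical names, the 16 kHz mono 16-bit PCM format is an assumption, and the real bot would wire its raw audio callback and a WebSocket client into these slots.

```python
import json
import queue

# Assumed audio format: 16 kHz, mono, 16-bit PCM. Adjust to what the bot actually receives.
SAMPLE_RATE = 16000
BYTES_PER_SAMPLE = 2
CHUNK_SECONDS = 1.0
CHUNK_BYTES = int(SAMPLE_RATE * BYTES_PER_SAMPLE * CHUNK_SECONDS)

audio_queue = queue.Queue()


def on_audio_frame(pcm: bytes) -> None:
    """Hypothetical callback: the bot would call this for every raw audio frame."""
    audio_queue.put(pcm)


def transcribe(chunk: bytes) -> str:
    """Placeholder for a real speech-to-text call (local model or remote server)."""
    return f"<{len(chunk)} bytes of audio>"


def run_pipeline(send) -> None:
    """Batch frames into fixed-size chunks, transcribe each, and push JSON via send()."""
    buffer = bytearray()
    while True:
        frame = audio_queue.get()
        if frame is None:  # sentinel: the meeting has ended
            break
        buffer.extend(frame)
        while len(buffer) >= CHUNK_BYTES:
            chunk = bytes(buffer[:CHUNK_BYTES])
            del buffer[:CHUNK_BYTES]
            message = {"type": "transcript", "text": transcribe(chunk)}
            send(json.dumps(message))
```

In practice `run_pipeline` would run on its own thread, and `send` would be whatever pushes a message to the Zoom App (for example, a WebSocket connection's send method).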

amanda-recallai · November 1, 2023, 6:21am

@miki, there are 4 main ways to get the live audio stream from Zoom.

1. Use the Zoom RTMP live-streaming API

Pros:

Doesn’t require any 3rd party services
Lighter weight than building and running a Zoom bot

Cons:

Needs to be initiated on a per-meeting basis
You need to set up an RTMP server to receive the data, which requires engineering effort to deploy, scale, and monitor
Participants can get spooked by the “live” badge that appears in the meeting (even if it’s a private meeting)
No speaker separation
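
For a rough idea of what the receiving side could look like, here is a minimal sketch that pulls the audio out of an RTMP stream with `ffmpeg` and yields raw PCM chunks for a transcriber. It assumes `ffmpeg` is installed and that your RTMP server already exposes the stream at a URL you control. The function names and the 16 kHz mono output format are illustrative choices, not anything Zoom prescribes.

```python
import subprocess


def build_ffmpeg_cmd(rtmp_url: str, sample_rate: int = 16000) -> list[str]:
    """Build an ffmpeg command that decodes an RTMP stream to raw PCM on stdout."""
    return [
        "ffmpeg",
        "-i", rtmp_url,            # input: the stream served by your RTMP server
        "-vn",                     # drop the video track
        "-ac", "1",                # downmix to mono
        "-ar", str(sample_rate),   # resample for the speech-to-text engine
        "-f", "s16le",             # raw signed 16-bit little-endian PCM
        "pipe:1",                  # write to stdout
    ]


def read_audio(rtmp_url: str, chunk_bytes: int = 32000):
    """Yield PCM chunks until the stream ends (32000 bytes is about 1 s at 16 kHz mono)."""
    proc = subprocess.Popen(
        build_ffmpeg_cmd(rtmp_url),
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
    )
    try:
        while True:
            chunk = proc.stdout.read(chunk_bytes)
            if not chunk:
                break
            yield chunk
    finally:
        proc.kill()
        proc.wait()
```

Since the live stream carries a single mixed audio track, the chunks this produces contain all speakers together, which is why this option gives no speaker separation.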

2. Build a desktop app to capture users’ computer audio

Pros:

One of the most cost-effective solutions, since audio processing can be run on-device.

Cons:

You need to build a separate app for Windows, Mac, and Linux
The app runs on the user’s computer, so it can slow the computer down or make its fans spin up
No speaker separation

3. Build a Zoom bot

Pros:

Can get the separate audio streams per participant for perfect diarization / speaker labels

Cons:

It is very heavy-weight as you would need to spin up multiple servers to run the Zoom client for the bot
Running infrastructure for Zoom bot costs more than live streaming.
You need to encode the raw video and audio yourself
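
To give a sense of the “encode it yourself” part for audio, here is a minimal sketch that writes each participant’s raw PCM to its own WAV file using Python’s standard `wave` module. `ParticipantRecorder` and `on_audio` are hypothetical names standing in for whatever per-participant callback your bot exposes, and the 32 kHz mono 16-bit format is an assumption you should check against the raw data your bot actually receives.

```python
import wave

# Assumed raw audio format: 32 kHz, mono, 16-bit PCM. Verify against your bot's raw data.
SAMPLE_RATE = 32000
CHANNELS = 1
SAMPLE_WIDTH_BYTES = 2


class ParticipantRecorder:
    """Keeps one WAV writer per participant so speaker labels come for free."""

    def __init__(self, open_target):
        # open_target: callable mapping a participant id to a file path or file-like object
        self._open_target = open_target
        self._writers = {}

    def on_audio(self, participant_id, pcm: bytes) -> None:
        """Hypothetical per-participant callback: append raw PCM to that speaker's file."""
        writer = self._writers.get(participant_id)
        if writer is None:
            writer = wave.open(self._open_target(participant_id), "wb")
            writer.setnchannels(CHANNELS)
            writer.setsampwidth(SAMPLE_WIDTH_BYTES)
            writer.setframerate(SAMPLE_RATE)
            self._writers[participant_id] = writer
        writer.writeframes(pcm)

    def close(self) -> None:
        """Finalize all WAV headers once the meeting ends."""
        for writer in self._writers.values():
            writer.close()
        self._writers.clear()
```

For example, `ParticipantRecorder(lambda pid: f"{pid}.wav")` would produce one file per speaker. Video is a separate problem and would need a real encoder.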

4. Use Recall.ai

It’s a unified API that lets you send meeting bots to video conferencing platforms to capture the audio, video and transcription in real-time.

Pros:

Handles spinning up the servers and providing the real-time raw audio/transcript, so all you interact with is a simple API.
Works on any Zoom plan (including Free)
Gets speaker diarization / speaker labels
Works regardless of meeting platform

Cons:

It’s another 3rd party service in your stack

Let me know if you have any questions!

bratin.mallick · January 2, 2024, 8:28am

Hi Amanda, thank you for your reply. Can you please elaborate on the third point mentioned?

What kind of a bot is this? Web based/server based?
Is it built using the libraries provided by Zoom, or by making calls to Zoom APIs?
How does it access the audio stream?

Thanks in advance.

vasss · January 22, 2024, 3:41am

Hi @miki ,

I’m in the initial stages of planning a Zoom app primarily for a web interface. The app aims to access live streaming audio and participant details, sending this data to our server via REST or GraphQL API. We plan to use AI tools for generating meeting summaries to automate client business requirements.

Considering our focus on a web app initially, would you recommend starting with the REST API or SDK? If an SDK is preferable, which one would be best for developing a web interface MVP?

I appreciate any suggestions and guidance you can provide as we embark on this project.

Anand VM

system · January 24, 2025, 7:17pm

This topic was automatically closed 368 days after the last reply. New replies are no longer allowed.
