Hello, I’m trying to build a live transcription service with the Zoom SDK, and I’m unsure about the best approach.
My first thought was to use the Video SDK to embed meetings in a web application, since it offers access to the audio stream. However, it seems this access is only available on native platforms.
Does this mean I must use the Video SDK within a Zoom App that runs inside the native Zoom client?
@miki I’ll start by laying out some assumptions.
You are trying to create a Zoom App that provides live transcription during a Zoom Meeting?
You will probably need two components: a Zoom App and a Zoom Meeting SDK bot.
If that is the case, you will probably need a Zoom Meeting SDK bot running on Linux or Windows that joins the meeting. Once it is in the meeting, it will:
listen to the audio stream
send the audio stream to a remote server, or process the audio locally
send the transcribed text to your Zoom App, via a web service or WebSockets (see the relay sketch below).
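For that last step, here is a minimal sketch of a relay server in Node.js using the "ws" package. To be clear, this is not part of any Zoom SDK: the paths (/bot, /app), the port, and the message shape are all assumptions you would adapt to your own design. The bot publishes transcript messages on one socket and every connected Zoom App client receives them on another.

```typescript
// Minimal transcript relay sketch (Node.js + the "ws" package).
// Assumption: the Meeting SDK bot connects to ws://host:8080/bot and pushes
// JSON messages such as { meetingId, speaker, text }; the Zoom App front end
// connects to ws://host:8080/app to receive them. All names are illustrative.
import { WebSocketServer, WebSocket } from "ws";
import { createServer } from "http";

const server = createServer();
const botWss = new WebSocketServer({ noServer: true });
const appWss = new WebSocketServer({ noServer: true });

// Route HTTP upgrade requests by path: bots publish, apps subscribe.
server.on("upgrade", (req, socket, head) => {
  const wss = req.url === "/bot" ? botWss : req.url === "/app" ? appWss : null;
  if (!wss) {
    socket.destroy();
    return;
  }
  wss.handleUpgrade(req, socket, head, (ws) => wss.emit("connection", ws, req));
});

// Fan each transcript message from the bot out to every connected app client.
botWss.on("connection", (bot) => {
  bot.on("message", (data) => {
    for (const client of appWss.clients) {
      if (client.readyState === WebSocket.OPEN) client.send(data.toString());
    }
  });
});

server.listen(8080, () => console.log("transcript relay listening on :8080"));
```

Whether you relay raw JSON like this or batch/persist transcripts first depends on how much latency your Zoom App UI can tolerate.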
I’m in the initial stages of planning a Zoom app aimed primarily at a web interface. The app needs to access live streaming audio and participant details and send this data to our server via a REST or GraphQL API. We plan to use AI tools to generate meeting summaries that satisfy our clients' business requirements.
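For the "send to our server" leg, I imagine something like the sketch below from the web app; the endpoint URL and payload shape are placeholders I made up, not anything from the Zoom APIs.

```typescript
// Hypothetical payload and endpoint; adjust to whatever your API defines.
interface ParticipantUpdate {
  meetingId: string;
  participants: { id: string; name: string; joinedAt: string }[];
}

async function pushParticipants(update: ParticipantUpdate): Promise<void> {
  const res = await fetch("https://api.example.com/meetings/participants", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(update),
  });
  if (!res.ok) throw new Error(`Upload failed: ${res.status}`);
}
```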
Given our initial focus on a web app, would you recommend starting with the REST API or an SDK? If an SDK is preferable, which one would be best for building a web-interface MVP?
I appreciate any suggestions and guidance you can provide as we embark on this project.