I want to apply my AI-based solution to find problems that occur during calls

I am creating a web-based AI solution. For that, I need real-time audio data for transcription. Is it possible to get a real-time transcript that I can use to implement my solution?

Hi, @bajrang.wappnet,

Thank you for posting in the Zoom Developer Forum! Just to make sure I understand correctly, which SDK are you using for your web application? Are you using the Meeting SDK or the Video SDK?

I see that you’re interested in getting real-time audio data. Currently, the Video Web SDK doesn’t provide direct access to raw data from Video sessions. However, many developers have found success by using the Native Video SDK alongside the Web SDK. This combination gives them direct access to raw data, including audio and video.

Alternatively, you can also consider building a bot that joins the session and records or extracts the real-time audio data. That way, you’ll have the audio data you need for your application.

Let me know if you have any further questions!

Hi, @donte.zoom,
Thank you for your feedback.

Are you using the Meeting SDK or the Video SDK?

I am currently using Video Web SDK.

However, many developers have found success by using the Native Video SDK alongside the Web SDK. 

Can you please provide some details about this approach? It would be very helpful.

Alternatively, you can also consider building a bot that joins the session and records or extracts the real-time audio data. 

Does Zoom provide this service, or do we need to use a third party to integrate this with my website so that I can get a transcript of the audio?

Looking forward to your response.
Thanks.

Yes, basically you would leverage one of the Native SDKs to get direct access to raw data. Then, you can extract the raw data and process it with your AI to implement your solution. Down below I’ve linked our support documentation on raw data for some of our native platforms.

Alternatively, you can also consider building a bot that joins the session and records or extracts the real-time audio data. 

Yes, a meeting bot is an application that connects to a meeting and accesses the meeting’s audio and video content, or an app that streams media into a meeting. In your case, if you want real-time transcripts, you can build a bot that joins the session, records, and processes the data in real time.
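Whether the raw audio comes from a Native SDK or from a bot, the "process in real time" step often starts by grouping incoming audio frames into fixed-length chunks that a transcription service can accept. The following is only a minimal sketch of that buffering step, assuming the audio arrives as 16-bit mono PCM frames. `PcmChunker` and its parameters are hypothetical names for illustration and are not part of any Zoom SDK.

```typescript
// Hypothetical helper (not a Zoom API): regroups raw 16-bit PCM frames of
// arbitrary size into fixed-duration chunks for a transcription service.
class PcmChunker {
  private pending: Int16Array[] = [];
  private pendingSamples = 0;
  private readonly samplesPerChunk: number;

  constructor(sampleRateHz: number, chunkMs: number) {
    this.samplesPerChunk = Math.floor((sampleRateHz * chunkMs) / 1000);
    if (this.samplesPerChunk <= 0) {
      throw new Error("chunk duration is too short for this sample rate");
    }
  }

  // Feed one frame; returns every chunk that became complete.
  push(frame: Int16Array): Int16Array[] {
    this.pending.push(frame);
    this.pendingSamples += frame.length;
    const chunks: Int16Array[] = [];
    while (this.pendingSamples >= this.samplesPerChunk) {
      chunks.push(this.take(this.samplesPerChunk));
    }
    return chunks;
  }

  // Copy exactly `count` samples out of the queued frames.
  private take(count: number): Int16Array {
    const chunk = new Int16Array(count);
    let filled = 0;
    while (filled < count) {
      const head = this.pending[0];
      const need = count - filled;
      if (head.length <= need) {
        chunk.set(head, filled);
        filled += head.length;
        this.pending.shift();
      } else {
        chunk.set(head.subarray(0, need), filled);
        this.pending[0] = head.subarray(need); // keep the unused tail
        filled += need;
      }
    }
    this.pendingSamples -= count;
    return chunk;
  }
}
```

Each returned chunk could then be sent to whichever speech-to-text service you choose. The sample rate and chunk duration depend on that service and on the format your raw data source actually delivers.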

Alternatively, you can use the Web SDK’s live transcription and translation feature. With it, you can receive speech as JSON objects in real time. The feature can also translate speech in one language into text in another language in real time. For more information, please refer to our documentation on this topic:

Live transcription and translation
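
As a rough illustration of consuming those JSON objects, here is a minimal sketch that collects caption messages into a transcript you could pass to your AI. The `CaptionMessage` fields (`msgId`, `displayName`, `text`, `done`) and the `caption-message` event wiring in the trailing comment are assumptions about the payload shape, so please check them against the documentation above. `TranscriptCollector` is a hypothetical helper, not part of the SDK.

```typescript
// Assumed shape of one caption payload; verify field names in the docs.
interface CaptionMessage {
  msgId: string;       // assumed: stable id, repeated as the text is refined
  displayName: string; // assumed: name of the speaker
  text: string;        // assumed: transcribed text so far
  done: boolean;       // assumed: true once the utterance is final
}

// Hypothetical helper: keeps the latest version of each utterance and
// returns the finalized lines in the order they were first seen.
class TranscriptCollector {
  private readonly latest = new Map<string, CaptionMessage>();
  private readonly order: string[] = [];

  handle(msg: CaptionMessage): void {
    if (!this.latest.has(msg.msgId)) {
      this.order.push(msg.msgId);
    }
    this.latest.set(msg.msgId, msg); // later updates replace earlier ones
  }

  finalized(): string[] {
    return this.order
      .map((id) => this.latest.get(id))
      .filter((m): m is CaptionMessage => m !== undefined && m.done)
      .map((m) => `${m.displayName}: ${m.text}`);
  }
}

// Possible wiring, assuming a Video SDK `client` that emits caption events:
//   const collector = new TranscriptCollector();
//   client.on("caption-message", (payload) => collector.handle(payload));
```

Only finalized utterances are returned, so partial captions that are still being refined do not reach your downstream processing twice.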

Let me know if this helps!
