Do I have access to raw audio data using Web Video SDK?

What I want to do:

I want to analyze the audio of video calls. Specifically, I want to transcribe the audio into text and run natural language processing on it.
Also, I want to be able to know when & how much each person is talking.


Can I do what I want to do using Video SDK?

  • get raw audio
  • perform analytics such as transcription on the audio
  • know when & how much each person in the call is talking

Details as to why I decided to ask this question

I got concerned when I read the comparison of Video SDKs and Client SDKs, which says the following:

“Raw Video / Audio Data Available in Android, iOS, Windows, macOS”

Does this mean I do NOT have access to the raw video/audio data when using Web SDK?

It looks like I have access to the “stream”, but doesn’t that mean I have access to the audio?

I am not an expert in web development. Answers that do not assume an understanding of advanced web development topics would be very helpful.

Thank you so much in advance!!

Hey @hide , happy to help! :slight_smile:

The audio on Web is a single audio stream. It is not “raw” audio in the sense of a stream that you can modify. Does that clear things up?


Hi @tommy thanks for your reply!

So by raw I actually meant real-time audio data. That is, I want to do something with the audio data in real time during the meeting (for example, transcribe the conversation in real time).

Can I do that with Web SDK? Or is it only possible with Desktop SDK? Thanks!

(Also, if it is only possible with the Desktop SDK, can I do it with Angular/React for the frontend and Django for the backend, rather than Objective-C or Swift? I am asking because the examples for the macOS SDK only come in Objective-C and Swift. Thanks!)

Hey @hide ,

It cannot be done with the Web Video SDK.

As for the Desktop SDKs, please ask in #desktop-video-sdk :slight_smile:

