What I want to do:
I want to analyze the audio of video calls. Specifically, I want to transcribe the audio into text and run natural language processing on the transcript.
I also want to know when and for how long each person is talking.
Can I do the following using the Video SDK?
- get raw audio
- perform analytics such as transcription on the audio
- know when & how much each person in the call is talking
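
To make the last item concrete, this is roughly the kind of result I am hoping to compute. It is only a sketch in Python, and it assumes I can somehow obtain a list of (speaker, start, end) speaking segments in seconds. The `segments` data and the `talk_time` function are made up for illustration and are not part of any Zoom SDK.

```python
from collections import defaultdict


def talk_time(segments):
    """Sum the speaking duration in seconds for each speaker.

    `segments` is a list of (speaker, start, end) tuples, with times in seconds.
    """
    totals = defaultdict(float)
    for speaker, start, end in segments:
        if end < start:
            raise ValueError(f"segment for {speaker!r} ends before it starts")
        totals[speaker] += end - start
    return dict(totals)


# Hypothetical data: who spoke, and from when to when (seconds into the call).
segments = [
    ("alice", 0.0, 12.5),
    ("bob", 12.5, 20.0),
    ("alice", 21.0, 30.0),
]

print(talk_time(segments))
```

What I do not know is whether the Web SDK can give me the audio or speaker information needed to build such segments in the first place.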
Why I am asking this question:
I got concerned when I read the comparison of the Video SDKs and Client SDKs (https://marketplace.zoom.us/docs/sdk/video/compare), which says the following:
“Raw Video / Audio Data Available in Android, iOS, Windows, macOS”
Does this mean I do NOT have access to the raw video/audio data when using the Web SDK?
It looks like I have access to the “stream”, but doesn’t that mean I have access to the audio?
I am not an expert in web development, so answers that do not assume an understanding of advanced web development topics would be very helpful.
Thank you so much in advance!!