Fetching individual video and audio streams from a Zoom call

I would like to fetch individual live video and audio streams (10-15 mins) from a Zoom meeting and run ML object recognition on them.
Is it possible to get the audio/video streams directly on the client side (frontend app), or will I have to set up a server/application to fetch the data from the Zoom server via the SDK?
What kind of preprocessing will I have to do after getting the streams?
Will the stream data I receive be in a ready-to-use format?

@vimal.pillai you will typically need the Video SDK running on a client platform; client here refers to Windows, macOS or Linux.

From the Video SDK, you subscribe to the raw audio and raw video streams, convert them to your preferred format, and use them with your ML models.
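As a rough illustration of the audio half of that conversion step, here is a minimal sketch. It assumes the raw audio callback hands you 16-bit signed little-endian PCM bytes, which you should verify against what your SDK actually delivers. The function name `pcm16le_to_float` is hypothetical and not part of any Zoom SDK. Many audio models expect normalized floats, so the sketch scales each sample into the range [-1.0, 1.0).

```python
import struct
from typing import List


def pcm16le_to_float(pcm_bytes: bytes) -> List[float]:
    """Convert 16-bit signed little-endian PCM bytes to floats in [-1.0, 1.0).

    Assumption: one sample is exactly 2 bytes; channel layout is not handled here.
    """
    if len(pcm_bytes) % 2 != 0:
        raise ValueError("PCM16 buffer length must be a multiple of 2 bytes")
    count = len(pcm_bytes) // 2
    # "<" = little-endian, "h" = signed 16-bit integer
    samples = struct.unpack("<%dh" % count, pcm_bytes)
    # Divide by 32768 so the most negative sample maps to exactly -1.0
    return [s / 32768.0 for s in samples]


if __name__ == "__main__":
    demo = struct.pack("<3h", 0, 16384, -32768)
    print(pcm16le_to_float(demo))
```

In a real app you would call something like this from inside the raw audio callback, then batch the floats into whatever window size your model expects.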

Hi @chunsiong.zoom,

Thank you for the quick response.
Sorry for not explaining clearly at first. Let me clarify: I was hoping to build a mobile application that has group video calling as one of its features, and we were hoping to use the Zoom SDK to provide this call service. We want to run an AI moderation service on the group video call for user safety. Can I subscribe to the streams directly using the Zoom SDK? Will I still have to do the media conversion myself, or does the Zoom SDK provide any such feature directly?

@vimal.pillai you can, but it is not recommended to do so on a mobile phone.

The mobile phone will first need to convert the raw PCM audio and raw YUV420 video into a format your ML model can use, and only then run the model. You will need to do the conversion yourself.
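To give a feel for what the video conversion involves, here is a minimal sketch of turning one YUV420 frame into RGB pixels. It makes several assumptions that you would need to check against the SDK: the frame is a single contiguous I420 buffer (full Y plane, then quarter-size U plane, then quarter-size V plane), width and height are even, and the colors use full-range BT.601 coefficients. The name `i420_to_rgb` is hypothetical. This is pure Python for readability only; on a device you would want a native or GPU-accelerated path.

```python
from typing import List, Tuple


def _clamp(value: float) -> int:
    """Round to the nearest integer and clamp into the 0..255 byte range."""
    return max(0, min(255, int(round(value))))


def i420_to_rgb(frame: bytes, width: int, height: int) -> List[Tuple[int, int, int]]:
    """Convert a contiguous I420 frame to a row-major list of (R, G, B) tuples.

    Assumptions: planar Y, U, V order; even dimensions; full-range BT.601.
    """
    if width % 2 or height % 2:
        raise ValueError("this sketch assumes even width and height")
    y_size = width * height
    c_width = width // 2
    c_size = c_width * (height // 2)
    if len(frame) != y_size + 2 * c_size:
        raise ValueError("buffer size does not match an I420 frame")
    u_off = y_size            # U plane starts right after the Y plane
    v_off = y_size + c_size   # V plane starts right after the U plane
    pixels = []
    for row in range(height):
        for col in range(width):
            y = frame[row * width + col]
            # Each U/V sample is shared by a 2x2 block of pixels
            c_idx = (row // 2) * c_width + (col // 2)
            u = frame[u_off + c_idx] - 128
            v = frame[v_off + c_idx] - 128
            pixels.append((
                _clamp(y + 1.402 * v),
                _clamp(y - 0.344136 * u - 0.714136 * v),
                _clamp(y + 1.772 * u),
            ))
    return pixels


if __name__ == "__main__":
    # 2x2 frame: four Y samples, then one neutral U and one neutral V sample
    gray = bytes([0, 128, 200, 255]) + bytes([128]) + bytes([128])
    print(i420_to_rgb(gray, 2, 2))
```

Most vision models then want the pixels resized and normalized, which is a further step on top of this conversion.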

You can always try it, but it might be too heavy for a mobile CPU/GPU/APU to process in real time.
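To put the real-time concern in numbers, here is a back-of-the-envelope sketch of how much raw data a single participant's streams produce. The resolution, frame rate, and sample rate used in the example are illustrative assumptions, not values confirmed by the SDK, and both function names are hypothetical.

```python
def i420_bytes_per_second(width: int, height: int, fps: int) -> int:
    """Raw I420 video rate: 1 byte of Y per pixel plus two quarter-size chroma planes."""
    bytes_per_frame = width * height * 3 // 2  # 1.5 bytes per pixel
    return bytes_per_frame * fps


def pcm_bytes_per_second(sample_rate: int, channels: int, bytes_per_sample: int) -> int:
    """Raw PCM audio rate."""
    return sample_rate * channels * bytes_per_sample


if __name__ == "__main__":
    # Assumed example values: 640x360 video at 30 fps, 32 kHz mono 16-bit audio
    video = i420_bytes_per_second(640, 360, 30)
    audio = pcm_bytes_per_second(32000, 1, 2)
    print(video, audio)
```

Under those assumed settings, one video stream is roughly 10 MB of raw pixels per second before any conversion or inference, and that multiplies with every participant you subscribe to in a group call. Sampling only a few frames per second for moderation is one way to cut the load.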

Yes, the mobile SDK has raw data access callbacks.