Hello, I am working on a multimodal learning analytics system, and I need to capture real-time video and audio of each participant in a Zoom meeting. I will then send this data to an external server, where it will be analyzed to obtain metrics of each participant's verbal and non-verbal communication.
My question is:
Is this possible? If so, where should I start?
Thank you very much for your help.