I’ve been working on getting the live stream data from a Zoom meeting, and sending it over RTMP to my media server works quite well.
But as I’m rather new to streaming, I’m a bit confused about exactly what data is sent and in which format. I’m only interested in the audio, so I simply drop the video data. I’d like to know the following:
- What is the audio codec? (I believe it’s AAC but just want to make sure)
- What is the sample rate? (44.1 kHz?)
- How many channels are sent? (I’m guessing only one, with all participants’ audio mixed together? If so, is there an easy way to figure out who is talking at what time, e.g. with the active speaker API?)
- Is there some kind of inherent latency? I notice an offset of a few seconds, but I’m not sure if it’s my setup.
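
In case it helps with answering, this is roughly how I would try to check the codec, sample rate and channel count on my side. As far as I understand, an RTMP audio message payload has the same layout as an FLV audio tag: the first byte names the codec, and for AAC the sequence header carries an AudioSpecificConfig with the real sample rate and channel configuration. Below is a minimal Python sketch under those assumptions. The name `parse_audio_payload` is my own (hypothetical), it assumes I can already get the raw audio message payload from my media server, and it does not handle the AAC escape values for extended object types or explicit sample rates. Nothing in it is specific to Zoom.

```python
# Sketch: decode the header bytes of an RTMP/FLV audio message payload.
# Assumption: `payload` is the raw body of one RTMP audio message.

FLV_SOUND_FORMATS = {
    0: "Linear PCM", 1: "ADPCM", 2: "MP3", 3: "Linear PCM (little endian)",
    7: "G.711 A-law", 8: "G.711 mu-law", 10: "AAC", 11: "Speex",
}
FLV_SOUND_RATES = [5512, 11025, 22050, 44100]

# Indexed by the 4-bit samplingFrequencyIndex of the AudioSpecificConfig.
AAC_SAMPLE_RATES = [96000, 88200, 64000, 48000, 44100, 32000,
                    24000, 22050, 16000, 12000, 11025, 8000, 7350]

# channelConfiguration -> number of channels (0 means "described elsewhere").
AAC_CHANNELS = {1: 1, 2: 2, 3: 3, 4: 4, 5: 5, 6: 6, 7: 8}


def parse_audio_payload(payload: bytes) -> dict:
    """Return codec, sample rate and channel count as far as the header tells."""
    if not payload:
        raise ValueError("empty audio payload")

    first = payload[0]
    sound_format = first >> 4  # upper 4 bits: codec id
    info = {
        "codec": FLV_SOUND_FORMATS.get(sound_format, f"unknown ({sound_format})"),
        "sample_rate": None,
        "channels": None,
    }

    if sound_format != 10:
        # Non-AAC codecs: rate and mono/stereo flag live in the first byte.
        info["sample_rate"] = FLV_SOUND_RATES[(first >> 2) & 0x3]
        info["channels"] = 2 if first & 0x1 else 1
        return info

    # AAC: byte 1 is the packet type (0 = sequence header, 1 = raw frame).
    # Only the sequence header carries the AudioSpecificConfig.
    if len(payload) < 4 or payload[1] != 0:
        return info

    asc = int.from_bytes(payload[2:4], "big")
    freq_index = (asc >> 7) & 0xF      # 4 bits after the 5-bit object type
    channel_config = (asc >> 3) & 0xF  # next 4 bits

    info["aac_object_type"] = asc >> 11  # 2 = AAC-LC
    if freq_index < len(AAC_SAMPLE_RATES):
        info["sample_rate"] = AAC_SAMPLE_RATES[freq_index]
    info["channels"] = AAC_CHANNELS.get(channel_config)
    return info


if __name__ == "__main__":
    # Hand-built example bytes: an AAC sequence header for AAC-LC, 44.1 kHz, mono.
    print(parse_audio_payload(bytes([0xAF, 0x00, 0x12, 0x08])))
```

The bytes in the usage line are a hand-built example, not something captured from a Zoom stream. For AAC the rate and stereo bits in the first byte are fixed values, which is why the sketch reads the sequence header instead.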
Thanks a lot for the great developer documentation!