How to get millisecond-accurate recording timings from Video SDK?

I am new to Video SDK and doing a tech spike in preparation for a migration away from Twilio Programmable Video (EOL Dec 2024).

We run a SaaS platform for communication analysis and require individual audio recordings for each participant, synchronized to within 100 ms of each other; otherwise our analysis breaks down. We achieved this with Twilio via the compositions feature, which guarantees synchronization of all streams of all participants.

However, in the Zoom Recordings API:

The recording_start and recording_end values in the examples are all accurate to the second, not the millisecond. This means that when we cross-compare points in time across two participant recordings, they are potentially up to 999 ms out of sync.
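To make the concern concrete, here is a minimal sketch of the arithmetic. It assumes the reported timestamps are truncated (not rounded) to whole seconds, which I have not confirmed; the function names and the epoch values are made up for illustration.

```python
def truncate_to_second(ms: int) -> int:
    """Drop the millisecond part of an epoch timestamp given in ms (assumed API behaviour)."""
    return (ms // 1000) * 1000


def alignment_error_ms(start_a_ms: int, start_b_ms: int) -> int:
    """Difference between the true offset of two recordings and the offset
    we would compute from second-precision start times."""
    true_offset = start_b_ms - start_a_ms
    reported_offset = truncate_to_second(start_b_ms) - truncate_to_second(start_a_ms)
    return abs(true_offset - reported_offset)


# Hypothetical true start times: participant B starts 999 ms after participant A,
# but both fall within the same wall-clock second.
a_start = 1_700_000_000_000
b_start = 1_700_000_000_999
worst_case = alignment_error_ms(a_start, b_start)
```

In this case both recordings report the same start second, so the computed offset is zero while the real offset is 999 ms, far outside our 100 ms budget.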

How can Zoom Video SDK provide us with millisecond-level timestamps on individual participant audio?

@brendan.hill, if you are looking for high precision, it might be worth considering Raw Audio / Raw Video access using the Linux SDK.
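As a rough sketch of why raw access helps: if your own code records a millisecond wall-clock time when each participant's first raw audio frame arrives, you can pad the later streams with silence so they all share one origin. Nothing below is a Zoom API; the function name, the participant IDs, and the 32000 Hz sample rate are assumptions for illustration, and timestamps taken on receipt still include network jitter.

```python
def leading_silence_samples(first_frame_ms: dict[str, int],
                            sample_rate_hz: int = 32000) -> dict[str, int]:
    """For each participant, return how many silent samples to prepend so that
    every stream starts at the earliest first-frame time.

    first_frame_ms: hypothetical mapping of participant ID to the epoch time (ms)
    at which your own code received that participant's first raw audio frame.
    sample_rate_hz: assumed PCM sample rate; use whatever your capture reports.
    """
    origin = min(first_frame_ms.values())
    return {
        user_id: (t - origin) * sample_rate_hz // 1000
        for user_id, t in first_frame_ms.items()
    }


# Hypothetical example: bob's audio starts 250 ms after alice's.
padding = leading_silence_samples({"alice": 1000, "bob": 1250})
```

With these made-up values, alice needs no padding and bob needs 8000 samples of silence (250 ms at 32000 Hz).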

I see Linux SDK options for Meetings SDK, but we are using Video SDK.

Our SaaS product will embed Video SDK in the web application, and we retrieve recordings afterwards (requiring millisecond-level accuracy across independent audio streams). Can the Linux SDK help us with this use case?

Hey @brendan.hill,

Yes, here is our Video SDK for Linux: Zoom Video SDK for Linux

Here is a tutorial from @chunsiong.zoom: How to get raw streaming video and audio from Video SDK web sessions using Linux