Hey @Venkat_Koushik, you could be seeing A/V drift and speed swings for several reasons:
- Early frames are using wall time or callback order instead of SDK timestamps
- Video pacing isn’t tied to a master clock, so the first seconds can run ~1.5× before settling
- PTS origin varies per stream, so audio and video start at different zeros and then drift
There are a few things you could try to sync A/V better though:
- Align all timings from SDK timestamps, not arrival time or system clock, using AudioRawData::GetTimeStamp and the Linux raw data callbacks
- Buffer ~150–300 ms per stream as a jitter buffer, pick audio as master, normalize each track to t=0, and pace the video by drop/dup to your target FPS
- If capture rate drifts, resample audio to the master clock and keep video aligned; Zoom’s guidance is to timestamp each audio and video frame and ensure they are played back in sync
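To make the normalize and drop/dup steps concrete, here is a rough sketch in Python. It is not Zoom SDK code: `Frame`, `normalize`, and `pace_video` are hypothetical names, and it assumes you have already copied each frame's SDK timestamp (in milliseconds) out of the raw data callbacks and that the first buffered frame of each track corresponds to roughly the same instant.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Frame:
    pts_ms: int      # timestamp taken from the SDK (assumed to be milliseconds)
    payload: bytes   # raw frame data, opaque to this sketch


def normalize(frames: List[Frame]) -> List[Frame]:
    """Shift a track so its first frame sits at t=0."""
    if not frames:
        return []
    origin = frames[0].pts_ms
    return [Frame(f.pts_ms - origin, f.payload) for f in frames]


def pace_video(video: List[Frame], fps: float, duration_ms: int) -> List[Frame]:
    """Emit one frame per output tick at a fixed FPS.

    For each tick, take the newest source frame with pts <= tick.
    Frames skipped over are dropped; if nothing new arrived, the
    previous frame is repeated (duplicated).
    """
    out: List[Frame] = []
    idx = 0
    last: Optional[Frame] = None
    n = 0
    while True:
        tick_ms = round(n * 1000 / fps)
        if tick_ms >= duration_ms:
            break
        # Advance to the newest frame that is due at this tick.
        while idx < len(video) and video[idx].pts_ms <= tick_ms:
            last = video[idx]
            idx += 1
        if last is not None:
            out.append(Frame(tick_ms, last.payload))
        n += 1
    return out


# Example: bursty input at SDK times 1000, 1020, 1040, 1200 ms, paced to 10 FPS.
raw = [Frame(1000, b"a"), Frame(1020, b"b"), Frame(1040, b"c"), Frame(1200, b"d")]
paced = pace_video(normalize(raw), fps=10, duration_ms=300)
```

In a real pipeline `duration_ms` would come from the audio track, since audio is the master, and you would run this over the jitter buffer contents rather than a complete list.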
On auto-starting recording for internal or external meetings: you can fetch the meeting's join token for local recording so your bot starts recording automatically once it enters the call. This generally works for meetings owned by the authenticated user/app.
If you’d rather not build and maintain the buffering and sync layer, teams often use Recall.ai’s meeting bot API to pull real-time Zoom audio, video, and transcripts and offload multi-participant timing and layout orchestration.