Hey @purnam-admin, happy to help!
The approach we would recommend here, as @donte.zoom mentioned, is to use GStreamer, since manually implementing audio and video synchronisation can be complex.
To give a specific example of how you’d accomplish this, you’d want a pipeline similar to the following (in textual format, which can be executed on the command line with `gst-launch-1.0`):
```
uridecodebin name=dec uri=file:///vid.mp4 ! \
  videoconvert ! videoscale ! videorate ! video/x-raw,format=I420,width=1280,height=720,framerate=(fraction)30/1 ! appsink name=vidsink sync=true \
  dec. ! audioconvert ! audioresample ! audio/x-raw,format=S16LE,rate=16000,channels=1 ! appsink name=audsink sync=true
```
The `uridecodebin` demuxes the MP4 into audio and video streams, and we use `videoconvert`, `videoscale`, and `videorate` to convert the video to the format that the Zoom SDK expects. We do the same with the audio, converting it using `audioconvert` and `audioresample`.
We terminate both branches of the pipeline at an `appsink`, which is a GStreamer element that allows you to extract media from the GStreamer pipeline into your application.
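As a rough sketch (assuming a C++ application), you can build this pipeline with `gst_parse_launch()` and grab the two `appsink`s by the names given in the pipeline string:

```cpp
#include <gst/gst.h>

int main(int argc, char *argv[]) {
  gst_init(&argc, &argv);

  // Parse the same textual pipeline shown above into a GstElement.
  GError *error = nullptr;
  GstElement *pipeline = gst_parse_launch(
      "uridecodebin name=dec uri=file:///vid.mp4 ! "
      "videoconvert ! videoscale ! videorate ! "
      "video/x-raw,format=I420,width=1280,height=720,framerate=(fraction)30/1 ! "
      "appsink name=vidsink sync=true "
      "dec. ! audioconvert ! audioresample ! "
      "audio/x-raw,format=S16LE,rate=16000,channels=1 ! "
      "appsink name=audsink sync=true",
      &error);
  if (!pipeline) {
    g_printerr("Failed to build pipeline: %s\n", error->message);
    g_clear_error(&error);
    return 1;
  }

  // Look up the two appsinks by the names we gave them in the pipeline string.
  GstElement *vidsink = gst_bin_get_by_name(GST_BIN(pipeline), "vidsink");
  GstElement *audsink = gst_bin_get_by_name(GST_BIN(pipeline), "audsink");

  // ... connect the new-sample callbacks (see below), then start the pipeline:
  gst_element_set_state(pipeline, GST_STATE_PLAYING);
  // ... run a GMainLoop and tear everything down when done.
}
```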
In your application, you’d attach to the `new-sample` signal emitted by each `appsink`, and call `audio_sender->send()` or `video_sender->sendVideoFrame()` when new audio or video media is available.
In this arrangement, the GStreamer pipeline maintains an internal clock and handles the synchronisation of the data reaching the `appsink`s, which drives the callbacks to the audio sender and video sender.
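For illustration, a minimal video callback could look like the sketch below. Note that the exact `sendVideoFrame()` signature depends on your SDK version, so the `VideoSender` struct here is just a stand-in for the real sender object your app obtains from the SDK, and `on_video_sample` / `connect_video_sink` are names made up for this example:

```cpp
#include <gst/gst.h>
#include <gst/app/gstappsink.h>

// Placeholder for your Zoom SDK raw video sender; the real type and
// sendVideoFrame() signature come from the Meeting SDK raw-data headers.
struct VideoSender {
  void sendVideoFrame(char *data, int width, int height, int length, int rotation);
};
extern VideoSender *video_sender;

// Called by the appsink each time a new I420 frame is ready.
static GstFlowReturn on_video_sample(GstAppSink *sink, gpointer /*user_data*/) {
  GstSample *sample = gst_app_sink_pull_sample(sink);
  if (!sample)
    return GST_FLOW_ERROR;

  GstBuffer *buffer = gst_sample_get_buffer(sample);
  GstMapInfo map;
  if (gst_buffer_map(buffer, &map, GST_MAP_READ)) {
    // Forward the raw frame; 1280x720 matches the caps we set above, rotation 0.
    video_sender->sendVideoFrame(reinterpret_cast<char *>(map.data),
                                 1280, 720, static_cast<int>(map.size), 0);
    gst_buffer_unmap(buffer, &map);
  }
  gst_sample_unref(sample);
  return GST_FLOW_OK;
}

// During setup: appsink only emits "new-sample" when emit-signals is enabled.
void connect_video_sink(GstElement *vidsink) {
  g_object_set(vidsink, "emit-signals", TRUE, nullptr);
  g_signal_connect(vidsink, "new-sample", G_CALLBACK(on_video_sample), nullptr);
}
```

The audio branch works the same way: connect to `audsink`’s `new-sample` signal and pass the mapped S16LE buffer to `audio_sender->send()`.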
**Alternate Solution**
If you don’t want to deal with managing all of this yourself, an alternative is to use the Recall.ai API for your meeting bots instead.
It’s a simple 3rd-party API that lets you use meeting bots to send raw audio/video into meetings without needing to spend months building, scaling, and maintaining these bots.
Let me know if you have any questions!