Lengths of recorded audio and video are not the same, and they are out of sync when merged together

Hello,

I have successfully implemented the Zoom SDK headless bot for Linux using the sample code, which allows me to join a meeting, record, and store the video and audio as raw YUV and PCM data respectively.

However, I found that the lengths of the video and audio are not the same (see attached).

Here is the ffmpeg command I used to merge the audio and video:
ffmpeg -f rawvideo -pix_fmt yuv420p -s:v 640x360 -r 25 -i meeting-video.yuv -f s16le -ar 32000 -ac 1 -i meeting-audio.pcm -c:v copy -c:a pcm_s16le -map 0:v:0 -map 1:a:0 video-audio-output.mkv

input video size: 640x360
input frame rate: 25 fps
audio sample rate: 32 kHz
channels: 1
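A note on the command above: re-encoding both streams and adding -shortest at least forces the container durations to agree (storing raw video with -c:v copy also produces a very large file), though it cannot repair drift inside the file. A variant worth trying, as a sketch only:

ffmpeg -f rawvideo -pix_fmt yuv420p -s:v 640x360 -r 25 -i meeting-video.yuv \
    -f s16le -ar 32000 -ac 1 -i meeting-audio.pcm \
    -c:v libx264 -c:a aac -map 0:v:0 -map 1:a:0 -shortest video-audio-output.mkv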

Possible problem: the writing rates are not the same (see attached; the writing is uneven while recording audio and video), or the video has a variable frame rate.

Do you know how to fix this issue, or what might have caused it?

@chriswuyiming these demos show the capabilities of the SDK and how to access the raw audio and raw video streams. For instance, in this case, they show how to get access to the YUV420p video frames and the PCM audio.

There are further optimisations which need to be done, and they are currently not in this demo application.

I did some quick testing, and it seems that if the video frames are encoded into MKV using ffmpeg at runtime, you should not have this issue of different lengths of video and audio.
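One command-line way to approximate that runtime encoding, assuming the bot writes each raw frame and audio chunk into named pipes as the SDK delivers them (a sketch, not necessarily how it was done in this test):

mkfifo /tmp/meeting-video.fifo /tmp/meeting-audio.fifo
ffmpeg -f rawvideo -pix_fmt yuv420p -s:v 640x360 -r 25 -i /tmp/meeting-video.fifo \
    -f s16le -ar 32000 -ac 1 -i /tmp/meeting-audio.fifo \
    -c:v libx264 -c:a aac meeting-output.mkv

Note that ffmpeg opens its inputs in order, so both FIFOs need active writers or the process will block at startup.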

It is also necessary to apply a slight offset between the audio and video, as the start times of saving the audio and video files might differ at runtime.
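If you can measure that start-time difference, ffmpeg's -itsoffset flag can compensate for it at merge time. In this sketch, 0.3 is a hypothetical measured delay in seconds, applied to the audio input that follows the flag:

ffmpeg -f rawvideo -pix_fmt yuv420p -s:v 640x360 -r 25 -i meeting-video.yuv \
    -itsoffset 0.3 -f s16le -ar 32000 -ac 1 -i meeting-audio.pcm \
    -c:v libx264 -c:a aac -map 0:v:0 -map 1:a:0 video-audio-output.mkv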

Thanks for replying. Encoding in real time while recording is definitely one of the ways to solve this issue; both FFmpeg and GStreamer are capable of handling real-time encoding tasks. I am wondering what ffmpeg command you used at runtime: did you encode the video frames into MKV from YUV, or did you skip the YUV format entirely (going directly from the raw data to MKV)?

@chriswuyiming, we’ve seen GStreamer be more effective when dealing with more complicated real-time audio/video pipelines, especially those that require dynamic reconfiguration at runtime.

You’d be able to accomplish this with GStreamer using a pipeline containing: two appsrc elements to ingest the raw audio and video; videorate and videoscale to normalize the frame rate and size of the incoming video; audiorate to normalize the sample rate of the audio; x264enc and voaacenc to encode the video (H.264) and audio (AAC) respectively; followed by an mp4mux and a filesink to mux the audio and video and write them to a file.
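For illustration, here is a rough offline gst-launch-1.0 equivalent of that pipeline, as a sketch only: it substitutes filesrc plus the raw parsers for the two appsrc elements (appsrc is only usable from application code that pushes buffers from the SDK callbacks), and it assumes a GStreamer build that ships rawvideoparse, rawaudioparse and voaacenc:

gst-launch-1.0 -e mp4mux name=mux ! filesink location=meeting.mp4 \
    filesrc location=meeting-video.yuv ! rawvideoparse width=640 height=360 format=i420 framerate=25/1 \
    ! videorate ! videoscale ! videoconvert ! x264enc ! h264parse ! mux. \
    filesrc location=meeting-audio.pcm ! rawaudioparse format=pcm pcm-format=s16le sample-rate=32000 num-channels=1 \
    ! audioconvert ! audiorate ! voaacenc ! mux.

In the live bot, the two filesrc branches would be replaced by the appsrc elements, with their caps set to match the raw frame format the SDK delivers.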

If you want to capture the screen share as well, or have multiple participants’ video showing at the same time, you’ll face a significant amount of additional complexity, as you’ll need to modify the pipeline dynamically while it runs in order to add or remove the required pipeline elements.

Another alternative is to use Recall.ai for your meeting bots instead. It’s a simple third-party API that lets you use meeting bots to get raw audio/video from meetings without needing to spend months building, scaling, and maintaining these bots.

Let me know if you have any questions!

@chriswuyiming,

I’ve tried using ffmpeg to encode and then save to an MKV file at runtime. This is not using the command-line ffmpeg but the C++ libraries.

@chriswuyiming did you figure out a way to merge the raw audio and video or compensate for the difference in lengths?