Need help on Zoom meeting bot

yaskivartur0830 · November 30, 2024, 8:29am

Hi, Zoom talents.
I am working on a Zoom meeting bot that can join a meeting and transcribe it per user.
I used zoom/meetingsdk-headless-linux-sample on GitHub (a demo on creating a headless meeting bot using the Zoom Meeting SDK for Linux and Docker) as boilerplate.
At first, I took the mixed audio and sent it to the Google STT API. This works for the content.
But I want to add the speaker name to every line of text in the transcription, like:
Yaskiv: Hi, friend, how are you doing?
Mycola: I am good, what about you?
…

Please give me detailed technical advice.
Thanks

chunsiong.zoom · December 2, 2024, 5:35am

@yaskivartur0830 you should use the one-way audio, where each individual’s audio is returned to you separately as raw audio
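
As a rough illustration of why separate streams help: if each raw audio chunk arrives tagged with the participant it belongs to, the bot can keep one buffer per participant and send each buffer to speech-to-text on its own. The sketch below is plain Python and does not call the Zoom SDK; `PerUserAudioBuffers` and its `on_audio` hook are hypothetical names, and it assumes whatever callback delivers the one-way audio gives you a user ID plus PCM bytes.

```python
from collections import defaultdict


class PerUserAudioBuffers:
    """Collects raw PCM chunks separately for each participant."""

    def __init__(self):
        # user_id -> list of PCM byte chunks, in arrival order
        self._chunks = defaultdict(list)

    def on_audio(self, user_id: int, pcm: bytes) -> None:
        # Hypothetical hook: call this from wherever the SDK hands you
        # one participant's raw audio.
        self._chunks[user_id].append(pcm)

    def drain(self, user_id: int) -> bytes:
        # Return everything buffered for one participant and clear it,
        # e.g. right before sending that audio to a speech-to-text API.
        return b"".join(self._chunks.pop(user_id, []))

    def user_ids(self):
        # Participants that currently have buffered audio
        return list(self._chunks)
```

Because every buffer belongs to exactly one participant, the text that comes back from speech-to-text for that buffer already has a known speaker.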

amanda-recallai · December 2, 2024, 11:11pm

Hey @yaskivartur0830,

It sounds like you’re generating a transcript already but are interested in diarizing it. We run meeting bots at scale to record/transcribe video conferences and so this is definitely something we are very familiar with - hopefully I can provide some guidance here!

Option 1: Linux SDK speaker changes

The key here is that you need to know:

Which participant ID is speaking at any given time
What the underlying name of a given participant ID is

Since you’re using the Linux SDK already, you’ll likely want to look into the onActiveSpeakerVideoUserChanged() callback, which will tell you when the active speaker changes.

Then, you can use the IMeetingParticipantsController’s GetUserByUserID method to get the underlying user’s info including their display name. Once you have this, you’ll be able to map a given transcript utterance to their corresponding speaker label.
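
To make the mapping step concrete, here is a minimal sketch in plain Python rather than SDK code. It assumes you record a timestamp each time the active speaker changes, that you have already looked up display names by user ID, and that your speech-to-text results carry a start time. `SpeakerTimeline`, `on_speaker_changed`, and `label` are hypothetical names for illustration only.

```python
import bisect


class SpeakerTimeline:
    """Remembers who was the active speaker at which point in time."""

    def __init__(self):
        self._times = []      # timestamps of speaker changes, ascending
        self._user_ids = []   # user ID that became active at each timestamp
        self._names = {}      # user ID -> display name

    def set_name(self, user_id, name):
        self._names[user_id] = name

    def on_speaker_changed(self, timestamp, user_id):
        # Assumes events are recorded in chronological order.
        self._times.append(timestamp)
        self._user_ids.append(user_id)

    def speaker_at(self, timestamp):
        # Find the most recent speaker change at or before `timestamp`.
        i = bisect.bisect_right(self._times, timestamp) - 1
        if i < 0:
            return "Unknown"
        user_id = self._user_ids[i]
        return self._names.get(user_id, f"User {user_id}")


def label(timeline, utterances):
    # utterances: iterable of (start_time_seconds, text)
    return [f"{timeline.speaker_at(start)}: {text}" for start, text in utterances]
```

For example, with speaker changes recorded at 0.0 s (Yaskiv) and 3.5 s (Mycola), an utterance starting at 0.4 s is labeled "Yaskiv" and one starting at 3.9 s is labeled "Mycola". Note that labels from this approach are only as accurate as the active-speaker signal, so overlapping speech may be attributed to a single person.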

Option 2: Recall.ai

Another option is to use Recall.ai. It’s a simple 3rd party API that lets you use meeting bots to get raw audio/video, diarized transcriptions, and metadata from meetings in just a few lines of code.

Let me know if you have questions!

noah.duncan · December 2, 2024, 11:29pm

Hi @yaskivartur0830

I’m working on an open source API for creating Zoom Bots called Attendee. The source code shows how to transcribe meeting audio and add the speaker names. It uses the one-way audio streams instead of mixed audio.

Some relevant parts of the codebase:

Getting info about a speaker using the Meeting SDK, so you can add the speaker name: attendee/bots/zoom_bot/zoom_bot.py (main branch of noah-duncan/attendee on GitHub)
Subscribing to the one-way audio stream from Zoom: attendee/bots/zoom_bot/zoom_bot.py (main branch of noah-duncan/attendee on GitHub)
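
For a sense of the final step once each participant's stream has been transcribed separately, here is a small sketch of merging the per-speaker results into one ordered transcript. This is not taken from the Attendee codebase; `Segment` and `merge_transcripts` are hypothetical names, and it assumes each transcribed piece comes with a start time in seconds.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    start: float   # start time in seconds
    speaker: str   # display name
    text: str      # transcribed text


def merge_transcripts(per_speaker):
    # per_speaker: dict of display name -> list of (start_time, text)
    segments = [
        Segment(start, speaker, text)
        for speaker, items in per_speaker.items()
        for start, text in items
    ]
    segments.sort(key=lambda s: s.start)

    merged = []  # list of (speaker, text) with consecutive turns joined
    for seg in segments:
        if merged and merged[-1][0] == seg.speaker:
            # Same speaker kept talking: extend their current line.
            merged[-1] = (seg.speaker, merged[-1][1] + " " + seg.text)
        else:
            merged.append((seg.speaker, seg.text))
    return [f"{speaker}: {text}" for speaker, text in merged]
```

Sorting by start time interleaves the speakers correctly, and joining consecutive segments from the same speaker keeps the output in the "Name: text" form asked for in the original post.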

Please let me know if you have any questions
