Real-time audio/transcript per user extraction and realtime audio sent to the meeting

Xcelyst · January 24, 2025, 4:08pm

We are exploring whether real-time audio data from ongoing Zoom meetings can be accessed through APIs. Zoom provides cloud recording functionality, but we don’t have access to the audio in real time.

Our specific requirements are as follows:

Real-Time Audio Access: We need to capture the audio data from ongoing meetings in real time for processing purposes.
Real-Time Response Delivery: After processing the audio, we generate a response in audio format which needs to be sent back to the meeting. This response should be spoken within the meeting through a bot.

Please suggest an approach that does not rely on any third-party API.

Thank you!

freelancer.nak · January 24, 2025, 4:30pm

Hi Xcelyst,

Thanks for reaching out!

To achieve real-time audio access and response delivery within Zoom meetings without third-party solutions, you’ll need to consider the following approaches:

1. Capturing Real-Time Audio (Options Available):

To access real-time audio from Zoom meetings, you have multiple options:

Building a Meeting SDK Bot (Recommended):
- Using the Native Linux Meeting SDK is highly recommended, as it allows you to capture audio per participant and also send audio back to the meeting.
- This approach provides the most control and flexibility for real-time processing.
Live Streaming Option:
- You can live stream the meeting audio to a custom RTMP server for processing.
- However, this method provides a mixed audio stream rather than per-participant audio.
Real-Time Media Streams (RTMS):
- While Zoom has an RTMS capability, its real-time audio features are not live yet, so this option is unavailable for now.
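
If you go the per-participant route, the bot side usually ends up keeping one buffer per speaker and handing fixed-size chunks to your processing step. Here is a minimal sketch of that idea in Python. It is not Zoom SDK code: `on_audio_frame` is a hypothetical hook you would call from whatever per-participant audio callback your bot exposes, and the 32 kHz, 16-bit mono PCM format is an assumption to check against what your bot actually receives.

```python
from collections import defaultdict

SAMPLE_RATE = 32000      # assumed sample rate in Hz; verify against your bot
BYTES_PER_SAMPLE = 2     # assumed 16-bit mono PCM
CHUNK_MS = 500           # amount of audio handed to the processor at a time
CHUNK_BYTES = SAMPLE_RATE * BYTES_PER_SAMPLE * CHUNK_MS // 1000


class ParticipantAudioBuffers:
    """Collects raw PCM per participant and returns fixed-size chunks."""

    def __init__(self, chunk_bytes=CHUNK_BYTES):
        self.chunk_bytes = chunk_bytes
        self._buffers = defaultdict(bytearray)

    def on_audio_frame(self, participant_id, pcm):
        """Hypothetical hook: feed it each PCM frame tagged with its speaker.

        Returns the list of complete chunks now available for that speaker.
        """
        buf = self._buffers[participant_id]
        buf.extend(pcm)
        chunks = []
        while len(buf) >= self.chunk_bytes:
            chunks.append(bytes(buf[:self.chunk_bytes]))
            del buf[:self.chunk_bytes]
        return chunks

    def flush(self, participant_id):
        """Return and discard leftover audio, e.g. when a participant leaves."""
        return bytes(self._buffers.pop(participant_id, b""))
```

Each returned chunk can then go to your own transcription or processing code, keyed by participant. With the mixed RTMP stream you would only need a single buffer, since speakers are not separated.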

2. Sending Back Processed Audio (Options Available):

Once you’ve processed the audio, sending it back to the meeting requires one of the following approaches:

Native Meeting SDK (Preferred Option):
- This allows seamless integration and transmission of processed audio per participant back into the meeting.
- It ensures precise audio control for better user experience.
Zoom In-Client (Embedded App):
- You can build an embedded app within the Zoom client to inject processed audio.
- However, this approach requires you to handle your own audio transmission (e.g., translated audio based on end-user preferences).
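
Whichever option you pick for sending audio back, the generated response normally has to be cut into short frames and pushed at roughly real-time pace, otherwise it arrives in one burst. A rough sketch of that step, assuming 16-bit mono PCM; `send_frame` is a hypothetical stand-in for whatever your bot uses to push audio into the meeting, not a Zoom API.

```python
import time


def split_into_frames(pcm, sample_rate=32000, frame_ms=20, bytes_per_sample=2):
    """Cut PCM bytes into equal frames, padding the last one with silence."""
    frame_bytes = sample_rate * bytes_per_sample * frame_ms // 1000
    frames = []
    for start in range(0, len(pcm), frame_bytes):
        frame = pcm[start:start + frame_bytes]
        if len(frame) < frame_bytes:
            frame = frame + b"\x00" * (frame_bytes - len(frame))
        frames.append(frame)
    return frames


def play_into_meeting(pcm, send_frame, sample_rate=32000, frame_ms=20,
                      sleep=time.sleep):
    """Send frames one by one, pausing one frame duration between sends.

    send_frame is a hypothetical callable supplied by your bot.
    Plain sleeping drifts over long clips; a production bot would pace
    against a monotonic clock instead.
    """
    frames = split_into_frames(pcm, sample_rate, frame_ms)
    for frame in frames:
        send_frame(frame)
        sleep(frame_ms / 1000)
    return len(frames)
```

The `sleep` parameter is injectable only so the pacing can be stubbed out in tests.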

Let me know if you need further details or guidance in setting up any of these solutions.

Best,
Naeem Ahmed
