Receiving audio stream from meeting

Zoom team,

   I’m building an app for one of my clients who wants to use the Zoom platform. The app needs to connect to a Zoom meeting and receive the audio stream. The app then interfaces with a real-time transcription platform to get the meeting transcribed.

   We have parsed out the phone number and meeting ID from the invite and had the app join the meeting via PSTN (through the Twilio API). It works great, but the PSTN audio is lossy, so the transcription is not even close to accurate.
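For readers who want to reproduce the PSTN approach described above, here is a minimal sketch of the parsing step. The invite text, the `parse_invite` and `dtmf_for_meeting` helpers, and the phone number are all hypothetical; real invites may be laid out differently. The `w` pause character is assumed to match how Twilio's `SendDigits` call parameter treats pauses, so check that against the Twilio documentation before relying on it.

```python
import re

# Hypothetical invite text; a real invite may use a different layout.
SAMPLE_INVITE = """\
Join Zoom Meeting
Meeting ID: 123 456 789
Dial by your location
        +1 555 010 0199 US
"""


def parse_invite(text: str) -> tuple[str, str]:
    """Return (dial_in_number, meeting_id) found in the invite text."""
    meeting = re.search(r"Meeting ID:\s*([\d ]+)", text)
    phone = re.search(r"(\+\d[\d ()-]{7,}\d)", text)
    if meeting is None or phone is None:
        raise ValueError("invite does not contain a meeting ID and dial-in number")
    # Strip spaces and punctuation so only digits remain.
    meeting_id = re.sub(r"\D", "", meeting.group(1))
    number = "+" + re.sub(r"\D", "", phone.group(1))
    return number, meeting_id


def dtmf_for_meeting(meeting_id: str, pause_steps: int = 4) -> str:
    """Build the digit string to play after the call connects.

    Assumption: each 'w' is a short pause, giving the IVR time to answer
    before the meeting ID and the terminating '#' are sent.
    """
    return "w" * pause_steps + meeting_id + "#"


if __name__ == "__main__":
    number, meeting_id = parse_invite(SAMPLE_INVITE)
    print(number, dtmf_for_meeting(meeting_id))
```

The resulting number and digit string would then be handed to whatever telephony API places the call; that part is left out here because it depends on account credentials.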

  The other option we are considering is to put a SIP stack in the app and have the app join the meeting via SIP and receive wideband audio.

   Is this possible with your platform?

   Thanks for your response.



    We do support SIP trunking and connecting to a Zoom meeting via the room connector, but not line-side SIP, which is what you might be looking for. Let me explain in detail, since we hear this requirement quite a bit and a lot of people get this mixed up.  I assume that your client is developing an app so that it can be distributed and monetized to anyone using Zoom who needs your service.

  • You will not get lossless audio regardless of which option you choose. We do not support linear audio straight out, so you are going to have to deal with some lossiness.

option 1:  SIP trunking: Zoom provides Elastic SIP trunks, which are an add-on to your account. That means there is a SIP trunk that connects your Zoom account to your IP PBX or voice gateway. You provide the dial-in numbers, and your meeting participants connect to your IP PBX first and then to the Zoom telephony gateways, where the audio gets mixed.  This option typically works for large enterprises and is not something that is going to work for you as a third-party app. Our gateway can send G.711 or G.729 streams, both narrowband codecs, which is not what you are looking for when interfacing with speech manipulation or transcription.

option 2:  Connecting through our room connector:  A prerequisite for this option is that your end customers have the room connector plan, which is an add-on. I will explain how it works, but be aware that this will consume one room connector port (or license) for each connection and also restrict usage to your customers who have this plan.  You will get the room connector IP address and meeting ID, construct a SIP URI (or get the URI string from the meeting invite if you are parsing it), and send the INVITE to our platform. You can negotiate G.722, which is a wideband codec, set the video m-line in the SDP to turn off video, and set the audio m-line in the SDP to receive-only. That is, you are asking our platform not to send video, and your side only receives audio. This will work, but your solution will be expensive and restrictive.
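To make the option 2 steps more concrete, here is a rough sketch of the SIP URI and the SDP offer body such an INVITE might carry. It is only an illustration under several assumptions: the helper names, the `sip:<meeting id>@<room connector address>` URI shape, and the IP addresses (documentation-range placeholders) are hypothetical, and "turn off video" is read here as a video m-line with port 0, which may not be exactly what the platform expects. The G.722 payload type 9 and its `G722/8000` rtpmap follow the standard RTP audio/video profile.

```python
def build_sip_uri(meeting_id: str, connector_host: str) -> str:
    """Build a URI of the assumed form sip:<meeting id>@<room connector address>."""
    digits = "".join(ch for ch in meeting_id if ch.isdigit())
    return f"sip:{digits}@{connector_host}"


def build_audio_only_offer(local_ip: str, audio_port: int, session_id: int = 1) -> str:
    """Return an SDP offer that asks for receive-only G.722 audio and no video."""
    lines = [
        "v=0",
        f"o=- {session_id} {session_id} IN IP4 {local_ip}",
        "s=audio-only transcription client",
        f"c=IN IP4 {local_ip}",
        "t=0 0",
        # Audio: prefer G.722 (static payload type 9), fall back to PCMU (0).
        f"m=audio {audio_port} RTP/AVP 9 0",
        # G.722 is advertised with an 8000 Hz RTP clock even though it
        # samples at 16 kHz; this is a long-standing quirk of the profile.
        "a=rtpmap:9 G722/8000",
        "a=rtpmap:0 PCMU/8000",
        # This endpoint only wants to receive audio, never send it.
        "a=recvonly",
        # Video: port 0 marks the stream as not to be used (assumed reading
        # of "turn off video"; omitting the video m-line is an alternative).
        "m=video 0 RTP/AVP 96",
        "a=rtpmap:96 H264/90000",
    ]
    return "\r\n".join(lines) + "\r\n"


if __name__ == "__main__":
    # 192.0.2.10 and 198.51.100.7 are placeholder documentation addresses.
    print(build_sip_uri("123 456 789", "192.0.2.10"))
    print(build_audio_only_offer("198.51.100.7", 40000))
```

A real client would pass this body to a SIP stack rather than hand-building the INVITE, and would still need to handle the answer, RTP reception, and G.722 decoding before feeding audio to a transcription service.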

Hope this helps.  If you are going to pursue option 2 (unlikely :)) and need help, let us know.



Thanks much. We will probably stay with connecting via PSTN. 

Can you go into more detail about connecting via SIP URI using an audio-only device? I have a CRC and I’m trying to dial in to meetings from audio-only Polycom conference phones without using PSTN (to leverage G.722).

Can you tell me if it’s possible to connect to a live meeting from a Node.js script? I have the same problem, as I need to receive the live audio stream of the meeting to do real-time transcription for my app.

If it is, which technology should be used for that?

Thank you very much for your response. 
