Streaming audio from Zoom Phone to a third-party app in real time

I’m looking for a way for a third-party AI application to receive the live audio stream from Zoom Phone calls in real time. This is often called media or audio forking, and it is the same type of integration used by third-party call recording software and by AI-based live agent assist / co-pilot platforms. With other voice platforms, we do this via an API, a WebSocket, SIPREC, or by embedding a conference call into the voice platform’s call flow, which bridges the third-party software into the call.

I’ve come across posts that mention the Video SDK, Media SDK, and Recall AI, but it’s unclear to me whether any of these work with Zoom Phone specifically (as opposed to Zoom Meetings) and whether they would meet my requirements.

Please advise on the available options.