How to access meeting transcript from recording.completed webhook event

Hello,

In the recording.completed webhook payload, there is an attribute called file_type, whose possible values are documented to include “TRANSCRIPT”, “CC”, and “TIMELINE”.

However, whenever I receive a recording.completed event from a meeting on my Zoom account, the only available file_types are “MP4” and “M4A”.

I am trying to retrieve a Zoom meeting transcript for each recording that contains both speaker names and timestamps using the recording.completed webhook.

How can I achieve this? If it is not possible to achieve this with the recording.completed webhook, what other method should I follow? Also, what settings need to be turned on within a user’s Zoom account in order for us to be able to access their meeting transcripts through the Zoom API?
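
For context, this is roughly how I'm reading the payload on my end. It's just a minimal sketch; the framework and endpoint path are simply what I happen to use, and the field names are my reading of the recording.completed event docs:

```python
# Rough sketch of how I inspect the recording.completed payload.
from flask import Flask, request

app = Flask(__name__)

@app.route("/zoom/webhook", methods=["POST"])
def zoom_webhook():
    event = request.get_json()
    if event.get("event") == "recording.completed":
        files = event["payload"]["object"].get("recording_files", [])
        # For my meetings this only ever prints ['MP4', 'M4A'],
        # never 'TRANSCRIPT' or 'CC'.
        print([f.get("file_type") for f in files])
    return "", 200
```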

Thanks,
Farhana

Hi @fasarker,

You may not have transcription enabled. Please check out this support article:

Additionally, this blog walks through recording downloads with webhooks, but links to the cloud recording API endpoints as an alternative at the beginning:

@gianni.zoom Thank you for your quick response. I enabled transcription by following the instructions in the resource you shared. Now, I am able to see an additional “TIMELINE” file_type in the recording webhook response, but I still do not see a “TRANSCRIPT” or “CC” file_type.

Are there any other steps I need to take to retrieve the full recording transcript through the recording.completed webhook?

Also, could you point me to some documentation that describes what the “TIMELINE” file contains?

Thanks again for the help.

@fasarker can you check whether the transcript shows up in the cloud recording on the web portal or via the API endpoint? I want to confirm whether it’s actually transcribing but just not appearing in the webhook. If it isn’t showing there either, please double-check that transcription for cloud recordings was properly enabled.
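
If it helps, a quick way to check via the API looks roughly like this (sketch only; it assumes an OAuth or Server-to-Server OAuth token with a recording read scope, and the meeting ID is a placeholder):

```python
# List the recording files for a meeting to see whether a TRANSCRIPT file
# exists even though the webhook didn't include one. Error handling omitted.
import requests

ACCESS_TOKEN = "YOUR_OAUTH_ACCESS_TOKEN"  # placeholder
MEETING_ID = "1234567890"                 # placeholder meeting ID or UUID

resp = requests.get(
    f"https://api.zoom.us/v2/meetings/{MEETING_ID}/recordings",
    headers={"Authorization": f"Bearer {ACCESS_TOKEN}"},
)
resp.raise_for_status()

for f in resp.json().get("recording_files", []):
    print(f["file_type"], f.get("download_url"))
```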

I just tested the recording.completed webhook and got a similar response as you. I received MP4, M4A and TIMELINE files.

Alternatively, I also subscribed to the recording.transcript_completed event and received the TRANSCRIPT file there:

I will submit a request to our docs team for clarification on which file types are to be expected from recording.completed. In the interim, please also subscribe to the recording.transcript_completed event.
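
A rough sketch of what that handler can do with the event body, based on my understanding of the download_token field in the webhook docs (treat the details as a starting point rather than a definitive implementation):

```python
# Pull the TRANSCRIPT file out of a parsed recording.transcript_completed
# event body. The download_token in the webhook body is short-lived and, as
# I understand it, can be sent as a Bearer token when fetching download_url.
import requests

def handle_transcript_completed(event: dict) -> None:
    token = event.get("download_token")
    for f in event["payload"]["object"].get("recording_files", []):
        if f.get("file_type") == "TRANSCRIPT":
            resp = requests.get(
                f["download_url"],
                headers={"Authorization": f"Bearer {token}"},
            )
            resp.raise_for_status()
            # The transcript arrives as a WebVTT (.vtt) file.
            with open("transcript.vtt", "wb") as out:
                out.write(resp.content)
```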

The cloud recording API endpoints have robust descriptions of the fields: Meeting API

Another thing I noticed, @fasarker,

In the recording.completed documentation, audio_transcript is not listed among the allowed values:

This means what we saw is likely expected behavior. closed_captions, however, is listed. Can you confirm you had closed captions enabled in the recording you tested?

@gianni.zoom Thanks again for the very helpful response.

Yes, it looks like the transcript is available when I query the get recordings API. I suppose then that the transcript will show up in the get recordings API response only when it is ready?

When I enable closed captions during the meeting, the CC file does indeed show up in the recording.completed webhook response. However, the CC file does not contain any speaker names. Other than the speaker names being omitted, is there any other difference between the contents of the CC file and the contents of the audio transcript file?
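
For reference, this is roughly how I plan to pull speaker names and timestamps out of the transcript once I have it. It assumes each cue’s text is prefixed with “Speaker Name:”, which is exactly the prefix the CC file seems to be missing:

```python
# Sketch of parsing (start_time, speaker, text) out of a Zoom transcript VTT.
# Assumes each cue is a "HH:MM:SS.mmm --> HH:MM:SS.mmm" line followed by one
# line of text, optionally prefixed with "Speaker Name: ".
import re

CUE_TIME = re.compile(r"(\d{2}:\d{2}:\d{2}\.\d{3}) --> \d{2}:\d{2}:\d{2}\.\d{3}")

def parse_vtt(vtt_text: str):
    cues = []
    lines = vtt_text.splitlines()
    for i, line in enumerate(lines):
        m = CUE_TIME.match(line.strip())
        if not m or i + 1 >= len(lines):
            continue
        text = lines[i + 1].strip()
        speaker, sep, spoken = text.partition(": ")
        if sep:
            cues.append((m.group(1), speaker, spoken))
        else:
            # No speaker prefix (what I'm seeing in the CC file).
            cues.append((m.group(1), None, text))
    return cues
```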

I checked the cloud recording APIs, but I still don’t see a good description of what is contained in the TIMELINE file. What does each timestamp in that file represent (e.g. when a word is spoken, when a certain length of phrase is spoken, etc.)? How should we match up the timestamps in the TIMELINE file to the CC file?

I will try out the recording.transcript_completed event. Can we be certain that the recording.transcript_completed event will always fire after the recording.completed event has already fired? How long after the recording is available can we be sure that the transcript will be available?

Thanks again for all the help.

-Farhana

Hey @fasarker!

Yes, you can always use this endpoint for the transcript files.

See below for comparison:

In my testing, the recording.transcript_completed event was available 7 milliseconds before the recording.completed event. Webhook latency/failure is subject to a few different factors, but we have some support guidance on the topic here: Zoom Developer Blog
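
Since you shouldn’t rely on one event strictly preceding the other, a common pattern is to treat each event independently and keep the handler idempotent. Very roughly (the dict stands in for whatever storage you actually use):

```python
# Sketch of order-independent handling: both events update the same record,
# keyed by the meeting UUID from the payload, so it doesn't matter which
# arrives first or whether one is retried. "db" stands in for real storage.
db: dict[str, dict] = {}

def handle_zoom_event(event: dict) -> None:
    obj = event["payload"]["object"]
    record = db.setdefault(obj["uuid"], {"files": {}})

    # Keyed by file type, so re-deliveries simply overwrite the same entry.
    for f in obj.get("recording_files", []):
        record["files"][f["file_type"]] = f.get("download_url")

    if event["event"] == "recording.completed":
        record["recording_ready"] = True
    elif event["event"] == "recording.transcript_completed":
        record["transcript_ready"] = True
```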

Thanks again for the help @gianni.zoom, that answers most of my questions.

I’m still confused about what each timestamp in the TIMELINE file represents (e.g. when a word is spoken, when a certain length of phrase is spoken, etc.), but I’m happy to open a separate post for that.

Appreciate your time!