Zoom Recordings and Transcripts. How to handle transcripts and what is each file?


#1

We use the zoom_recordings API to pull our Zoom recordings. Since we enabled transcripts on our account, the API has been returning new transcript files: four JSON files for each meeting. The API does not return a file type for them, and it’s not clear why the transcript is split across four different files.

Can anybody who pulls transcripts through the API give feedback on this, and on how to use the API to identify which of those files is the raw transcript?


#2

It’s not 4 JSON files. It’s actually 3 JSON files and a .vtt file. The .vtt file is the transcript. You cannot tell directly from the API which of the 4 files is the .vtt, but you can download them all and then pick the .vtt file.
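If the downloaded files arrive without a usable extension, one way to pick out the .vtt is to look at the content: a WebVTT file begins with the text WEBVTT, optionally preceded by a byte order mark. This is only a rough sketch of that idea. The function names and the `downloads` mapping (file name to raw bytes) are made up for illustration and are not part of the Zoom API.

```python
from typing import Dict, Optional

UTF8_BOM = b"\xef\xbb\xbf"


def is_webvtt(data: bytes) -> bool:
    """Return True if the bytes look like a WebVTT file (start with 'WEBVTT')."""
    # A WebVTT file may start with an optional UTF-8 byte order mark.
    if data.startswith(UTF8_BOM):
        data = data[len(UTF8_BOM):]
    return data.startswith(b"WEBVTT")


def pick_transcript(downloads: Dict[str, bytes]) -> Optional[str]:
    """Return the name of the first download that looks like WebVTT, or None.

    `downloads` is a hypothetical mapping of file name -> raw file content,
    filled in by whatever code already downloads the four files.
    """
    for name, data in downloads.items():
        if is_webvtt(data):
            return name
    return None
```

Checking the content rather than the name means the check still works if the files are saved under arbitrary names.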

The .json files hold different data about the transcript on a timeline. There is a .transcript.local.json, -transcript.json and a .json file. I’m not sure what they all do but I think the .json lists the current speaker at different timestamps.


#3

@Paragon, we will consider improving this API to return the corresponding file type, so you can identify transcript files.


#4

I’m also experiencing this issue. There seems to be no reason not to include the file type, which would make it simpler to distinguish the .vtt file from the JSON files.


#5

Right now the transcript is the only file with a blank file_type, but in our code we’ve written the check as file_type=="" | file_type=="VTT" to make the code resilient to the assumed new value for this field. HTH
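For anyone doing the same thing in Python, the check above could look roughly like this. It assumes each file entry from the API response is a dict with a `file_type` key. The function name and the sample `id` values are invented for illustration, and "VTT" is only the guessed future value mentioned above, not something the API is confirmed to return. Treating a missing or null `file_type` as blank is an extra assumption on top of the original check.

```python
from typing import Any, Dict, List


def find_transcript_files(files: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep entries whose file_type is blank or "VTT".

    A missing or None file_type is treated the same as a blank one
    (an assumption, not documented API behaviour).
    """
    matches = []
    for entry in files:
        file_type = entry.get("file_type") or ""
        if file_type == "" or file_type == "VTT":
            matches.append(entry)
    return matches


# Hypothetical sample data shaped like a list of recording file entries.
sample_files = [
    {"id": "video", "file_type": "MP4"},
    {"id": "transcript", "file_type": ""},
]
transcripts = find_transcript_files(sample_files)
```

Note that this relies on the transcript being the only entry with a blank file_type, as observed above. If other files also come back blank, the content check on the downloaded file is the safer test.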