Difference between audio transcript and closed captions

Description/Error
The meeting recording APIs show that a meeting can contain closed captions files (type CC) and audio transcript files (type audio_transcript).

Both are reported to be VTT files.

What is the difference between the two?
Who or what creates the audio transcript? Is it human-created or machine-created?
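
For context, here is a minimal sketch of how the two kinds of files could be picked out of a recording response. The field names `recording_files`, `file_type`, and `recording_type` are assumptions about the response shape, the sample payload is made up, and `split_text_tracks` is a hypothetical helper, so check all of them against the real API response before relying on this.

```python
def split_text_tracks(recording):
    """Separate closed-caption files from audio-transcript files.

    Assumes each entry in recording["recording_files"] labels itself via a
    "file_type" or "recording_type" field (assumed names, not confirmed here).
    """
    captions, transcripts = [], []
    for f in recording.get("recording_files", []):
        # Compare case-insensitively, since the thread writes "CC" in
        # upper case and "audio_transcript" in lower case.
        kinds = {
            str(f.get("file_type", "")).lower(),
            str(f.get("recording_type", "")).lower(),
        }
        if "cc" in kinds:
            captions.append(f)
        elif "audio_transcript" in kinds:
            transcripts.append(f)
    return captions, transcripts


# Hypothetical payload for illustration only.
sample = {
    "recording_files": [
        {"id": "a1", "file_type": "MP4"},
        {"id": "b2", "file_type": "CC"},
        {"id": "c3", "recording_type": "audio_transcript"},
    ]
}

captions, transcripts = split_text_tracks(sample)
print([f["id"] for f in captions], [f["id"] for f in transcripts])
```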

Hey @j.herman,

Here is the difference between Audio Transcript and Closed Captions:

Closed Captions:

Audio Transcript:

Let me know if you need further explanation!

Thanks,
Tommy

Hi Tommy,

I read those articles before I posted this question. They do not provide enough information to answer my questions.

The audio transcript article states “The transcript is divided into sections, each with a timestamp that shows how far into the recording that portion of the text was recorded.”

Closed captions are also divided into sections, each with a timestamp that shows how far into the recording that portion of the text was recorded.

Here is an example caption taken from a caption VTT file from a Zoom cloud recording:

00:00:10.000 --> 00:00:16.000
Here is a caption
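
To make the timestamp point concrete, here is a rough sketch that turns the timing line of a cue like the one above into start and end offsets in seconds. It only handles the `HH:MM:SS.mmm --> HH:MM:SS.mmm` form shown in this example, not every timing variant WebVTT allows, and `parse_cue_timing` is a name I made up for illustration.

```python
import re

# Matches the timing line format used in the example cue above.
CUE_TIMING = re.compile(
    r"(\d{2}):(\d{2}):(\d{2})\.(\d{3}) --> (\d{2}):(\d{2}):(\d{2})\.(\d{3})"
)


def parse_cue_timing(line):
    """Return (start, end) in seconds for one cue timing line."""
    match = CUE_TIMING.fullmatch(line.strip())
    if match is None:
        raise ValueError(f"not a cue timing line: {line!r}")
    h1, m1, s1, ms1, h2, m2, s2, ms2 = (int(g) for g in match.groups())
    start = h1 * 3600 + m1 * 60 + s1 + ms1 / 1000
    end = h2 * 3600 + m2 * 60 + s2 + ms2 / 1000
    return start, end


print(parse_cue_timing("00:00:10.000 --> 00:00:16.000"))  # (10.0, 16.0)
```

Both file types would parse this way if they really are VTT, which is why the timestamps alone do not tell them apart.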

Closed captions are created by meeting participants through the Zoom clients or by a third-party captioning service. The audio transcript article never says how the transcript is created, only that users get an email saying their transcript is ready after the cloud recording is ready.

I am not an admin of a Business, Education, or Enterprise account, so I can’t test to determine the difference between an audio transcript file and a closed captions file. As far as I know, the two files contain the same data.

Does this help clarify what information I’m looking for?

Jonathan

Hey @j.herman,

Audio Transcript

The audio transcript is a VTT file created automatically if you have these account/user settings turned on:

Here is an example of an audio transcript file.

Closed Caption

The closed caption file is a .txt file that someone in the meeting types, or that a third-party app generates, with a timestamp followed by the words spoken on each line:

18:57:45 Hello
18:57:50 This is a closed caption
18:58:03 I am doing this manually, but there are tools to do this automatically
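
As a rough illustration, lines in that shape could be read like this. It assumes every line is `HH:MM:SS` followed by a space and the caption text, as in the sample above. The timestamps in the sample look like time-of-day values rather than offsets into the recording, but that is a guess from the sample, and `parse_caption_line` is just a hypothetical helper name.

```python
from datetime import time


def parse_caption_line(line):
    """Split one 'HH:MM:SS text' line into (timestamp, text)."""
    stamp, _, text = line.strip().partition(" ")
    hours, minutes, seconds = (int(part) for part in stamp.split(":"))
    return time(hours, minutes, seconds), text


sample = """18:57:45 Hello
18:57:50 This is a closed caption"""

for raw in sample.splitlines():
    stamp, text = parse_caption_line(raw)
    print(stamp.isoformat(), "|", text)
```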

Closed captions can also be configured in the account/user settings.

Let me know if this answers your question 🙂

Thanks,
Tommy