Unicode block character (�) appears in recording.completed webhook

tech-zoom_zp · April 22, 2023, 12:30am


API Endpoint(s) and/or Zoom API Event(s)
recording.completed webhook

Description
Since April 17th JST, a Unicode block character (�) has started to appear in the request body of the recording.completed webhook, inside payload.object.participant_audio_files[0].file_name.

For example, if you set the user’s display name to プロフィット and start a meeting with cloud recording, the file_name used to be Audio only - プロフィッ㲁 before April 14th, but since April 17th it has been Audio only - プロフィッ�.
The Unicode block character appears in many other cases as well. It causes our app to skip processing the affected participant_audio_file, so our users cannot use our app correctly.

Was this change intentional, or is it a bug?
If it’s a bug, please fix it as soon as possible.

Sample

{
    "payload": {
        "object": {
            "participant_audio_files": [
                {
                    "id": "c7f88d6f-4ab5-4338-8db9-1056e3396b3d",
                    "recording_start": "2023-04-22T00:23:12Z",
                    "recording_end": "2023-04-22T00:23:20Z",
                    "file_name": "Audio only - プロフィッ�",
                    "file_type": "M4A",
                    "file_extension": "M4A",
                    "file_size": 111758,
                    "play_url": "edited",
                    "download_url": "edited",
                    "status": "completed"
                }
            ]
        }
    },
    "event_ts": 1682123090696,
    "event": "recording.completed"
}

How To Reproduce
Steps to reproduce the behavior:

Set the user’s display name to “プロフィット”.
Turn on the setting “Setting > Recording > Cloud recording > Record audio-only files > Record a separate audio file of each participant”.
Start a meeting and cloud recording.
Wait for the recording.completed webhook.
Check payload.object.participant_audio_files[0].file_name. The last character is a Unicode block character (�).
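
The final check can be automated. Below is a minimal sketch in Python that scans a recording.completed request body for file names containing U+FFFD, assuming the body has the shape shown in the sample above. The function name `find_garbled_file_names` is made up for illustration and is not part of any Zoom SDK.

```python
import json

REPLACEMENT_CHAR = "\ufffd"  # U+FFFD, rendered as the block character


def find_garbled_file_names(body: str) -> list[str]:
    """Return participant audio file names that contain U+FFFD."""
    event = json.loads(body)
    files = (
        event.get("payload", {})
        .get("object", {})
        .get("participant_audio_files", [])
    )
    return [
        f["file_name"]
        for f in files
        if REPLACEMENT_CHAR in f.get("file_name", "")
    ]


# Trimmed-down copy of the sample payload from this post.
sample_body = json.dumps(
    {
        "payload": {
            "object": {
                "participant_audio_files": [
                    {"file_name": "Audio only - プロフィッ\ufffd", "file_type": "M4A"}
                ]
            }
        },
        "event": "recording.completed",
    }
)

print(find_garbled_file_names(sample_body))
```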

tech-zoom_zp · April 22, 2023, 1:59am

On further investigation, it started to appear around noon JST on April 16th.

tech-zoom_zp · May 20, 2023, 3:34am

I have not received any reply yet. Can anyone confirm whether this is a bug or not?

MultiplayerSession · May 23, 2023, 3:48pm

This definitely feels like a character encoding problem: the display name is truncated, and then additional portions of the file name, like the file extension, are appended, producing an invalid sequence of bytes. A validation step then swaps in the Unicode replacement character (U+FFFD) you’re seeing in an attempt to recover from the problem.

I don’t feel that Zoom is very transparent or accurate about character length limitations — for example, they’ll document “Max 64 chars” or “This value cannot exceed more than 12 Chinese characters.” which I doubt accurately reflects the actual limitation, or they wouldn’t be using those measurement units.

I’m guessing UTF-8 encoding is being used at some point.
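
To illustrate the guess, here is a small Python sketch of how cutting UTF-8 bytes at a fixed length can yield U+FFFD. The 16-byte limit is purely an assumption picked to reproduce the reported file name. Zoom's real limit and implementation are not known from this thread, and both function names are hypothetical.

```python
def truncate_bytes_naive(name: str, max_bytes: int) -> str:
    """Cut the UTF-8 bytes at max_bytes, then decode leniently.

    A cut in the middle of a multi-byte character leaves a dangling
    partial sequence, which the decoder replaces with U+FFFD.
    """
    raw = name.encode("utf-8")[:max_bytes]
    return raw.decode("utf-8", errors="replace")


def truncate_bytes_safe(name: str, max_bytes: int) -> str:
    """Cut the UTF-8 bytes at max_bytes, dropping any partial character.

    The input is valid UTF-8, so the only invalid bytes after slicing are
    the incomplete trailing sequence, which errors="ignore" discards.
    """
    raw = name.encode("utf-8")[:max_bytes]
    return raw.decode("utf-8", errors="ignore")


display_name = "プロフィット"  # 6 characters, 3 bytes each in UTF-8 (18 bytes)

print(truncate_bytes_naive(display_name, 16))  # 5 characters plus U+FFFD
print(truncate_bytes_safe(display_name, 16))   # 5 characters, no U+FFFD
```

With the assumed limit, the naive version returns プロフィッ followed by U+FFFD, which matches the file name in the sample payload. The safe version simply ends at the last complete character.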

Aside from getting official changes, perhaps you can ask people to use shorter display names, or express it as romaji (or at least the last few characters that are likely to be truncated) so that truncation will always result in a valid UTF-8 sequence?

tech-zoom_zp · May 24, 2023, 1:43am

Thanks for sharing your thoughts!

“Aside from getting official changes, perhaps you can ask people to use shorter display names, or express it as romaji (or at least the last few characters that are likely to be truncated) so that truncation will always result in a valid UTF-8 sequence?”

I wish we could do as you suggest, but in Japan it’s not very common to use romaji, and we cannot force our customers to use only alphabetic characters. (It is even more difficult because our customers’ customers often have names written in Chinese characters.)

For the time being, we have changed our app so that it no longer stops processing when it encounters Unicode block characters. But with one character less of information, the file name has only 5 valid characters, which is almost useless for identifying whose audio file it is.
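
For anyone needing a similar workaround, a rough Python sketch follows. It assumes the file name always starts with the prefix "Audio only - ", as in the sample payload, and the function name `participant_name_from_file_name` is hypothetical. It is not our actual production code.

```python
REPLACEMENT_CHAR = "\ufffd"  # U+FFFD
PREFIX = "Audio only - "     # prefix seen in the sample payload (assumed constant)


def participant_name_from_file_name(file_name: str) -> tuple[str, bool]:
    """Return (name, was_garbled) for a participant audio file name.

    Trailing U+FFFD characters are stripped instead of rejecting the file.
    The flag tells the caller that the name is incomplete.
    """
    if file_name.startswith(PREFIX):
        name = file_name[len(PREFIX):]
    else:
        name = file_name
    cleaned = name.rstrip(REPLACEMENT_CHAR)
    return cleaned, cleaned != name


print(participant_name_from_file_name("Audio only - プロフィッ\ufffd"))
```

The flag lets the app keep processing the file while marking the name as possibly truncated. The missing characters themselves cannot be recovered from the webhook.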

I hope to get a good answer from Zoom. (Hopefully, they can extend the length limit or provide a user_id.)
