Does Zoom natively support audio recording at 16 kHz?

Does Zoom natively support audio recording at 16 kHz? That is, will I get the recording at 16 kHz without any resampling?

Welcome to the community! We’re thrilled to have you here — great first question!

To directly answer your question: No, Zoom does not natively deliver recordings at 16 kHz without resampling.

While Zoom’s internal audio engine and real-time media streams do operate at 16 kHz (especially for speech and SDK-based streams), the saved recording files are always resampled and transcoded — local recordings to AAC at 44.1 kHz stereo, and cloud recordings to 48 kHz in high-fidelity modes. There is no built-in setting to preserve a 16 kHz sampling rate in the output file.

Quick summary:

  • ✅ Internally supported: Zoom’s audio engine operates at 16 kHz in real-time
  • ❌ Recorded output at 16 kHz (without resampling): Not natively — recordings are resampled to 44.1 kHz (local) or 48 kHz (cloud)
  • 🔧 Workarounds:
    • Use a third-party tool (e.g., OBS Studio, Audacity) configured to capture at 16 kHz during your Zoom session
    • Use a custom SDK integration that intercepts the real-time 16 kHz audio stream directly

Hope that clears things up! Feel free to ask if you have more questions — we’re happy to have you in the community!

Thank you for the reply! Could I have a follow-up re: the workaround below? How to implement this? Any advice?

Jumping in here: The cleanest way to implement Pranjal’s workaround today is to use Zoom RTMS rather than Zoom’s completed recording files. Create a Zoom General App, add Realtime Media Streams, subscribe to the RTMS started/stopped events, and request only the meeting:read:meeting_audio scope if audio is all you need.

At runtime, start the RTMS session with the REST API or Zoom Apps JS SDK. When Zoom sends meeting.rtms_started, your backend connects to the WebSocket URL, performs the signaling and media handshakes, then receives audio packets. Zoom documents RTMS meeting audio as raw PCM L16 at 16 kHz mono, so your service can base64-decode the audio payloads and write them into a WAV container with 16,000 Hz, 1 channel, and 16-bit PCM.
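As a rough illustration of that last step only, here is a minimal Python sketch that base64-decodes audio payloads and writes them into a 16 kHz, mono, 16-bit WAV file using the standard library. It assumes you have already pulled the base64 strings out of the RTMS messages; the function name `write_rtms_audio_to_wav` and the `b64_chunks` input are hypothetical, and the WebSocket connection and handshakes are not shown. It also assumes the samples arrive in little-endian byte order, which is what WAV expects; if your payloads turn out to be big-endian, you would need to swap the bytes first.

```python
import base64
import io
import wave

SAMPLE_RATE_HZ = 16_000   # 16 kHz, as described for RTMS meeting audio
CHANNELS = 1              # mono
SAMPLE_WIDTH_BYTES = 2    # 16-bit PCM


def write_rtms_audio_to_wav(dest, b64_chunks):
    """Decode base64 PCM chunks and append them to a WAV file.

    `dest` can be a file path or a seekable binary file object.
    `b64_chunks` is any iterable of base64 strings (hypothetical input:
    the audio payloads you extracted from the RTMS messages).
    Returns the number of audio frames written.
    """
    total_bytes = 0
    with wave.open(dest, "wb") as wav:
        wav.setnchannels(CHANNELS)
        wav.setsampwidth(SAMPLE_WIDTH_BYTES)
        wav.setframerate(SAMPLE_RATE_HZ)
        for chunk in b64_chunks:
            pcm = base64.b64decode(chunk)
            # No resampling happens here: the bytes are written as-is.
            wav.writeframes(pcm)
            total_bytes += len(pcm)
    return total_bytes // (CHANNELS * SAMPLE_WIDTH_BYTES)


if __name__ == "__main__":
    # Stand-in data: two 10 ms chunks of silence (160 samples each).
    fake_chunks = [base64.b64encode(b"\x00\x00" * 160).decode("ascii")] * 2
    buffer = io.BytesIO()
    print(write_rtms_audio_to_wav(buffer, fake_chunks), "frames written")
```

Because the PCM bytes go straight into the container, the file keeps whatever sample rate the stream actually had; the header values only describe it, so it is worth confirming the negotiated audio format before trusting the output.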

Recall.ai is a Zoom RTMS Preferred Partner, so you can use Recall.ai for this capture flow instead of building the RTMS WebSocket, auth, and recording pipeline yourself.

@ll11 what is the end-to-end scenario you are trying to build? There are many ways to downsample: the post-meeting ones are the easiest, and the real-time ones are slightly more technical.

The end-to-end scenario is that I need to remotely collect hundreds or thousands of hours of 16 kHz speech recordings for a multimodal. The audio needs to be saved natively at 16 kHz, and the conversations can be bot-to-human or human-to-human, single-turn or multi-turn.

@ll11 are you using cloud recording or local recording for the audio? Since this is not a time-sensitive scenario, you will want to batch-downsample the audio from the recordings to 16 kHz. This is not related to Zoom Developer technologies, but I would use ffmpeg for the conversion.
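A minimal sketch of what that batch conversion could look like, assuming ffmpeg is installed and on your PATH and that the recordings are `.m4a` files in one folder (the folder names, file pattern, and function names here are hypothetical). Note that this is resampling after the fact, so the output is 16 kHz but not captured natively at 16 kHz.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_cmd(src, dst, rate_hz=16_000):
    """Build an ffmpeg command that converts `src` to mono 16-bit PCM WAV."""
    return [
        "ffmpeg",
        "-y",                 # overwrite the output file if it exists
        "-i", str(src),       # input recording
        "-vn",                # ignore any video stream
        "-ac", "1",           # downmix to mono
        "-ar", str(rate_hz),  # resample to the target rate
        "-c:a", "pcm_s16le",  # 16-bit little-endian PCM
        str(dst),
    ]


def convert_folder(in_dir, out_dir, pattern="*.m4a"):
    """Convert every matching file in `in_dir`; returns the output paths."""
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    converted = []
    for src in sorted(Path(in_dir).glob(pattern)):
        dst = out_path / (src.stem + ".wav")
        # check=True raises CalledProcessError if ffmpeg fails on a file.
        subprocess.run(build_ffmpeg_cmd(src, dst), check=True)
        converted.append(dst)
    return converted


if __name__ == "__main__":
    # Hypothetical folder names; adjust to wherever your recordings live.
    for wav_path in convert_folder("zoom_recordings", "wav_16k"):
        print("wrote", wav_path)
```

Keeping the command construction in its own function makes it easy to inspect or tweak the flags before running the whole batch.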