Microphone initialization takes too long — retry logic and timeout handling advice

Description

I’m encountering a delay when starting the microphone (and sometimes the camera) using the Zoom Web Video SDK. Initialization either takes a long time or times out.
I suspect the delay might be network-related, but also possibly caused by my previous retry logic, which waits until the SDK throws an error (potentially after its own internal timeout) before retrying — resulting in a long delay overall.

Related topic: Camera/Microphone fails to start or takes too long to initialize


Browser Console Error

WebSocket connection to ‘wss://zoomins…zoom.us/…’ failed:
Error in connection establishment: net::ERR_CONNECTION_TIMED_OUT


Which Web Video SDK version?

@zoom/videosdk version: 2.1.0


Video SDK Code Snippets

Old retry logic

try {
  await retry(
    action: (retryCount) async {
      return startAudioProcess();
    },
    maxRetries: 3,
    delay: const Duration(seconds: 2),
  );
} catch (error) {
  logError('Error zoomStartAudio', error: error);
}

Where startAudioProcess() calls:

await useCase.startAudio(
  mute: !isMicEnabled,
  speakerOnly: appStore.isMissingMicPermission,
);

Updated logic with timeout

try {
  await retry(
    action: (retryCount) async {
      logInfo('[DEBUG] zoomStartAudio startAudioProcess retryCount=$retryCount');
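      // Note: Future.timeout only stops waiting here; it does not cancel
      // the underlying startAudioProcess() call.
      return startAudioProcess().timeout(
        const Duration(seconds: 5),
        onTimeout: () => throw Exception('Start audio timeout on attempt ${retryCount + 1}'),
      );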
    },
    maxRetries: 3,
    delay: const Duration(seconds: 2),
  );
} catch (error) {
  logError('Error zoomStartAudio', error: error);
}

To Reproduce (If applicable)

  1. Join a Zoom Web SDK session.
  2. Attempt to start the microphone via startAudio().
  3. In some cases, initialization takes a long time or fails silently.
  4. In the earlier version, which has no timeout, the retry logic triggers only after a long wait.

Screenshots

Not applicable.



Device

  • Device: Dell XPS
  • OS: Windows 11
  • Browser: Chrome / Edge
  • Browser Version: Chrome 125.0+

Additional context

I’m seeking clarification on the following points:

  • How long is the internal timeout for startAudio() in the Zoom Web Video SDK?
  • Is it safe to wrap startAudio() with a manual timeout like this?
  • Does the Zoom SDK have internal timeouts for startAudio() or startVideo() that might conflict or stack with ours?
  • What’s the best practice to gracefully retry audio/video initialization (especially under poor network conditions)?

Any suggestions or insight into how the Zoom Web SDK handles these operations internally would be helpful.

Hi @Thong1

Thanks for your feedback.

We do not recommend retrying based on timeouts. If a retry is necessary, please only proceed after startVideo or startAudio explicitly returns a rejected promise.

In the Video SDK, startAudio has a built-in timeout of 45 seconds.
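
A minimal sketch of the rejection-only retry described above, written in Dart to match the snippets in the original post. The names `StartAudio` and `startAudioWithRetry` are hypothetical, and `start` stands in for the app's own wrapper around the SDK call (for example `startAudioProcess()`); none of this is part of the Zoom SDK. No timeout is layered on top, so a retry happens only after the previous attempt has actually failed.

```dart
/// Hypothetical stand-in for the app's own call into the SDK's startAudio().
typedef StartAudio = Future<void> Function();

/// Retries [start] only after the returned future has completed with an
/// error. A slow attempt is never abandoned and re-issued while it is still
/// in flight, because nothing here times out.
///
/// Returns the number of attempts that were needed. After the initial
/// attempt plus [maxRetries] retries have all failed, the last error is
/// rethrown to the caller.
Future<int> startAudioWithRetry(
  StartAudio start, {
  int maxRetries = 3,
  Duration delay = const Duration(seconds: 2),
}) async {
  var attempt = 0;
  while (true) {
    attempt++;
    try {
      await start();
      return attempt;
    } catch (_) {
      // The attempt was explicitly rejected; give up or wait and retry.
      if (attempt > maxRetries) rethrow;
      await Future<void>.delayed(delay);
    }
  }
}
```

Called as `await startAudioWithRetry(startAudioProcess)`, this may wait up to the SDK's built-in timeout per attempt, which is the trade-off of retrying only on rejection.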

Thanks
Vic

Hi @vic.yang, thank you for the clarification!

We do not recommend retrying based on timeouts. If a retry is necessary, please only proceed after startVideo or startAudio explicitly returns a rejected promise.

Understood. However, we’ve encountered some challenges in real-world usage and would appreciate your thoughts on this approach:

In our testing, startAudio() or startVideo() sometimes takes a long time to resolve — especially under poor network conditions. Since the internal timeout is 45 seconds, from a user experience perspective, this feels too long. It’s not ideal to make users wait that long without feedback.

That’s why we tried wrapping the SDK call in a shorter timeout (e.g. 5 seconds), since we observed that under normal conditions, both startAudio and startVideo typically complete within 2–3 seconds.

Our question:

  • Do you think it’s reasonable to apply a manual timeout of 5s to startAudio()/startVideo() to avoid long UI freezes?
  • In case of overlapping calls (e.g., when a retry is triggered before the SDK call internally resolves), we’ve seen logs like INVALID_OPERATION. Is this harmful, or just a safe way the SDK rejects redundant calls?

We’re looking for the best way to handle these operations gracefully under unstable network conditions, while keeping the UX responsive.

Thank you again for the insights!

Hi @Thong1

  • Do you think it’s reasonable to apply a manual timeout of 5s to startAudio()/startVideo() to avoid long UI freezes?

It’s not reasonable — a 5-second timeout is inappropriate. As you mentioned, it could involve network connections, and even if you retry after a timeout, you’d still need to reconnect, which would take just as much time.

Since startAudio/startVideo are asynchronous methods, you can provide feedback to the user through the UI by indicating that audio/video is being started — for example, by showing a loading indicator or disabling the control area. You shouldn’t automatically retry in such a short time frame.
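
One way to surface that feedback is sketched below in plain Dart, with no UI framework assumed. `AudioStartState` and `AudioStartController` are illustrative names, not SDK types, and the injected function stands in for the app's own call to startAudio. The UI would show a loading indicator or disable the control while `isBusy` is true, and offer a manual retry once the state becomes `failed`.

```dart
/// Possible states of the microphone control (illustrative names).
enum AudioStartState { idle, starting, started, failed }

class AudioStartController {
  AudioStartController(this._startAudio, {required this.onStateChanged});

  /// Hypothetical wrapper around the SDK's startAudio call.
  final Future<void> Function() _startAudio;

  /// Invoked on every state change so the UI can rebuild.
  final void Function(AudioStartState state) onStateChanged;

  AudioStartState _state = AudioStartState.idle;
  AudioStartState get state => _state;

  /// The error from the most recent failed attempt, if any.
  Object? lastError;

  /// True while the control should be disabled or show a spinner.
  bool get isBusy => _state == AudioStartState.starting;

  /// Starts audio once; ignored while a start is pending or already done.
  Future<void> start() async {
    if (isBusy || _state == AudioStartState.started) return;
    _set(AudioStartState.starting);
    try {
      await _startAudio();
      _set(AudioStartState.started);
    } catch (error) {
      lastError = error;
      _set(AudioStartState.failed);
    }
  }

  void _set(AudioStartState next) {
    _state = next;
    onStateChanged(next);
  }
}
```

Because the user sees the pending state immediately, the wait no longer looks like a frozen UI even when the call takes long to settle.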

  • In case of overlapping calls (e.g., when a retry is triggered before the SDK call internally resolves), we’ve seen logs like INVALID_OPERATION. Is this harmful, or just a safe way the SDK rejects redundant calls?

It is not harmful: the SDK rejects duplicate calls to prevent inconsistent states.
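
On the application side, overlapping calls can be avoided by sharing one pending future among all callers. The sketch below is an assumption about how an app might do this in Dart; `SingleFlight` is a hypothetical helper, not an SDK feature. It matters here because Dart's `Future.timeout` stops waiting but does not cancel the underlying call, so a retry issued after a manual timeout can overlap with the original attempt.

```dart
/// Shares one in-flight call among all callers, so a second request made
/// while the first is still pending never reaches the SDK.
class SingleFlight {
  Future<void>? _inFlight;

  /// Runs [action] unless a previous run is still pending, in which case
  /// the pending future is returned instead.
  Future<void> run(Future<void> Function() action) {
    return _inFlight ??= action().whenComplete(() {
      // Clear the slot on success or failure so later calls start afresh.
      _inFlight = null;
    });
  }
}
```

With a shared instance, `guard.run(startAudioProcess)` can be called from several places without triggering a duplicate start while one is pending.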

Thanks
Vic

Hi Vic,

Thanks for the clarification — that’s very helpful. I’ve noted your points and will take them into consideration as we review our implementation.

Appreciate your support!

Best,
Thong