The speaker sometimes does not work when using Puppeteer with a fake media device

Description
We’re using Puppeteer to launch a Chrome instance with the --use-fake-device-for-media-stream flag. However, we occasionally encounter an issue where the speaker doesn’t work—we can’t hear the other participants speaking.

Upon checking the Zoom management page, we found that the microphone status showed an error: getUserMedia failed.

Could you help investigate the cause of this issue? It only occurs intermittently; in most cases, everything works as expected.

Here is the session ID for reference: 3SS0uapFTLyfj7n33YeFGA==

Which Web Video SDK version?
2.1.0

Video SDK Code Snippets

  async startAudio(mute, speakerOnly) {
    try {
      if (speakerOnly) {
        await state.mediaStream.startAudio({ speakerOnly: true });
      } else {
        await state.mediaStream.startAudio({ mute: mute });
      }
    } catch (error) {
      Logger.error('[VIDEO-SDK JS] startAudio error', error);
    }
  }

Device

  • OS: Linux
  • Browser: Chrome: GoogleChromeSAB
  • Browser Version: 134

Hey @lmtruong1512

Thanks for your feedback.

3SS0uapFTLyfj7n33YeFGA==

After analyzing the log, we found that when attempting to capture audio, the browser returned a NotReadableError. This may be due to the improper configuration of the fake media. Did you also set the --use-file-for-fake-audio-capture argument?

If your goal is only to hear remote users’ audio, you can set the speakerOnly option to true.
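
For reference, the launch flags discussed above could be assembled roughly as follows. This is a minimal sketch, not the poster's actual setup: `buildFakeMediaArgs` and the WAV path are hypothetical, and only the two Chrome switches named in this thread are used.

```javascript
// Hypothetical helper: builds the Chrome switches mentioned in this thread.
// The function name and the file path are illustrative assumptions.
function buildFakeMediaArgs(fakeAudioFile) {
  // Replaces real capture devices with Chrome's built-in fake devices.
  const args = ['--use-fake-device-for-media-stream'];
  if (fakeAudioFile) {
    // Feeds the fake microphone from a file instead of the generated test signal.
    args.push(`--use-file-for-fake-audio-capture=${fakeAudioFile}`);
  }
  return args;
}

// With Puppeteer, the array would be passed as the `args` launch option:
//   const browser = await puppeteer.launch({ args: buildFakeMediaArgs('/path/to/audio.wav') });
console.log(buildFakeMediaArgs('/path/to/audio.wav'));
```

If the BOT joins with speakerOnly set to true, the capture file is not needed, since no microphone capture is attempted in that mode.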

Thanks
Vic

@vic.yang Thanks for your response.

We only want to hear remote users’ audio. If I set the speakerOnly option to true, will startAudio still work and allow us to hear others even if the browser throws a NotReadableError?
Or does setting speakerOnly = true ensure that the NotReadableError will never occur?

Hi @lmtruong1512

If I set the speakerOnly option to true, will startAudio still work and allow us to hear others even if the browser throws a NotReadableError?

If the speakerOnly option is set to true, we will not capture audio, only play the remote audio, so the NotReadableError will not occur.

Thanks
Vic

We got it, thanks, @vic.yang

Hi @vic.yang,

Could you help investigate why we’re unable to render the video stream from other users?
Could this be related to the microphone issue mentioned earlier?

This is the sessionId: rsNXr1RbS0uTT1p+GkzrKA==

Thank you in advance!

Hi @lmtruong1512

rsNXr1RbS0uTT1p+GkzrKA==

After analyzing the logs, we found that the user experienced significant delay, which could be the cause of the video render issue.

Thanks
Vic

@vic.yang Could you help troubleshoot whether the delay was caused by a Zoom SDK problem, a network issue, or resource limitations on the client side (e.g., memory or CPU usage)?

Hi @lmtruong1512

rsNXr1RbS0uTT1p+GkzrKA==

Around 5 minutes after the session started, the uplink and downlink of UserID: 16778240 dropped sharply, which could be the cause of the video issue.

Thanks
Vic

@vic.yang Could you help investigate the issue where both the microphone and speaker are not accessible for the RECORDING_BOT when starting audio with the speakerOnly option set to true in the following sessions:

9aF8YPhEQgaHxr4hoZwG3w==
UsIoxcsDSgerPxuK2NZwIQ==

await state.mediaStream.startAudio({ speakerOnly: true });

Here is the relevant information from the Zoom Management Page:

@vic.yang In that case, is the RECORDING_BOT able to hear other participants speaking?

Hi @lmtruong1512

9aF8YPhEQgaHxr4hoZwG3w==
UsIoxcsDSgerPxuK2NZwIQ==

After analyzing the logs, these two BOT users did not successfully start audio, possibly due to the browser’s autoplay policy requiring user interaction.

You can listen for the auto-play-audio-failed event.

If you’re using Puppeteer, consider adding the launch flag --autoplay-policy=no-user-gesture-required.
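
A listener for that event could look roughly like the sketch below. It assumes `client` is the Video SDK client object that exposes `on`; `watchAutoplayFailure`, `onBlocked`, and the stand-in `fakeClient` are hypothetical names used only so the example runs without the SDK.

```javascript
// Sketch: record when the SDK reports that autoplay blocked audio.
// `client` is assumed to be the Video SDK client (the object that exposes `on`);
// `watchAutoplayFailure` and `onBlocked` are hypothetical names.
function watchAutoplayFailure(client, onBlocked) {
  client.on('auto-play-audio-failed', () => {
    // Pass a timestamp so the failure can be matched against session logs.
    onBlocked(new Date().toISOString());
  });
}

// Stand-in client so the sketch runs without the SDK loaded.
const fakeClient = {
  handlers: {},
  on(name, fn) { this.handlers[name] = fn; },
};
const blockedAt = [];
watchAutoplayFailure(fakeClient, (timestamp) => blockedAt.push(timestamp));
fakeClient.handlers['auto-play-audio-failed'](); // simulate the SDK emitting the event
console.log('autoplay blocked at', blockedAt);
```

For a headless BOT, the callback would typically write to the BOT's own log rather than prompt a user.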

Thanks
Vic

@vic.yang Do you know why this happens randomly with the same code? Could it be related to the timing of the startAudio call, for example if it is being called too soon?

Hi @lmtruong1512

If stream.startAudio is called immediately upon joining a session, it might fail. Use the auto-play-audio-failed event to get notified of this issue so you can prompt users to interact with the page to restore audio.

‘Called immediately upon joining a session’ here means there’s no opportunity for user interaction between joining the session and calling startAudio.

For more details on auto-play policy, you can refer to the documentation on MDN.

Thanks
Vic

@vic.yang Thank you for the response.

May I ask a bit more? Is it possible that startAudio sometimes doesn’t return an error directly, and instead we need to listen for the auto-play-audio-failed event? Similarly, for video, in cases where errors aren’t caught by startVideo,… how should I handle those situations?

Hi @lmtruong1512

startAudio has a timeout mechanism — if it fails to join audio within 40 seconds, it will return a timeout error.

When auto-play requirements aren’t met, the Video SDK interrupts the join process and emits the auto-play-audio-failed event. In that event, simply guiding the user to interact with the page will allow the SDK to resume the join process.

startVideo doesn’t have this issue, as browsers don’t impose autoplay restrictions on video capture.
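
The two audio failure paths described above (a rejected startAudio call and the auto-play-audio-failed event) could be observed together with something like the following. This is a sketch under assumptions: it presumes `client` exposes `on`/`off` and that `stream.startAudio` returns a promise, and `startAudioWithDiagnostics` is a hypothetical helper rather than an SDK API.

```javascript
// Sketch: report both failure paths for a single startAudio attempt.
// Assumes `client` exposes `on`/`off` and `stream.startAudio` returns a promise;
// `startAudioWithDiagnostics` is a hypothetical helper, not part of the SDK.
async function startAudioWithDiagnostics(client, stream, options) {
  let autoplayBlocked = false;
  const onBlocked = () => { autoplayBlocked = true; };
  client.on('auto-play-audio-failed', onBlocked);
  try {
    await stream.startAudio(options);
    return { ok: true, autoplayBlocked };
  } catch (error) {
    // Covers rejections such as the timeout error mentioned above.
    return { ok: false, autoplayBlocked, error };
  } finally {
    // Always detach the listener so repeated attempts do not stack handlers.
    client.off('auto-play-audio-failed', onBlocked);
  }
}
```

Logging the returned object for each BOT session would show whether a failed session went through the autoplay path or the timeout path.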

Thanks
Vic

@vic.yang Following up on the reason why BOT users sometimes fail to start audio:

it may be due to the browser’s autoplay policy, which requires user interaction before playing audio.

To start a BOT joining a meeting, we launch a separate EC2 instance for each BOT using a headless browser, which joins the meeting without any manual user interaction. However, this issue only occurs in rare cases — most BOT sessions do not encounter autoplay policy errors.

At this time, we haven’t been able to reliably reproduce the error, which makes it difficult to pinpoint and address the root cause with supporting evidence. Do you have any suggestions or recommendations that might help us investigate further? Thank you very much!

Hi @lmtruong1512

Did you try the --autoplay-policy=no-user-gesture-required launch option?

As for the random cases, they still need to be investigated individually.

Thanks
Vic

@vic.yang Thank you for your response.

Our problem is that we haven’t been able to reproduce the error to confirm whether it’s actually caused by the autoplay policy (user-gesture-required). Because of this, we’re uncertain if adding --autoplay-policy=no-user-gesture-required will fully resolve the issue.

In normal cases, when we check the autoplay policy via about://media-engagement, it already shows no-user-gesture-required—even without explicitly setting the flag.

We want to avoid this bug occurring again in our production environment. Do you have any suggestions on how to reproduce this issue?

@vic.yang

  1. Could you recommend the best practice for when to start audio?
    Should we call startAudio immediately after joining the session, or wait for a specific event?
  2. Is it safe to call stream.startAudio and stream.switchSpeaker in parallel?
    Are there any potential issues, and what would be the recommended approach for this scenario?

Thank you in advance!