what is acceptable jitter for voip and videoconferencing?

William Herrin bill at herrin.us
Thu Sep 21 22:31:53 UTC 2023


On Thu, Sep 21, 2023 at 6:28 AM Tom Beecher <beecher at beecher.cc> wrote:
> My understanding has always been that 30ms was set based on human perceptibility. 30ms was the average point at which the average person could start to detect artifacts in the audio.

Hi Tom,

Jitter doesn't necessarily cause artifacts in the audio. Modern
applications implement what's called a "jitter buffer." As the name
implies, the buffer collects and delays audio for a brief time before
playing it for the user. That gives packets which were delayed a
little longer (the jitter) time to catch up with the earlier ones
before their turn comes to be played. Smart implementations adjust
the size of the jitter buffer to match the observed variation in
delay, so that sound quality remains the same regardless of jitter.
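
Roughly, the adaptive idea looks like this. This is a toy Python
sketch, not any particular product's algorithm; the class name,
window size, and median-based sizing are all illustrative:

    import heapq
    import statistics

    class JitterBuffer:
        """Toy adaptive jitter buffer: hold packets just long enough
        to absorb the observed variation in delay before playout."""

        def __init__(self, min_delay=0.02, window=100):
            self.min_delay = min_delay  # floor on playout delay (s)
            self.window = window        # transit samples to keep
            self.samples = []           # recent one-way transit times
            self.heap = []              # packets ordered by sequence

        def push(self, seq, sent_time, arrival_time, payload):
            # Only the spread of transit times matters, so any fixed
            # clock offset between sender and receiver cancels out.
            self.samples.append(arrival_time - sent_time)
            self.samples = self.samples[-self.window:]
            heapq.heappush(self.heap, (seq, arrival_time, payload))

        def playout_delay(self):
            # Size the buffer to cover the worst observed variation:
            # slowest transit minus typical transit in the window.
            if len(self.samples) < 2:
                return self.min_delay
            spread = max(self.samples) - statistics.median(self.samples)
            return max(self.min_delay, spread)

        def pop_ready(self, now):
            # Release packets that have waited out the playout delay.
            ready = []
            while self.heap and now - self.heap[0][1] >= self.playout_delay():
                seq, _, payload = heapq.heappop(self.heap)
                ready.append((seq, payload))
            return ready

The point of sizing by observed spread rather than a fixed constant
is that the same buffer rides out anything from 5ms to 800ms of
jitter -- at the cost of added delay, which is the catch below.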

Indeed, on Zoom I barely noticed audio artifacts from a friend who
was experiencing 800ms of jitter. Yes, really, 800ms. We had to quit
our gaming session because it made his character's actions utterly
erratic, but his audio came through okay.

The problem, of course, is that instead of the audio delay being the
average packet delay, it becomes the maximum packet delay. You start
to have problems with people talking over each other: when they start
speaking, they can't yet hear that the other person is already
talking. "Sorry, go ahead. No, you go ahead."
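
Put numerically (the ~150ms figure is the one-way delay bound
ITU-T G.114 recommends for comfortable conversation; the rest of
the numbers are made up for illustration):

    def mouth_to_ear_delay(network_delay, jitter_buffer_depth):
        # With a jitter buffer in play, the listener hears audio at
        # the base network delay PLUS the full buffer depth --
        # effectively the worst-case packet delay, not the average.
        return network_delay + jitter_buffer_depth

    # 40ms of base delay plus an 800ms buffer (my friend's case)
    # is 840ms one way -- far past the ~150ms where conversational
    # turn-taking starts to break down.
    print(f"{mouth_to_ear_delay(0.040, 0.800):.3f} s")  # -> 0.840 s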

Regards,
Bill Herrin


-- 
William Herrin
bill at herrin.us
https://bill.herrin.us/

