3CX MOS Scoring Explained: What Good Voice Quality Actually Looks Like

MOS is the number that tells you whether calls sound good. Here's what the scale actually means, how Sikurd computes it for 3CX instances, what causes bad scores, and where to draw your alert thresholds.

What MOS actually measures

Mean Opinion Score is an audio-quality rating on a 1–5 scale. It started life in the 1990s as a tool for analog telephony research — sit a room of listeners in front of audio samples, have them rate each one from 1 (Bad) to 5 (Excellent), average the results. That average is the MOS for that sample.

The scale that everyone uses today is:

  • 5.0 — Excellent. Toll-quality audio. Indistinguishable from in-person conversation. Practically unachievable on real-world VoIP; high-quality G.722 wideband on a clean LAN tops out around 4.5.
  • 4.0–4.4 — Good. Calls sound clear, no perceptible defects. The standard SLA target for commercial VoIP.
  • 3.6–3.9 — Fair. Listeners can tell the call is degraded but they can still complete it without asking the other party to repeat. The "noticeable but not blocking" tier.
  • 3.0–3.5 — Poor. Calls work but listeners are starting to ask for repeats. This is where customer tickets begin.
  • Below 3.0 — Bad. Calls are barely usable; people give up and try again later. Outage-equivalent for a VoIP service.

For 3CX specifically: targets vary by codec and use case, but 4.0 is the floor most carriers and SBCs aim for. Anything that stays below 3.5 for a sustained period is a real problem.

Pure MOS vs. inferred MOS

Two flavours of the metric, doing very different things:

Pure MOS (subjective)

The original definition — humans listen to audio samples and score them. Used in carrier-grade testing labs and in the design of new codecs. Useless for live network monitoring because you can't have a person listening to every call in real time.

Inferred MOS (objective, calculated)

The MOS your monitoring tool reports. Computed from objective network measurements — usually round-trip latency, jitter (variance in packet arrival timing), and packet loss percentage. The math comes from the ITU-T G.107 standard, also called the E-Model.

The G.107 formula is approximately:

R = 93.2 – Id – Ie
  where:
  Id = delay impairment (latency-driven)
  Ie = equipment impairment (codec + packet loss + jitter)

MOS = 1 + 0.035·R + 7e-6·R·(R-60)·(100-R)

R is the raw rating factor on a 0–100 scale; the second equation maps it back into the familiar 1–5 MOS scale. You don't need to memorize this — your monitoring tool runs the math. What matters is that the inputs (latency, jitter, loss) are measurable in real time and the output correlates closely enough with how listeners actually rate calls that the industry treats inferred MOS as the working standard.
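As a minimal sketch, the second equation fits in a few lines of Python. The function name mos_from_r is illustrative. The clamping at both ends follows the usual G.107 convention (MOS 1 at R ≤ 0, MOS 4.5 at R ≥ 100). The floor at 1.0 is a guard added here because the cubic dips just below 1 for very small R.

```python
def mos_from_r(r: float) -> float:
    """Map an E-Model rating factor R (0-100) to an estimated MOS (1-4.5)."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    mos = 1 + 0.035 * r + 7e-6 * r * (r - 60) * (100 - r)
    # The cubic dips just below 1.0 for very small R, so floor the result.
    return max(1.0, mos)


# With no delay or equipment impairment, R stays at the 93.2 default.
print(round(mos_from_r(93.2), 2))  # 4.41
```

This is also why inferred MOS never reaches 5.0: the mapping caps out at 4.5, and a perfect path at the 93.2 default lands at about 4.41.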

How Sikurd calculates MOS for 3CX

Sikurd's probe layer measures three things on every poll cycle:

  1. Latency — TCP-connect time to the instance's FQDN on port 5060 (the SIP signaling port). Measures the round-trip from Sikurd's probe network to the PBX's edge.
  2. Jitter — variance in connect time across the last N probes. A connect that took 40ms last minute and 95ms this minute has high jitter — the same audio path will deliver inconsistent packet timing.
  3. Packet loss — fraction of probes that timed out or failed to complete in the rolling window.
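The three measurements above can be sketched roughly as follows. This is an illustration under stated assumptions, not Sikurd's actual probe code: the function names, the 2-second timeout, and the use of standard deviation as the jitter measure are all hypothetical choices.

```python
import socket
import statistics
import time
from typing import Optional


def tcp_connect_ms(host: str, port: int = 5060, timeout: float = 2.0) -> Optional[float]:
    """Return TCP connect time in milliseconds, or None if the probe failed."""
    start = time.perf_counter()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return (time.perf_counter() - start) * 1000.0
    except OSError:
        # Covers timeouts, refused connections, and DNS failures.
        return None


def summarize(samples: list[Optional[float]]) -> dict[str, float]:
    """Reduce a rolling window of probe results to latency, jitter, and loss."""
    ok = [s for s in samples if s is not None]
    loss_pct = 100.0 * (len(samples) - len(ok)) / len(samples) if samples else 0.0
    latency_ms = statistics.fmean(ok) if ok else float("nan")
    # Assumption: jitter is the population standard deviation of connect
    # times. A real probe might use a different spread measure.
    jitter_ms = statistics.pstdev(ok) if len(ok) > 1 else 0.0
    return {"latency_ms": latency_ms, "jitter_ms": jitter_ms, "loss_pct": loss_pct}
```

For example, summarize([40.0, 95.0, None, 40.0]) reports 25% loss, a mean latency of about 58ms, and jitter of about 26ms.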

Those three numbers go into the G.107 formula above. The result is inferred MOS, surfaced on the instance's Network Quality card and the dashboard's Health & Uptime tab. The probe runs every 60 seconds; the MOS displayed is a moving average over the most recent six probes (so a single bad probe doesn't tank the score).
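The six-probe smoothing can be sketched with a fixed-length queue. The class name SmoothedMos is hypothetical, and this is only one plausible way to implement a moving average, not a description of Sikurd's internals.

```python
from collections import deque


class SmoothedMos:
    """Moving average over the most recent `window` per-probe MOS values."""

    def __init__(self, window: int = 6) -> None:
        # deque(maxlen=...) silently drops the oldest value when full.
        self._values = deque(maxlen=window)

    def add(self, mos: float) -> float:
        """Record one probe's MOS and return the current displayed value."""
        self._values.append(mos)
        return sum(self._values) / len(self._values)


smoother = SmoothedMos()
for _ in range(5):
    smoother.add(4.3)
# One bad probe after five good ones only pulls the average down a little.
print(round(smoother.add(2.0), 2))  # 3.92
```

In this example the displayed score stays above a 3.5 warning line even though the latest probe alone would have scored 2.0.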

Caveat: this is a path-level measurement, not a per-call measurement. Sikurd is measuring the network path between its probe network and the 3CX instance's FQDN. That correlates well with the conditions a remote caller would experience reaching the same FQDN — but if the customer's issue is on their internal LAN (a Wi-Fi access point's QoS misconfigured, for example), our probe won't see it. Per-call MOS would require parsing 3CX's CDR logs in real time and is on our roadmap, not in production today.

What causes a low MOS score

1. Latency creep (the slow killer)

ITU-T G.114 puts the "good" latency boundary at around 150ms one-way (300ms round-trip). Below that, conversation feels natural. Above 200ms one-way (400ms round-trip), callers start talking over each other because the audio doesn't arrive in time for the natural pause cadence, and every call feels like satellite-era international calling.

Common causes: a routing change at the customer's ISP that adds a hop, a VPN added between phones and PBX, a TURN-relay added for remote users that adds 50–100ms to every packet. Latency creep is rarely catastrophic — calls still complete — but customers will mention it offhand ("calls feel a bit laggy lately").
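To see why latency creep is slow rather than catastrophic, it helps to push a few delays through the E-Model. The sketch below assumes a widely cited simplified fit for the delay impairment (from Cole and Rosenbluth's work on VoIP monitoring), with a knee at 177.3ms one-way. That fit is an assumption chosen for illustration and may not be what any particular monitoring tool uses. Ie is set to zero, meaning no loss and no codec penalty.

```python
def delay_impairment(one_way_ms: float) -> float:
    """Approximate Id from one-way delay (simplified fit, an assumption)."""
    knee_ms = 177.3  # beyond this delay the impairment grows much faster
    extra = max(0.0, one_way_ms - knee_ms)
    return 0.024 * one_way_ms + 0.11 * extra


def mos_from_r(r: float) -> float:
    """Map rating factor R to MOS using the conversion quoted earlier."""
    r = min(max(r, 0.0), 100.0)
    return max(1.0, 1 + 0.035 * r + 7e-6 * r * (r - 60) * (100 - r))


for delay in (50, 150, 200, 400):
    r = 93.2 - delay_impairment(delay)  # Ie assumed to be 0
    print(f"{delay:>3} ms one-way -> R={r:.1f}, MOS={mos_from_r(r):.2f}")
```

Under these assumptions the score barely moves up to about 150ms one-way, then falls off more steeply once the delay passes the knee.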

2. Jitter (the most common culprit)

Jitter is the variance in packet arrival times. If packets are leaving the source at evenly spaced 20ms intervals but arriving at 15ms / 22ms / 28ms / 12ms, the jitter buffer at the receiver has to decide whether to wait for late packets (delay) or drop them (audio gaps). Either way, the audio degrades.
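Here is the example above worked through in Python, treating the four figures as gaps between consecutive arrivals (an assumption about how to read them). Two measures are shown: a plain mean absolute deviation from the 20ms send interval, and a running estimate in the style of RFC 3550's interarrival jitter, which smooths each new deviation with a 1/16 gain. Both function names are illustrative.

```python
def arrival_gap_jitter(gaps_ms: list[float], send_interval_ms: float = 20.0) -> float:
    """Mean absolute deviation of arrival gaps from the send interval."""
    deviations = [abs(g - send_interval_ms) for g in gaps_ms]
    return sum(deviations) / len(deviations)


def rfc3550_jitter(gaps_ms: list[float], send_interval_ms: float = 20.0) -> float:
    """Running interarrival jitter estimate in the style of RFC 3550."""
    jitter = 0.0
    for gap in gaps_ms:
        d = abs(gap - send_interval_ms)
        jitter += (d - jitter) / 16.0  # 1/16 smoothing gain from the RFC
    return jitter


gaps = [15.0, 22.0, 28.0, 12.0]
print(arrival_gap_jitter(gaps))        # 5.75
print(round(rfc3550_jitter(gaps), 3))  # 1.336
```

The smoothed estimate starts at zero and converges slowly, so after only four packets it reads much lower than the plain average. Over a real call with thousands of packets the two settle closer together.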

≤10ms jitter is excellent. 20ms is acceptable. 50ms+ is when audio starts to gap audibly. Jitter is the #1 cause of "the calls sound bad sometimes" tickets that have no obvious culprit.

Common cause: a saturated uplink at the customer site. A 20Mbps cable connection that's running at 18Mbps for the day's backup job has zero headroom for VoIP packets to leave promptly; they queue, jitter spikes, MOS tanks.

3. Packet loss (the obvious one)

Anything above 1% loss starts producing audible audio drops. At 3% loss, every other sentence has a missing syllable. 5%+ and calls are barely usable.

Common causes: a flaky modem, a network cable with intermittent contact, a failing switch port or NIC, an ISP routing through congested paths. Easy to spot with a sustained-loss MOS alert; harder to diagnose without traceroute-style on-call instrumentation.

Where to draw alert thresholds

Three thresholds work well for most MSP fleets:

  • 3.5 → Warning. Sustained MOS below 3.5 for six consecutive minutes. The "something is starting to bother callers" line. Push notification, no escalation.
  • 3.0 → Critical. Sustained MOS below 3.0 for three consecutive minutes. Customers are calling. Push + PSA ticket + (on Pro+) AI voice call to the on-call rotation.
  • Recovery → resolved. Auto-resolve when MOS recovers above 3.7 for ten consecutive minutes. Prevents flap.

Sikurd's default threshold is 3.5 with six-minute sustained logic and an auto-resolve at 3.7. Per-instance overrides are available: a customer in a known-bad, fibre-poor region might tolerate a 3.3 floor without complaint, while a white-glove enterprise account might want alerts at 3.7.
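The sustained-and-recover logic can be sketched as a small state machine fed one MOS reading per minute. The class and field names are hypothetical, and the behaviour is simplified: a critical alert stays critical until the recovery condition is met rather than stepping back down to warning. Treat it as an illustration of the idea, not Sikurd's alert engine.

```python
from dataclasses import dataclass


@dataclass
class MosAlerter:
    """Tracks consecutive per-minute MOS readings against three thresholds."""
    warn_below: float = 3.5
    warn_minutes: int = 6
    crit_below: float = 3.0
    crit_minutes: int = 3
    resolve_above: float = 3.7
    resolve_minutes: int = 10
    state: str = "ok"
    _warn_run: int = 0
    _crit_run: int = 0
    _good_run: int = 0

    def observe(self, mos: float) -> str:
        """Feed one reading (one per minute) and return the resulting state."""
        # Each counter tracks an unbroken run; any miss resets it to zero.
        self._warn_run = self._warn_run + 1 if mos < self.warn_below else 0
        self._crit_run = self._crit_run + 1 if mos < self.crit_below else 0
        self._good_run = self._good_run + 1 if mos > self.resolve_above else 0

        if self._crit_run >= self.crit_minutes:
            self.state = "critical"
        elif self._warn_run >= self.warn_minutes and self.state == "ok":
            self.state = "warning"
        elif self._good_run >= self.resolve_minutes:
            self.state = "ok"
        return self.state
```

The gap between the 3.5 trigger and the 3.7 recovery line is what prevents flapping: a score hovering at 3.6 neither raises a new alert nor clears an existing one.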

The signal MOS catches that nothing else does

The reason MOS matters more than uptime or trunk-registration metrics: it's the only thing that fires for the silent failure mode of "the call connects but it sounds bad."

SIP REGISTER keep-alives are tiny — a few hundred bytes every few minutes. They succeed long after audio quality has degraded. A trunk can report Registered for hours while customers are complaining about choppy calls and lost syllables. The dashboard says everything is green; the phone is ringing with tickets.

MOS is the canary. It measures the conditions that actually determine audio quality. If your MOS drops below 3.5 for an hour, you have a real problem brewing — typically 12 to 48 hours ahead of the first customer ticket. Catching this before the ticket lands is the entire value proposition of inferred MOS monitoring.

What MOS won't tell you

Don't oversell what MOS measures. It is:

  • A network-path quality estimate, not an audio-call recording analysis.
  • Aggregated across time, not per-call.
  • Computed from the probe network to the PBX, not from a remote caller's last-mile network.

A MOS-monitoring tool catches network-path degradation reliably. It won't catch a single bad call due to one user's broken headset. For that you need per-call CDR analysis — different feature, different cost, different value.

Frequently asked questions

What is a good MOS score?
MOS runs from 1.0 (unusable) to 5.0 (toll-quality). Anything ≥4.0 is considered Good for VoIP; 3.6–3.9 is Fair (acceptable but noticeable degradation); below 3.6 is Poor and callers will complain. Most carriers target 4.0 as their SLA floor.
How is MOS calculated for 3CX?
Pure MOS comes from a subjective listening test (humans rating calls 1–5) and is impractical to measure live. Inferred MOS is computed from objective network metrics — latency, jitter, packet loss — using the ITU-T G.107 (E-Model) formula. Sikurd probes each instance's FQDN over TCP, measures the same three values, and runs them through G.107 to produce a per-instance score every minute.
What MOS threshold should I set for alerts?
3.5 is the typical "warn" threshold and 3.0 is the typical "page" threshold. Below 3.5 means callers are starting to complain; below 3.0 means the calls are barely usable. Sikurd's default alert fires when inferred MOS stays below 3.5 for six consecutive probes (six minutes), which filters one-off jitter spikes from sustained degradation.
What causes a low MOS score?
Three things move MOS: latency (one-way ≥150ms, about 300ms round-trip, starts to hurt), jitter (variance in packet arrival timing; beyond 20ms quality starts to slip, and at 50ms+ audio gaps audibly), and packet loss (≥1% starts to be audible). Each contributes independently. In practice 90% of bad MOS is the customer's ISP doing something: a saturated uplink, a routing change, a faulty modem. The other 10% is a problem at the 3CX side or the carrier's side.
Can MOS scoring catch problems before customers complain?
Yes, that's the whole point. SIP REGISTER keep-alives are tiny UDP packets that keep succeeding long after RTP audio quality has degraded. A trunk can read "Registered" while customers are hearing choppy audio. MOS-based monitoring catches the gap. A sustained MOS drop is usually 12–48 hours ahead of the first customer ticket.
Does Sikurd monitor MOS in real-time?
Sikurd probes each instance's FQDN on a one-minute cycle and computes inferred MOS from the latency / jitter / loss measurements. It's path-level rather than call-level, so it catches the same network conditions that produce bad calls without inspecting any individual call. Per-call MOS would require parsing 3CX's CDR logs in real time and is on the roadmap, not in production today.

Catch voice-quality problems before customers do.

Sikurd probes every 3CX instance every minute and computes inferred MOS from latency, jitter, and packet loss. Threshold alerts typically fire 12–48 hours before the first customer ticket. Pro and above.