The 3am problem with silent alerts
Most alerting works beautifully during business hours. An alert fires, a push notification lands, a Slack message pops, a PSA ticket opens, and someone on the team sees it within a minute or two. The on-call rotation is awake, at a keyboard, looking at screens.
Overnight, every one of those channels quietly degrades. Phones go into Do Not Disturb. Notifications get batched, summarized, or silenced entirely. The Slack app is muted until morning. A push notification glowing on a nightstand at 3:14am does nothing if the phone is face-down and silent — and the one incident where that matters most is the one where a customer's entire phone system is down and the morning starts with an angry call instead of a resolved ticket.
The uncomfortable truth for any MSP running a 24×7 promise is that visual notifications are easy to miss precisely when missing them is most expensive. You don't need a louder push notification. You need a different channel — one a sleeping human is conditioned to answer no matter what.
A ringing phone is that channel. People answer phones at 3am. That instinct is exactly what voice-call escalation borrows.
What AI voice-call alerts actually do
When a CRITICAL alert fires in Sikurd on an alert type you've opted into for voice, Sikurd places an outbound phone call to your on-call person. When they pick up, an AI voice reads a short, factual script describing the incident, the kind of sentence a tired human can parse on the first listen:
"There is a service down on instance Acme HQ. The service is 3CX Phone System."
The script is built from the alert itself, so it names the right thing for the right incident. A stopped service names the service. An offline trunk names the trunk. An unreachable instance says so by name. For threshold alerts where reading a number aloud would sound robotic — CPU, disk, voice-quality — the script states the condition and the instance and stops there. The goal is one clear spoken sentence: what broke, and on which PBX. Enough to get someone upright and reaching for a laptop knowing what they're about to deal with.
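As a rough illustration of how a script like this could be assembled from an alert, here is a minimal sketch. The `Alert` fields, the alert-type identifiers, and the template wording other than the "service down" example are assumptions made for illustration, not Sikurd's actual data model or phrasing.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Alert:
    alert_type: str                 # hypothetical identifier, e.g. "service_down"
    instance_name: str              # the PBX the alert belongs to
    subject: Optional[str] = None   # service or trunk name, where one applies


# Hypothetical templates: (opening sentence, optional detail sentence).
# Threshold alerts carry no detail sentence, so no number is read aloud.
_TEMPLATES = {
    "service_down": ("There is a service down on instance {instance}.",
                     "The service is {subject}."),
    "trunk_down": ("There is a trunk offline on instance {instance}.",
                   "The trunk is {subject}."),
    "instance_unreachable": ("Instance {instance} is unreachable.", None),
    "cpu_high": ("CPU usage is high on instance {instance}.", None),
}

_DEFAULT = ("There is a critical alert on instance {instance}.", None)


def build_voice_script(alert: Alert) -> str:
    """Build one or two short spoken sentences from the alert."""
    opening, detail = _TEMPLATES.get(alert.alert_type, _DEFAULT)
    parts = [opening.format(instance=alert.instance_name)]
    # Only name the service or trunk when the alert actually has one.
    if detail and alert.subject:
        parts.append(detail.format(subject=alert.subject))
    return " ".join(parts)


print(build_voice_script(Alert("service_down", "Acme HQ", "3CX Phone System")))
```

Keeping the templates as plain data makes it easy to review every sentence a caller might hear before any of them is spoken at 3am.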
Crucially, the call doesn't replace anything. It runs alongside the channels you already rely on. The same incident still pushes to every device on the account, still opens a PSA ticket in Autotask, ConnectWise, HaloPSA, or Syncro, and still posts to Slack or Teams if you've wired those up. The phone call is the escalation layer on top — the one that exists specifically to defeat a silenced phone.
How the escalation works, step by step
Voice calling is one configuration in Sikurd's alert-routing matrix — a per-alert-type, per-severity switch, alongside push, PSA, and webhook channels. When you turn it on for, say, "Service Down," here's the sequence that runs the moment an instance starts failing a poll:
- A 3CX instance you connected fails a health check: for example, the 3CX Phone System service has stopped, the instance has become unreachable, or all trunks have dropped.
- Sikurd creates the alert record and an audit-log entry within about 60 seconds of the failure.
- The configured channels dispatch in parallel: a push notification to every device on the account, a PSA ticket in your connected PSA, and a chat message to Slack or Teams.
- If the alert type has voice calling enabled and the alert is CRITICAL, Sikurd resolves who to call and places an outbound phone call to that number.
- The AI voice reads the incident aloud — instance name, and the specific service or trunk where one applies — then ends the call.
- Your on-call acknowledges by tapping the parallel push notification, which opens the alert in the Sikurd dashboard.
Every channel is fire-and-forget and independent. None of them can block another, and a problem on one never stops the rest. If the voice provider is having a bad night, the push and the ticket still land. That independence is deliberate — escalation that introduces a single point of failure isn't escalation, it's a new way to miss an incident.
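The independence described above can be sketched as follows. This is an illustrative pattern, not Sikurd's implementation: the channel names and the `dispatch_all` helper are hypothetical. Each channel runs in its own worker and has its own error handling, so an exception in one never reaches the others. The sketch waits for the results only so the outcome can be inspected; a true fire-and-forget dispatcher would not wait.

```python
import logging
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict

log = logging.getLogger("dispatch")


def dispatch_all(alert: dict,
                 channels: Dict[str, Callable[[dict], None]]) -> Dict[str, bool]:
    """Send the alert on every channel in parallel; report per-channel success."""

    def run(name: str, send: Callable[[dict], None]) -> bool:
        try:
            send(alert)
            return True
        except Exception:
            # A failing channel is logged and contained; it cannot block the rest.
            log.exception("channel %s failed for alert %s", name, alert.get("id"))
            return False

    with ThreadPoolExecutor(max_workers=max(1, len(channels))) as pool:
        futures = {name: pool.submit(run, name, send)
                   for name, send in channels.items()}
        return {name: future.result() for name, future in futures.items()}


def broken_voice_provider(alert: dict) -> None:
    raise RuntimeError("voice provider is having a bad night")


delivered = []
results = dispatch_all(
    {"id": "alert-1"},
    {
        "push": lambda a: delivered.append("push"),
        "psa": lambda a: delivered.append("psa"),
        "voice": broken_voice_provider,
    },
)
print(results)
```

Here the push and PSA sends succeed even though the voice channel raises, which is the property the paragraph above describes.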
- It's opt-in, per alert type. Voice calling is off until you switch it on for a specific alert type. You choose exactly which incidents are worth a ringing phone, so nothing calls a phone by surprise.
- It's reserved for CRITICAL severity. Even on an enabled alert type, the call only fires for incidents that reach CRITICAL severity. Warnings and informational alerts go to the quieter channels.
- It's best-effort and non-blocking. The call is dispatched independently of every other channel. A provider outage, an unreachable number, or no on-call number on file degrades gracefully: the incident is never dropped, and the failure is logged.
- Acknowledgement is via the push link. There's no keypad acknowledgement on the call itself yet. The on-call taps the parallel push notification to open and acknowledge the alert in the dashboard.
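Taken together, these rules amount to a small decision function. The sketch below is one way to express them, under assumptions: the function name, the outcome strings, and the `place_call` callback are hypothetical and stand in for whatever the real voice integration looks like.

```python
import logging
from typing import Callable, Optional

log = logging.getLogger("voice")


def attempt_voice_call(severity: str,
                       voice_enabled: bool,
                       number: Optional[str],
                       place_call: Callable[[str], None]) -> str:
    """Apply the opt-in and CRITICAL gates, then try the call without ever raising."""
    if not voice_enabled:
        return "skipped_not_opted_in"      # opt-in, per alert type
    if severity != "CRITICAL":
        return "skipped_not_critical"      # reserved for CRITICAL severity
    if number is None:
        log.warning("no on-call number on file; voice call skipped")
        return "skipped_no_number"
    try:
        place_call(number)
    except Exception:
        # Best-effort: log the failure and let the other channels carry the alert.
        log.exception("voice call failed; other channels are unaffected")
        return "failed_logged"
    return "placed"


def provider_outage(number: str) -> None:
    raise ConnectionError("provider outage")


dialed = []
print(attempt_voice_call("CRITICAL", True, "+14155550134", dialed.append))
print(attempt_voice_call("WARNING", True, "+14155550134", dialed.append))
print(attempt_voice_call("CRITICAL", True, "+14155550134", provider_outage))
```

Returning an outcome instead of raising keeps the caller simple: whatever happens to the call, the rest of the alert pipeline carries on.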
The on-call resolution chain: who gets the call
The hardest part of any escalation isn't placing the call — it's figuring out who to call at 3am without a human in the loop to decide. Sikurd answers that with a deterministic chain. It tries each step in order and calls the first one that yields a usable US phone number:
- The escalation user. When a call is being driven by a late-stage escalation policy — an earlier alert went unacknowledged and the incident climbed the ladder — Sikurd targets the specific user named on that policy first. If that person has no phone on file, it doesn't stop; it falls through to the rest of the chain, because the incident still deserves a voice escalation to whoever is reachable.
- The tenant's current on-call user. This is the rotation pointer an admin sets when on-call hands off. Flip it at the start of a shift and the right person's phone rings without touching any policy.
- The fallback on-call number. An explicit shared line — a NOC desk, an answering service, a duty phone — used when no specific user is on-call or their personal number isn't set.
- The account owner. The last resort, so a tenant that never configured a rotation at all still gets a call rather than silence.
If the chain runs all the way out and finds no usable number anywhere, Sikurd logs it once and skips the call without raising an error. The push notification has already gone out, so the incident is still in front of the team. The voice call is additive — a bonus wake-up on top of the channels that always fire, never a hard dependency the rest of alerting waits on.
One practical note: every number in the chain is sanity-checked before Sikurd dials, so a malformed or non-US entry is simply passed over rather than producing a failed call.
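The chain and the sanity check can be sketched in a few lines. Everything named here is an assumption for illustration: the function names, the source labels, and the validation rule itself, which only checks for a ten-digit North American shape and does not distinguish US numbers from other countries that share the +1 code. Sikurd's actual check may be stricter.

```python
import re
from typing import Optional, Tuple


def normalize_us_number(raw: Optional[str]) -> Optional[str]:
    """Return the number as +1XXXXXXXXXX, or None if it does not look dialable."""
    if not raw:
        return None
    stripped = raw.strip()
    if stripped.startswith("+") and not stripped.startswith("+1"):
        return None                       # explicit non-US country code
    digits = re.sub(r"\D", "", stripped)
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]               # drop the leading country code
    if len(digits) != 10:
        return None
    # Area code and exchange code cannot start with 0 or 1.
    if digits[0] in "01" or digits[3] in "01":
        return None
    return "+1" + digits


def resolve_on_call_number(escalation_user_phone: Optional[str],
                           on_call_user_phone: Optional[str],
                           fallback_number: Optional[str],
                           owner_phone: Optional[str]) -> Optional[Tuple[str, str]]:
    """Walk the chain in order; return (source, number) for the first usable entry."""
    chain = [
        ("escalation_user", escalation_user_phone),
        ("on_call_user", on_call_user_phone),
        ("fallback_number", fallback_number),
        ("account_owner", owner_phone),
    ]
    for source, raw in chain:
        number = normalize_us_number(raw)
        if number:
            return source, number
    return None                           # caller logs once and skips the call


# No escalation user phone, a non-US on-call number, then a usable shared line.
print(resolve_on_call_number(None, "+44 20 7946 0958", "(415) 555-0134", "212-555-0188"))
```

In this example the first two steps are passed over and the shared fallback line is the one that gets dialed, without any failed call along the way.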
When to turn it on — and when not to
The single most important thing to get right with voice alerts is restraint. A phone that rings for everything trains its owner to ignore the ring — and then it's no better than the silenced push you were trying to escape. The value of a 3am phone call is entirely in its rarity.
Reserve voice calling for the incidents where someone genuinely needs to wake up and act now:
- Service down — a core 3CX service stopped on a production PBX.
- Instance unreachable — the whole phone system is offline.
- Trunks or SBC down — calls can't get in or out.
- Backup failed — the safety net you bill for didn't run.
- License expiring — a hard deadline that takes the system down if missed.
Leave the quieter signals — a CPU spike that self-resolves, a brief voice-quality dip, an informational notice — on push, chat, and the dashboard. Because voice calling is opt-in per alert type and gated to CRITICAL severity, the product makes this discipline easy to enforce: you turn it on deliberately, type by type, for the short list of incidents that are worth ending someone's sleep over.
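One way to picture that short list is as a routing matrix keyed by alert type and severity. The sketch below is a hypothetical representation: the identifiers, channel names, and the default for unknown alert types are assumptions, and Sikurd's real configuration is set in its dashboard rather than in code.

```python
from typing import Dict, FrozenSet

QUIET: FrozenSet[str] = frozenset({"push", "chat"})
TICKETED: FrozenSet[str] = frozenset({"push", "psa", "chat"})
WAKE_UP: FrozenSet[str] = frozenset({"push", "psa", "chat", "voice"})

# Hypothetical matrix: alert type -> severity -> channels.
ROUTING: Dict[str, Dict[str, FrozenSet[str]]] = {
    "service_down":         {"CRITICAL": WAKE_UP, "WARNING": QUIET},
    "instance_unreachable": {"CRITICAL": WAKE_UP},
    "trunk_down":           {"CRITICAL": WAKE_UP},
    "backup_failed":        {"CRITICAL": WAKE_UP},
    "license_expiring":     {"CRITICAL": WAKE_UP, "WARNING": QUIET},
    # Quieter signals never ring a phone, even at CRITICAL.
    "cpu_high":             {"CRITICAL": TICKETED, "WARNING": QUIET},
    "voice_quality":        {"CRITICAL": QUIET, "WARNING": QUIET},
}


def channels_for(alert_type: str, severity: str) -> FrozenSet[str]:
    """Look up the channels for an alert; voice is stripped below CRITICAL."""
    channels = ROUTING.get(alert_type, {}).get(severity, QUIET)
    if severity != "CRITICAL":
        channels = channels - {"voice"}
    return channels


print(sorted(channels_for("service_down", "CRITICAL")))
print(sorted(channels_for("cpu_high", "CRITICAL")))
```

Stripping voice below CRITICAL in the lookup itself means a misconfigured row cannot turn a warning into a 3am phone call.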
How it fits the rest of your alerting
Voice-call alerts aren't a standalone product or a separate vendor to manage — they're the top rung of the same escalation ladder that handles every 3CX incident in Sikurd. The closed-loop PSA ticket still opens and auto-closes on resolve. The push notification still carries the acknowledgement link. Slack and Teams still get the ChatOps message. The audit log still records that the alert fired and who acted on it.
The phone call simply adds the one thing those channels can't guarantee overnight: that a human actually hears it. For an MSP whose reputation rests on catching the 3am outage before the customer does, that last rung is the difference between a quietly resolved ticket and a morning spent apologizing.