Alert Routing
Control which channels each alert type fires through. Useful for keeping push notifications focused on the most urgent issues while still creating PSA tickets for everything that needs follow-up.
| Alert Type | Min Severity | Mobile Push | PSA Ticket | Webhooks | AI Voice Call |
|---|---|---|---|---|---|
| Instance Down: PBX unreachable from poll worker | Info & up (all) | | | | |
| Service Down: a 3CX system service stopped running | Info & up (all) | | | | |
| License Expiring: 10 / 3 / 1 day reminder before expiry | Info & up (all) | | | | |
| License Threshold: simultaneous-call usage near limit | Info & up (all) | | | | |
| Trunk Down: SIP trunk lost registration | Warning & up | | | | |
| Backup Failed / Stale: no successful backup in 7+ days | Info & up (all) | | | | |
| High Disk Usage: disk usage ≥ 85% | Critical only | | | | |
| High CPU: CPU sustained ≥ threshold | Info & up (all) | | | | |
| High Memory: memory usage near limit | Info & up (all) | | | | |
| Concurrent Calls: live calls near license simCall limit | Info & up (all) | | | | |
Rules apply per tenant. Any alert type without a configured rule defaults to push, PSA, and webhook enabled at INFO severity and above; voice calls are opt-in. Min severity gates the whole row: an alert below the floor fires on no channel. The PSA, webhook, and push integrations themselves must still be configured separately under Settings.
The noise problem: why most 3CX alerting trains you to ignore it
Anyone who has run monitoring at scale knows the failure mode. A SIP trunk drops at 11 p.m. The tool notices. It notices again sixty seconds later, and again, and again — one notification per polling cycle until someone gets up and fixes it. By morning your phone has ninety push notifications for a single problem, your PSA has a stack of duplicate tickets, and your Slack channel is unreadable.
This is alert fatigue, and it is worse than no alerting at all. When every event looks identical and most of them are repeats of something you already know, your team learns to swipe the notifications away without reading them — which is exactly the moment a genuinely new, genuinely urgent alert slips past unseen. The goal of good alerting isn't more alerts. It's one alert per problem, delivered to the right place, that goes away on its own when the problem does.
That's the design behind Sikurd's Smart Alerts. Below is what actually happens under the hood — every behavior here is how the monitoring worker runs in production.
What Sikurd alerts on
On every poll cycle, Sikurd evaluates each instance against a set of conditions and raises a typed alert when one trips. The core set:
- Instance down — the PBX is unreachable. Fires CRITICAL.
- Service down — a 3CX service (audio provider, call manager, and so on) has stopped. Fires CRITICAL, named by the specific service.
- Trunk down — one or more SIP trunks (or SBCs, called out by name) are unregistered. Fires WARNING.
- Backup failed — the last backup failed, or no successful backup has landed in seven days.
- High disk — disk usage at or above 85% (CRITICAL at 95%+).
- License threshold — simultaneous-call usage at 90%+ of the licensed cap (CRITICAL at 100%).
- License expiry: the 3CX license is approaching its expiry date. Reminders fire at 10, 3, and 1 day before expiry, and each newer milestone supersedes the older one.
- MOS degraded — inferred voice quality has stayed below threshold for several consecutive probes.
Each alert carries a type, a severity (INFO / WARNING / CRITICAL), and a human-readable message. The type is what the rest of the system keys off — for deduplication, routing, muting, and resolution.
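To make the alert shape concrete, here is a minimal Python sketch of two of the threshold checks listed above. The class names, function names, and type strings such as `HIGH_DISK` are hypothetical and are not Sikurd's actual identifiers. The article gives the CRITICAL cut-offs (95% disk, 100% license usage) but not the severity below them, so WARNING in that range is an assumption.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class Severity(IntEnum):
    INFO = 1
    WARNING = 2
    CRITICAL = 3


@dataclass(frozen=True)
class Alert:
    type: str          # hypothetical identifier, e.g. "HIGH_DISK"
    severity: Severity
    message: str       # human-readable text shown to the operator


def check_disk(used_pct: float) -> Optional[Alert]:
    """Alert at 85%+ disk usage; CRITICAL at 95%+ (WARNING below that is assumed)."""
    if used_pct < 85:
        return None
    severity = Severity.CRITICAL if used_pct >= 95 else Severity.WARNING
    return Alert("HIGH_DISK", severity, f"Disk usage at {used_pct:.0f}%")


def check_license(active_calls: int, licensed_calls: int) -> Optional[Alert]:
    """Alert at 90%+ of the licensed simultaneous-call cap; CRITICAL at 100%."""
    if licensed_calls <= 0:
        return None  # no usable cap to compare against
    # Integer comparison avoids floating-point edge cases at exactly 90%.
    if active_calls * 100 < licensed_calls * 90:
        return None
    severity = (Severity.CRITICAL if active_calls >= licensed_calls
                else Severity.WARNING)
    return Alert("LICENSE_THRESHOLD", severity,
                 f"{active_calls} of {licensed_calls} licensed calls in use")
```

Each check returns either nothing or a single typed alert, which is the unit the later stages (deduplication, routing, muting, resolution) would key off.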
Deduplication: one open alert per problem
Before Sikurd creates an alert, it looks for an existing unresolved alert of the same type on the same instance. If one already exists, it does nothing — no new row, no duplicate notification. That single rule is what stops alert storms. A trunk that has been down since last night is represented by exactly one open alert, no matter how many poll cycles have observed it.
Service-down alerts dedupe one level finer, by the specific service that stopped. If your audio provider and your call manager both go down, you get two distinct alerts — because they're two different problems needing two different fixes. But the same service appearing stopped on cycle after cycle still collapses to a single alert. You see what's broken, once, with nothing buried.
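The deduplication rule can be sketched as a lookup on a composite key. This is an illustration under stated assumptions, not Sikurd's implementation: `AlertStore`, `dedup_key`, and the type strings are hypothetical names, and an in-memory dict stands in for whatever storage the worker really uses.

```python
from typing import Dict, Optional, Tuple

# (instance_id, alert_type, service); service is only set for service-down alerts.
DedupKey = Tuple[str, str, Optional[str]]


def dedup_key(instance_id: str, alert_type: str,
              service: Optional[str] = None) -> DedupKey:
    """Service-down alerts dedupe per service; everything else per type."""
    if alert_type == "SERVICE_DOWN":
        return (instance_id, alert_type, service)
    return (instance_id, alert_type, None)


class AlertStore:
    """Holds at most one open alert per dedup key (in-memory stand-in)."""

    def __init__(self) -> None:
        self.open: Dict[DedupKey, str] = {}

    def raise_alert(self, instance_id: str, alert_type: str, message: str,
                    service: Optional[str] = None) -> bool:
        """Return True if a new alert was created, False if one is already open."""
        key = dedup_key(instance_id, alert_type, service)
        if key in self.open:
            return False  # same problem, already open: no row, no notification
        self.open[key] = message
        return True
```

Calling `raise_alert` for the same down trunk on every poll creates one alert on the first call and returns `False` afterwards, while two different stopped services on one instance produce two separate alerts.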
Auto-resolve: alerts that clean up after themselves
Here's the part most tools skip. On every poll, Sikurd doesn't just check for new problems — it re-checks whether the open ones still hold. The instance answered? Any open instance-down alert is resolved. The trunk re-registered? Trunk-down resolves. Disk fell back under threshold, a fresh successful backup landed, the license got renewed well clear of expiry — each open alert is closed the moment its condition no longer applies.
When an alert auto-resolves, Sikurd writes a timeline entry recording how long it was open ("auto-resolved by poll cycle, open 47 min"), and — this is the part MSPs care about — if the alert had opened a PSA ticket, that ticket is closed in the same step. The loop closes itself. Your dashboard reflects reality without anyone clicking "resolve" on a problem that fixed itself.
One honest caveat: auto-resolve is poll-driven. Resolution lands on the next monitoring cycle after the condition clears, not literally the instant it clears. In practice that's the difference between "self-cleaning" and "real-time," and it's the right trade — it means resolution is grounded in an actual fresh observation, not a guess.
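A rough sketch of the poll-driven resolve pass described above, assuming each poll produces the set of conditions that still hold. The names `OpenAlert`, `auto_resolve`, and the `close_ticket` callback are hypothetical; the callback stands in for a real PSA integration call.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, Dict, List, Optional, Set, Tuple

AlertKey = Tuple[str, str]  # (instance_id, alert_type)


@dataclass
class OpenAlert:
    opened_at: datetime
    ticket_id: Optional[str] = None  # set if the alert opened a PSA ticket


def auto_resolve(open_alerts: Dict[AlertKey, OpenAlert],
                 still_failing: Set[AlertKey],
                 now: datetime,
                 close_ticket: Callable[[str], None]) -> List[str]:
    """Close every open alert whose condition was not observed on this poll.

    Returns the timeline entries written for the resolved alerts.
    """
    timeline: List[str] = []
    for key in list(open_alerts):  # copy keys: we mutate the dict while looping
        if key in still_failing:
            continue  # condition still holds, leave the alert open
        alert = open_alerts.pop(key)
        minutes = int((now - alert.opened_at).total_seconds() // 60)
        instance_id, alert_type = key
        timeline.append(f"{alert_type} on {instance_id}: "
                        f"auto-resolved by poll cycle, open {minutes} min")
        if alert.ticket_id is not None:
            close_ticket(alert.ticket_id)  # close the linked ticket in the same step
    return timeline
```

Because the pass only runs when a poll completes, resolution in this sketch is always backed by a fresh observation, which matches the caveat above.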
Routing: the right alert to the right channel
Not every alert deserves the same treatment. A WARNING-level trunk flap might warrant a quiet PSA ticket; a CRITICAL service outage on a flagship customer might warrant a phone call. Sikurd gives each tenant a routing matrix with one row per alert type, and each row independently controls four channels:
- Push — mobile and desktop push notifications.
- PSA — closed-loop ticket creation in Autotask, ConnectWise, HaloPSA, or Syncro.
- Webhook — Slack and Teams notifications for your incident channel.
- Voice call — an AI voice call to your on-call contact for the alerts that truly can't wait.
Every row also has a minimum severity gate. Set a type's floor to CRITICAL and any WARNING-level firing of that type stays silent on every channel — useful for "only wake me for the bad stuff." With no custom row configured, the default is push, PSA, and webhook on at INFO and above. Voice calls default to off: they're intrusive and best reserved for deliberate opt-in, so no one is ever called by surprise.
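One way to picture a routing row is as a small record with a severity floor and four channel switches. The sketch below is an assumption-laden illustration: `RoutingRule`, `channels_for`, and the channel labels are hypothetical names. Only the defaults (push, PSA, and webhook on at INFO and above, voice off) come from the description above.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Dict, Set


class Severity(IntEnum):
    INFO = 1
    WARNING = 2
    CRITICAL = 3


@dataclass(frozen=True)
class RoutingRule:
    min_severity: Severity = Severity.INFO
    push: bool = True
    psa: bool = True
    webhook: bool = True
    voice: bool = False  # voice calls are opt-in


DEFAULT_RULE = RoutingRule()


def channels_for(rules: Dict[str, RoutingRule], alert_type: str,
                 severity: Severity) -> Set[str]:
    """Return the channels an alert should fire on for one tenant's rules."""
    rule = rules.get(alert_type, DEFAULT_RULE)
    if severity < rule.min_severity:
        return set()  # below the floor: the whole row stays silent
    return {name for name in ("push", "psa", "webhook", "voice")
            if getattr(rule, name)}
```

With a rule such as `RoutingRule(min_severity=Severity.CRITICAL, voice=True)` on a type, a WARNING firing of that type returns an empty set, and a CRITICAL firing returns all four channels.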
Maintenance windows: silence by design
Planned work shouldn't page anybody. Schedule a maintenance window on an instance with a start and end time, and while the clock sits inside that window Sikurd skips alert evaluation for that instance entirely. Reboot the PBX, swap a trunk, run an upgrade — no instance-down alert, no trunk-down ticket, no 2 a.m. phone call for something you're doing on purpose. When the window ends, alerting picks back up on the very next poll. The noise you suppressed was planned; the noise you keep is real.
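The window check itself can be very small. In this sketch `in_maintenance` is a hypothetical helper, windows are plain `(start, end)` pairs, and treating the end time as exclusive is an assumption the text above does not settle.

```python
from datetime import datetime
from typing import Iterable, Tuple

Window = Tuple[datetime, datetime]  # (start, end)


def in_maintenance(windows: Iterable[Window], now: datetime) -> bool:
    """True if `now` falls inside any window (start inclusive, end exclusive)."""
    return any(start <= now < end for start, end in windows)
```

A poll worker built this way would call `in_maintenance` first and skip alert evaluation for the instance whenever it returns `True`.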
Mute: "I know — stop pinging me"
Sometimes a problem is real, known, and not getting fixed tonight. For that there's mute. Acknowledging an alert with mute silences its notification channels for that instance-and-type pair for 24 hours. Crucially, the alert stays on the board and the audit trail is intact — you're quieting the pager, not deleting the problem. After 24 hours, if the condition still holds, the next poll re-fires it so it can't be silently forgotten. And if you move the linked PSA ticket to "in progress," Sikurd reads that as "someone's on it" and acknowledges the alert for you.
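The 24-hour mute can be modelled as an expiry timestamp per instance-and-type pair. `MuteBoard` and its methods are hypothetical names for illustration. Note that the sketch only gates notifications and never touches the alert itself, in line with the behavior described above.

```python
from datetime import datetime, timedelta
from typing import Dict, Tuple

MUTE_DURATION = timedelta(hours=24)


class MuteBoard:
    """Tracks when notifications may resume for each (instance, type) pair."""

    def __init__(self) -> None:
        self._muted_until: Dict[Tuple[str, str], datetime] = {}

    def mute(self, instance_id: str, alert_type: str, now: datetime) -> None:
        self._muted_until[(instance_id, alert_type)] = now + MUTE_DURATION

    def should_notify(self, instance_id: str, alert_type: str,
                      now: datetime) -> bool:
        """False while the pair is muted; True again once 24 hours have passed."""
        until = self._muted_until.get((instance_id, alert_type))
        return until is None or now >= until
```

Once the expiry passes, `should_notify` returns `True` again, so a condition that still holds would notify on the next poll.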
Escalation: a safety net when the first responder misses it
Acknowledgement is only as good as the person who's awake to do it. Escalation policies cover the gap. A policy names a set of alert types — tenant-wide or scoped to a single instance — and a wait window. If a matching alert stays both unresolved and unacknowledged past that window, Sikurd escalates it to a designated backup contact via push, and (where voice-call alerts are enabled for that type and your account) an AI voice call as well. Acknowledging the alert stops the escalation, and each alert escalates only once, so the safety net never becomes its own source of noise.
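The escalation conditions above reduce to a short predicate. This is a sketch with hypothetical names (`OpenAlert`, `EscalationPolicy`, `should_escalate`), and it assumes an alert becomes eligible as soon as its open time reaches the wait window.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import FrozenSet, Optional


@dataclass
class OpenAlert:
    type: str
    instance_id: str
    opened_at: datetime
    acknowledged: bool = False
    resolved: bool = False
    escalated: bool = False  # set once the alert has been escalated


@dataclass(frozen=True)
class EscalationPolicy:
    alert_types: FrozenSet[str]
    wait: timedelta
    instance_id: Optional[str] = None  # None means tenant-wide


def should_escalate(alert: OpenAlert, policy: EscalationPolicy,
                    now: datetime) -> bool:
    """True if the alert matches the policy and has waited long enough."""
    if alert.resolved or alert.acknowledged or alert.escalated:
        return False  # handled already, or escalated once before
    if alert.type not in policy.alert_types:
        return False
    if policy.instance_id is not None and policy.instance_id != alert.instance_id:
        return False  # policy is scoped to a different instance
    return now - alert.opened_at >= policy.wait
```

A caller would set `alert.escalated = True` after notifying the backup contact, which is what keeps each alert to a single escalation.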
The whole point: signal, not noise
Put the pieces together and you get alerting that respects your attention. Dedupe means one alert per problem. Auto-resolve means the alert, along with its PSA ticket, closes on the first poll after the problem is fixed. Routing means each type reaches the right channel at the right severity. Maintenance windows silence the noise you planned, mute handles the noise you've acknowledged, and escalation gets the truly urgent thing to a backup contact if the first responder missed it. The result is a dashboard you can actually trust, because what's on it is what's wrong right now and nothing more.
Adjacent reading
- How to monitor 3CX trunk health — what a trunk-down alert is built on.
- 3CX MOS scoring explained — the voice-quality signal behind MOS-degraded alerts.
- AI voice alerts vs PagerDuty — the on-call escalation story in depth.