Get notified when fleet conditions change. Configure alert channels, define routing rules, and manage the alert lifecycle — from detection through acknowledgment to resolution — so issues get surfaced to the right people at the right time.
> **Preview:** Fleet Management is in preview. APIs and features described on this page may change.
## Alert Channels
An alert channel is a configured destination for notifications. Create channels for each communication path your team uses, then reference them in recording rules and alert policies.
| Channel | Configuration | Use Case |
|---|---|---|
| Slack | Webhook URL + channel name | Team-wide notifications, on-call channels |
| Email | Recipient email addresses | Individual alerts, management summaries |
| Webhook | Endpoint URL + signing secret | Custom integrations, PagerDuty, Opsgenie |
| In-App | Automatic (no configuration) | Dashboard notifications for all org members |
In-app notifications are always enabled. Every alert that fires appears in the notification bell in Mission Control, regardless of other channel configurations.
### Create a Slack Channel

```python
from avala import Client

client = Client()

channel = client.fleet.alerts.channels.create(
    type="slack",
    name="Robotics Alerts",
    config={
        "webhook_url": "https://hooks.slack.com/services/T00/B00/xxxx",
        "channel": "#robotics-alerts",
        "username": "Avala Fleet Bot",
        "icon_emoji": ":robot_face:"
    }
)
print(f"Channel created: {channel.name} ({channel.uid})")
```
### Create an Email Channel

```python
channel = client.fleet.alerts.channels.create(
    type="email",
    name="Engineering Leads",
    config={
        "recipients": ["lead@example.com", "oncall@example.com"],
        "subject_prefix": "[Fleet Alert]"
    }
)
```
### Create a Webhook Channel

Webhook channels send an HTTP POST to your endpoint with the alert payload. Use a signing secret to verify that requests originate from Avala.

```python
channel = client.fleet.alerts.channels.create(
    type="webhook",
    name="PagerDuty Integration",
    config={
        "url": "https://events.pagerduty.com/v2/enqueue",
        "signing_secret": "whsec_your_signing_secret",
        "headers": {
            "Content-Type": "application/json"
        }
    }
)
```
### Webhook Payload

Webhook channels deliver a JSON payload with the following structure:

```json
{
  "uid": "alt_abc123",
  "rule_id": "rul_def456",
  "rule_name": "High Latency Alert",
  "severity": "warning",
  "status": "open",
  "device": {
    "uid": "dev_ghi789",
    "name": "robot-arm-01"
  },
  "recording": {
    "uid": "rec_jkl012",
    "url": "https://avala.ai/recordings/rec_jkl012"
  },
  "message": "High latency detected: 142ms (threshold: 100ms)",
  "metadata": {
    "latency_ms": 142,
    "threshold_ms": 100,
    "topic": "/diagnostics/latency"
  },
  "triggered_at": "2026-01-15T10:30:00Z"
}
```
To verify webhook authenticity, compute an HMAC-SHA256 of the request body using your signing secret and compare it to the `X-Avala-Signature` header. See the webhooks integration guide for implementation examples.
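The verification step can be sketched in a few lines. This is a minimal example that assumes the `X-Avala-Signature` header carries a hex-encoded HMAC-SHA256 digest of the raw body; confirm the exact encoding in the webhooks integration guide before relying on it.

```python
import hashlib
import hmac

def verify_signature(body: bytes, signature: str, secret: str) -> bool:
    """Compare a hex-encoded HMAC-SHA256 of the raw request body against
    the X-Avala-Signature header value, using a constant-time comparison."""
    expected = hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature)
```

Always compute the digest over the raw request bytes (not a re-serialized JSON object), and use `hmac.compare_digest` rather than `==` to avoid timing side channels.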
### Test a Channel

Send a test notification to verify your channel configuration before wiring it up to rules.

```python
result = client.fleet.alerts.channels.test(channel_id="ch_abc123")

if result.success:
    print("Test notification sent successfully")
else:
    print(f"Test failed: {result.error}")
```
### List and Manage Channels

```python
# List all channels
channels = client.fleet.alerts.channels.list()
for ch in channels:
    print(f"{ch.name} ({ch.type}) -- {ch.uid}")

# Update a channel
client.fleet.alerts.channels.update(
    channel_id="ch_abc123",
    name="Robotics Alerts (Updated)",
    config={"channel": "#robotics-alerts-v2"}
)

# Delete a channel
client.fleet.alerts.channels.delete(channel_id="ch_abc123")
```
Deleting a channel removes it from all rules and alert policies that reference it. Those rules continue to evaluate, but any `notify` action targeting the deleted channel is skipped. Update your rules to reference a different channel before deleting one that is in use.
## Alert Rules

Alerts are linked to recording rules through the `notify` action. When a recording rule’s condition matches, its `notify` action fires an alert to the specified channel.
### Severity Levels

Configure severity on the recording rule’s `notify` action to control how alerts are displayed and routed.

| Severity | Description | Default Behavior |
|---|---|---|
| `critical` | Requires immediate attention | All channels notified; repeated every 5 minutes until acknowledged |
| `warning` | Needs investigation | Configured channels notified once |
| `info` | Informational, no action required | In-app notification only (unless explicitly routed to other channels) |
```python
# Recording rule with severity-based alert routing
rule = client.fleet.rules.create(
    name="Motor Overheat Critical",
    condition={
        "type": "threshold",
        "topic": "/sensors/temperature",
        "field": "temp_c",
        "operator": "gt",
        "value": 90
    },
    actions=[
        {"type": "create_event", "event_type": "error", "label": "Motor overheat"},
        {
            "type": "notify",
            "channel_id": "ch_slack_oncall",
            "severity": "critical",
            "message": "Motor temperature exceeded 90C -- immediate shutdown may be required"
        }
    ]
)
```
### Escalation Policies

For critical alerts, define an escalation policy that notifies additional channels if the alert is not acknowledged within a specified time.

```python
policy = client.fleet.alerts.policies.create(
    name="Critical Escalation",
    steps=[
        {
            "channel_id": "ch_slack_oncall",
            "wait": "5m"   # escalate to the next step if unacknowledged after 5 minutes
        },
        {
            "channel_id": "ch_email_leads",
            "wait": "15m"
        },
        {
            "channel_id": "ch_webhook_pagerduty",
            "wait": None   # Final step, no further escalation
        }
    ]
)
```
```python
# Attach policy to a rule's notify action
rule = client.fleet.rules.create(
    name="Critical System Failure",
    condition={
        "type": "absence",
        "topic": "/system/heartbeat",
        "timeout": "60s"
    },
    actions=[
        {
            "type": "notify",
            "policy_id": policy.uid,
            "severity": "critical"
        }
    ]
)
```
## Alert Lifecycle

Every alert moves through a defined set of states. You can transition alerts manually via the SDK, CLI, or dashboard.

```text
open --> acknowledged --> resolved
  |                          ^
  +--------------------------+
         (auto-resolve)
```

| State | Description |
|---|---|
| `open` | Condition matched, notifications sent. Alert is active and may escalate. |
| `acknowledged` | A team member has seen the alert and is investigating. Escalation pauses. |
| `resolved` | The issue has been addressed. Alert is closed. |
### Acknowledge an Alert

Acknowledging an alert stops escalation and signals to the team that someone is investigating.

```python
client.fleet.alerts.acknowledge(
    alert_id="alt_abc123",
    note="Investigating -- checking motor temperature logs"
)
```
### Resolve an Alert

Mark an alert as resolved when the underlying issue has been fixed.

```python
client.fleet.alerts.resolve(
    alert_id="alt_abc123",
    note="Fixed: firmware update applied to motor controller, temperature nominal"
)
```
### Auto-Resolve

Alerts can auto-resolve when their triggering condition clears. Enable auto-resolve on the recording rule’s `notify` action:

```python
rule = client.fleet.rules.create(
    name="High CPU Usage",
    condition={
        "type": "threshold",
        "topic": "/system/diagnostics",
        "field": "cpu_percent",
        "operator": "gt",
        "value": 90,
        "window": "5m"
    },
    actions=[
        {
            "type": "notify",
            "channel_id": "ch_abc123",
            "severity": "warning",
            "auto_resolve": True,
            "resolve_after": "10m"
        }
    ]
)
```
When `auto_resolve` is enabled, the alert transitions to `resolved` after the `resolve_after` duration if the condition no longer matches. If the condition re-triggers during the resolve window, the alert stays in its current state.
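As a mental model (a sketch, not the server implementation), auto-resolve behaves like a countdown measured from the last time the condition matched: the alert resolves only once a full `resolve_after` window has passed with no match.

```python
from datetime import datetime, timedelta

def auto_resolve_state(last_match_at, now, resolve_after=timedelta(minutes=10)):
    """Sketch of auto-resolve: an alert resolves once the condition
    has not matched for the full resolve_after window; any re-trigger
    inside the window keeps it open."""
    if now - last_match_at >= resolve_after:
        return "resolved"
    return "open"

t0 = datetime(2026, 1, 15, 10, 30)
# Last match at 10:30; by 10:45 the 10-minute window has elapsed -> resolved.
auto_resolve_state(t0, t0 + timedelta(minutes=15))
# A re-trigger at 10:38 restarts the countdown, so at 10:45 it is still open.
auto_resolve_state(t0 + timedelta(minutes=8), t0 + timedelta(minutes=15))
```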
## Alert History

View past alerts with their full lifecycle — when they fired, who acknowledged them, and how they were resolved.

```python
# List recent alerts
alerts = client.fleet.alerts.list(
    status="resolved",
    since="2026-01-01T00:00:00Z",
    until="2026-02-01T00:00:00Z"
)
for alert in alerts:
    print(f"[{alert.severity}] {alert.rule_name}")
    print(f"  Triggered: {alert.triggered_at}")
    print(f"  Acknowledged: {alert.acknowledged_at} by {alert.acknowledged_by}")
    print(f"  Resolved: {alert.resolved_at}")
    print(f"  Duration: {alert.duration}")
    print()
```
```python
# Filter by severity
critical_alerts = client.fleet.alerts.list(severity="critical")

# Filter by device
device_alerts = client.fleet.alerts.list(device_id="dev_abc123")

# Filter by channel
slack_alerts = client.fleet.alerts.list(channel_id="ch_abc123")
```
## Alert Metrics

Aggregate alert statistics for reporting and SLA tracking.

```python
metrics = client.fleet.alerts.metrics(
    since="2026-01-01T00:00:00Z",
    until="2026-02-01T00:00:00Z"
)

print(f"Total alerts: {metrics.total}")
print(f"  Critical: {metrics.by_severity['critical']}")
print(f"  Warning: {metrics.by_severity['warning']}")
print(f"  Info: {metrics.by_severity['info']}")
print(f"Mean time to acknowledge: {metrics.mean_time_to_acknowledge}")
print(f"Mean time to resolve: {metrics.mean_time_to_resolve}")
```
## Muting and Snoozing

Temporarily silence alerts during planned maintenance, known outages, or testing periods. Muted alerts still evaluate and log, but notifications are suppressed.
### Mute a Rule

Suppress all notifications from a specific rule for a set duration.

```python
client.fleet.alerts.mute(
    rule_id="rul_abc123",
    duration="2h",
    reason="Scheduled maintenance on warehouse B robots"
)
```
### Mute a Device

Suppress all alerts for a specific device. Useful when taking a device offline for servicing.

```python
client.fleet.alerts.mute_device(
    device_id="dev_abc123",
    duration="4h",
    reason="Firmware upgrade in progress"
)
```
### Snooze an Alert

Snooze a specific open alert to temporarily suppress its notifications. The alert returns to the open state when the snooze expires.

```python
client.fleet.alerts.snooze(
    alert_id="alt_abc123",
    duration="30m",
    reason="Known issue, fix deploying in next release"
)
```
### View Active Mutes

```python
mutes = client.fleet.alerts.mutes.list()
for mute in mutes:
    print(f"{mute.target_type}: {mute.target_id}")
    print(f"  Reason: {mute.reason}")
    print(f"  Expires: {mute.expires_at}")

# Unmute early
client.fleet.alerts.unmute(mute_id="mute_abc123")
```
Schedule mutes in advance for recurring maintenance windows. Use the `starts_at` parameter to create mutes that activate at a future time:

```python
client.fleet.alerts.mute(
    rule_id="rul_abc123",
    starts_at="2026-02-01T02:00:00Z",
    duration="2h",
    reason="Weekly maintenance window"
)
```
## Next Steps