Diagnostics Panel - Avala Documentation

The Diagnostics panel displays real-time system health metrics inside the MCAP viewer. Monitor CPU, memory, disk, network, GPU, and battery data alongside your sensor recordings to correlate device performance issues with recording artifacts.

The Diagnostics Panel is in preview. Features described on this page may change.

Metrics

The panel subscribes to standard diagnostics topics and renders each metric as a live gauge with sparkline history. All metrics synchronize with the shared viewer timeline.

Metric	Source Topic	Description
CPU Usage	`/diagnostics/cpu`	Per-core and aggregate CPU utilization
Memory	`/diagnostics/memory`	RSS, heap, available memory
Disk I/O	`/diagnostics/disk`	Read/write throughput, queue depth
Network	`/diagnostics/network`	Bandwidth, packet loss, latency
GPU	`/diagnostics/gpu`	GPU utilization, VRAM, temperature
Battery	`/diagnostics/battery`	Charge level, discharge rate, health

Each gauge shows the current value, and its sparkline shows recent history. Hover over a sparkline to inspect the value at a specific timestamp.

Metric Details

CPU Usage

Displays per-core utilization bars and an aggregate percentage. The panel parses standard `diagnostic_msgs/DiagnosticArray` messages whose hardware ID matches `cpu`. Individual core values are extracted from key-value pairs in the `values` array.

Field	Unit	Description
`aggregate_percent`	%	Weighted average across all cores
`core_N_percent`	%	Individual core utilization, where `N` is the core index
`load_average_1m`	ratio	1-minute load average
`process_count`	count	Number of active processes
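
As a rough illustration of the extraction described above, the sketch below pulls the aggregate and per-core values out of a `DiagnosticArray`-shaped message represented as plain Python dicts. The helper names (`cpu_statuses`, `extract_cpu_fields`) are hypothetical, and the exact-match test on `hardware_id` is an assumption; the panel's actual matching rule is not specified here.

```python
import re

# Matches keys such as "core_0_percent" and captures the core index.
CORE_KEY = re.compile(r"^core_(\d+)_percent$")


def cpu_statuses(array_msg: dict) -> list:
    """Return the status entries whose hardware ID is "cpu" (assumed exact match)."""
    return [s for s in array_msg["status"] if s["hardware_id"] == "cpu"]


def extract_cpu_fields(status: dict) -> dict:
    """Read the aggregate and per-core percentages from the key-value pairs."""
    # DiagnosticStatus values are string key/value pairs.
    values = {kv["key"]: kv["value"] for kv in status["values"]}
    cores = {}
    for key, raw in values.items():
        match = CORE_KEY.match(key)
        if match:
            cores[int(match.group(1))] = float(raw)
    aggregate = values.get("aggregate_percent")
    return {
        "aggregate_percent": float(aggregate) if aggregate is not None else None,
        # Per-core values ordered by core index.
        "cores": [cores[i] for i in sorted(cores)],
    }
```

Sorting by the captured index keeps the core order stable even when the key-value pairs arrive out of order.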

Memory

Shows current memory usage as a stacked bar (RSS, heap, buffers) and available memory as a separate gauge.

Field	Unit	Description
`rss_bytes`	bytes	Resident set size
`heap_bytes`	bytes	Heap allocation
`available_bytes`	bytes	Memory available for allocation
`swap_used_bytes`	bytes	Swap space in use

Disk I/O

Renders read and write throughput as dual sparklines and queue depth as a numeric indicator.

Field	Unit	Description
`read_bytes_per_sec`	bytes/s	Read throughput
`write_bytes_per_sec`	bytes/s	Write throughput
`queue_depth`	count	Number of pending I/O operations
`utilization_percent`	%	Disk busy percentage

Network

Displays bandwidth utilization, packet loss ratio, and round-trip latency.

Field	Unit	Description
`tx_bytes_per_sec`	bytes/s	Transmit bandwidth
`rx_bytes_per_sec`	bytes/s	Receive bandwidth
`packet_loss_percent`	%	Percentage of dropped packets
`rtt_ms`	ms	Round-trip latency

GPU

Shows GPU compute utilization, VRAM usage, and die temperature. Supports NVIDIA and AMD GPUs.

Field	Unit	Description
`utilization_percent`	%	GPU compute utilization
`vram_used_bytes`	bytes	Video RAM in use
`vram_total_bytes`	bytes	Total video RAM
`temperature_celsius`	°C	GPU die temperature
`power_watts`	W	Current power draw

Battery

Displays charge level, discharge rate, and estimated remaining runtime.

Field	Unit	Description
`charge_percent`	%	Current charge level
`discharge_rate_watts`	W	Current discharge rate
`voltage`	V	Battery voltage
`health_percent`	%	Battery health relative to design capacity
`estimated_runtime_seconds`	s	Estimated remaining runtime at current draw

Auto-Detection

A topic is assigned to the Diagnostics panel when:

- Its topic name starts with `/diagnostics/`, or
- Its schema name contains `diagnostic_msgs` (matching schemas such as `diagnostic_msgs/DiagnosticArray` and `diagnostic_msgs/DiagnosticStatus`)
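
The two conditions above can be expressed as a single predicate. This is a minimal sketch of the rule as stated, not the viewer's actual code; the function name `is_diagnostics_topic` is hypothetical.

```python
def is_diagnostics_topic(topic_name: str, schema_name: str) -> bool:
    """Return True when either auto-detection condition holds."""
    return topic_name.startswith("/diagnostics/") or "diagnostic_msgs" in schema_name
```

For example, a topic named `/robot/health` carrying `diagnostic_msgs/DiagnosticArray` messages would match on the schema condition even though its name does not start with `/diagnostics/`.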

Topic Health

The panel automatically monitors the health of every topic in the recording by tracking message rates, inter-message latency, and dropped messages.

Indicator	Meaning
Green	Healthy: messages arriving at the expected rate
Yellow	Degraded: message rate below 80% of the expected rate
Red	Unhealthy: no messages received for more than 5 seconds
Gray	Inactive: topic has no recent data

The expected rate for each topic is inferred from the first 100 messages on that topic. You can also set explicit expected rates in the panel configuration.
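
To make the health rules concrete, here is a hedged sketch of rate inference and status classification. It assumes the expected rate is the mean rate over the first 100 timestamps (in seconds), and that the red check takes precedence over the yellow check; neither detail is confirmed by this page. The gray state is omitted because its criterion is not defined here. Function names are hypothetical.

```python
from typing import Optional, Sequence


def infer_expected_rate(timestamps: Sequence[float], sample_size: int = 100) -> Optional[float]:
    """Estimate a topic's rate in Hz from its first `sample_size` message timestamps."""
    head = timestamps[:sample_size]
    if len(head) < 2:
        return None  # Not enough messages to measure an interval.
    span = head[-1] - head[0]
    if span <= 0:
        return None
    # N messages span N - 1 intervals.
    return (len(head) - 1) / span


def classify_topic(current_rate: float, expected_rate: float, seconds_since_last: float) -> str:
    """Map a topic's current state to an indicator color."""
    if seconds_since_last > 5.0:
        return "red"      # No messages for more than 5 seconds.
    if current_rate < 0.8 * expected_rate:
        return "yellow"   # Below 80% of the expected rate.
    return "green"
```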

Topic Health Table

The table view lists every topic with its current health status, message rate, average latency, and drop count. Click a column header to sort. Click a row to jump to the most recent message from that topic in the timeline.

Column	Description
Topic	Full topic name
Status	Health indicator (green/yellow/red/gray)
Rate	Current message rate in Hz
Expected Rate	Inferred or configured expected rate in Hz
Avg Latency	Average inter-message interval in ms
Drops	Total number of detected dropped messages
Last Message	Timestamp of the most recent message

Customizing Expected Rates

By default, the panel infers expected rates from the first 100 messages on each topic. For topics with variable rates (such as event-driven triggers), this inference may not be accurate. Override the expected rate for specific topics in the panel settings:

{
  "expected_rates": {
    "/lidar/points": 10.0,
    "/camera/front/image": 30.0,
    "/imu/data": 200.0,
    "/robot/state": null
  }
}

Setting a topic’s expected rate to `null` disables health monitoring for that topic.
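
The override semantics can be sketched as a small lookup: an explicit number replaces the inferred rate, `null` turns monitoring off, and topics that are not listed fall back to inference. The function name and the `(monitored, rate)` return shape are illustrative assumptions.

```python
import json
from typing import Optional, Tuple


def resolve_expected_rate(
    topic: str, settings: dict, inferred_rate: Optional[float]
) -> Tuple[bool, Optional[float]]:
    """Return (monitored, expected_rate_hz) for a topic."""
    overrides = settings.get("expected_rates", {})
    if topic in overrides:
        rate = overrides[topic]
        if rate is None:
            # JSON null: health monitoring is disabled for this topic.
            return (False, None)
        return (True, float(rate))
    # No override: use the rate inferred from the recording.
    return (True, inferred_rate)


settings = json.loads('{"expected_rates": {"/lidar/points": 10.0, "/robot/state": null}}')
```

With these settings, `/lidar/points` is monitored at 10 Hz, `/robot/state` is not monitored, and any other topic keeps its inferred rate.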

Configuration

Customize the Diagnostics panel through the settings pane.

Display Settings

Setting	Type	Default	Description
`visible_metrics`	multi-select	All	Choose which metric categories to display
`refresh_rate`	number	`1000`	How often to update gauges, in milliseconds
`sparkline_window`	number	`30`	Duration of the sparkline history, in seconds
`show_topic_health`	boolean	`true`	Show the topic health table below the gauges
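
One way to picture `sparkline_window` is as a rolling buffer that drops samples older than the window. The class below is a sketch under that assumption, with timestamps in seconds; `SparklineBuffer` is a hypothetical name and does not describe the panel's internals.

```python
from collections import deque


class SparklineBuffer:
    """Keep only the samples that fall inside the most recent time window."""

    def __init__(self, window_seconds: float = 30.0):
        self.window_seconds = window_seconds
        self.samples = deque()  # (timestamp_seconds, value) pairs, oldest first

    def push(self, timestamp: float, value: float) -> None:
        self.samples.append((timestamp, value))
        cutoff = timestamp - self.window_seconds
        # Evict samples that are older than the window.
        while self.samples and self.samples[0][0] < cutoff:
            self.samples.popleft()
```

If one sample were pushed per refresh at the default `refresh_rate` of 1000 ms, the default 30-second window would hold roughly 30 points.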

Thresholds

Set warning and critical thresholds for each metric. When a value crosses a threshold, the gauge changes color and a marker is placed on the timeline.

Threshold Level	Gauge Color	Timeline Marker
Normal	Green	None
Warning	Yellow	Yellow diamond
Critical	Red	Red diamond

Example threshold configuration:

{
  "cpu": {
    "warning": 80,
    "critical": 95
  },
  "memory": {
    "warning": 75,
    "critical": 90
  },
  "gpu_temperature": {
    "warning": 80,
    "critical": 95
  }
}
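
The threshold behavior can be sketched as a level classifier plus a pass over the samples that records a marker whenever the level rises. Treating a value equal to the threshold as crossing it, and emitting markers only on upward transitions, are assumptions made for this example; the function names are hypothetical.

```python
from typing import List, Sequence, Tuple

LEVEL_ORDER = {"normal": 0, "warning": 1, "critical": 2}


def threshold_level(value: float, warning: float, critical: float) -> str:
    """Classify a metric value against its warning and critical thresholds."""
    if value >= critical:
        return "critical"
    if value >= warning:
        return "warning"
    return "normal"


def timeline_markers(
    samples: Sequence[Tuple[float, float]], warning: float, critical: float
) -> List[Tuple[float, str]]:
    """Return (timestamp, level) for each upward threshold crossing."""
    markers = []
    previous = "normal"
    for timestamp, value in samples:
        level = threshold_level(value, warning, critical)
        if LEVEL_ORDER[level] > LEVEL_ORDER[previous]:
            markers.append((timestamp, level))
        previous = level
    return markers
```

Using the `cpu` thresholds from the example configuration (80 and 95), a sustained run above 95 produces a single critical marker rather than one per sample.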

Layout

The Diagnostics panel can be placed in any slot of the multi-window layout. A common arrangement is to dock it along the bottom of the viewer alongside the Plot and Log panels, providing a full system health overview beneath the primary 3D or image views.

Compact Mode

When the panel is docked in a narrow strip, it automatically switches to compact mode. Compact mode shows only the gauge values without sparklines, fitting more metrics into a smaller area. Expand the panel to restore the full view with sparklines and the topic health table.

Timeline Synchronization

The Diagnostics panel is synchronized with all other panels through the shared timeline:

- Playback updates gauges and sparklines to show metric values at the current timestamp.
- Seeking jumps all metric displays to the selected point in time.
- Frame stepping advances diagnostics data one sample at a time.
- Threshold markers on the timeline are clickable: click a marker to seek to that event.
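
Showing a metric value "at the current timestamp" amounts to finding the latest sample at or before the playhead. A binary search over sorted sample timestamps is one common way to do this; the sketch below assumes that approach and uses a hypothetical `value_at` helper.

```python
from bisect import bisect_right
from typing import Optional, Sequence


def value_at(timestamps: Sequence[float], values: Sequence[float], t: float) -> Optional[float]:
    """Return the most recent sample value at or before time t, or None if there is none."""
    # bisect_right gives the index of the first timestamp strictly greater than t.
    index = bisect_right(timestamps, t)
    if index == 0:
        return None  # The playhead is before the first sample.
    return values[index - 1]
```

The same lookup serves playback and seeking, since both only change the value of `t`.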

Alert Integration

Connect diagnostics thresholds to fleet-level alert channels. When a metric crosses its configured threshold during a recording, Avala creates a timeline event and optionally triggers a notification.

Configuring Alerts

1. Open the Diagnostics panel settings and navigate to the Alerts tab.
2. Select the metric and threshold level that should trigger an alert.
3. Choose a notification channel (Slack, email, or webhook).
4. Set the cooldown period to avoid duplicate alerts for sustained threshold violations.

Parameter	Description	Default
`channel`	Notification destination	None
`cooldown_seconds`	Minimum seconds between repeated alerts	`300`
`include_snapshot`	Attach a screenshot of the viewer at the time of the alert	`false`

Alert integration requires an active Fleet subscription. Timeline events are created for all users regardless of subscription.
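
The effect of `cooldown_seconds` can be illustrated with a small gate that suppresses repeated alerts for the same metric and level until the cooldown has elapsed. Keying the cooldown per metric and level is an assumption, and `AlertCooldown` is a hypothetical name.

```python
from typing import Dict


class AlertCooldown:
    """Allow at most one alert per key within each cooldown period."""

    def __init__(self, cooldown_seconds: float = 300.0):
        self.cooldown_seconds = cooldown_seconds
        self._last_fired: Dict[str, float] = {}

    def should_fire(self, key: str, now: float) -> bool:
        """Return True and record the time if an alert for `key` may fire at `now`."""
        last = self._last_fired.get(key)
        if last is not None and now - last < self.cooldown_seconds:
            return False  # Still inside the cooldown window.
        self._last_fired[key] = now
        return True
```

With the default of 300 seconds, a CPU value that stays above its critical threshold for ten minutes would produce at most three notifications (at 0, 300, and 600 seconds) under this scheme.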

Alert Payload

When an alert fires, the notification includes:

- Device name and recording ID
- Metric name, current value, and threshold that was crossed
- Timestamp of the event
- Direct link to the recording at the exact timestamp

Example: Slack Alert

{
  "device": "robot-alpha-01",
  "recording": "rec_11223344-5566-7788-99aa-bbccddeeff00",
  "metric": "cpu.aggregate_percent",
  "value": 96.3,
  "threshold": "critical",
  "threshold_value": 95,
  "timestamp": "2026-02-26T10:03:22Z",
  "viewer_url": "https://avala.ai/recordings/rec_11223344?t=1740567802"
}
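
A webhook consumer might turn a payload of this shape into a short human-readable message. The sketch below relies only on the field names shown in the example above; `format_alert` is a hypothetical helper, and whether every alert carries all of these fields is an assumption.

```python
def format_alert(payload: dict) -> str:
    """Build a two-line summary from an alert payload."""
    headline = (
        f"[{payload['threshold'].upper()}] {payload['device']}: "
        f"{payload['metric']} = {payload['value']} "
        f"(threshold {payload['threshold_value']}) at {payload['timestamp']}"
    )
    # Second line links back to the recording at the event time.
    return f"{headline}\n{payload['viewer_url']}"
```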

Exporting Diagnostics Data

Export diagnostics metrics from a recording as CSV or JSON for offline analysis. Open the panel settings menu and select Export Metrics. Choose the time range, metrics to include, and output format.

Format	Description
CSV	One row per sample, columns for each metric field
JSON	Array of timestamped metric objects

Exported files contain every raw metric value at the original sample rate. The panel’s display refresh rate does not downsample exported data.
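
For offline analysis, a CSV export can be read with Python's standard `csv` module. The sketch below finds the peak value of one column. The column names used here (`timestamp` and `cpu.aggregate_percent`) are assumptions for illustration; check the header row of an actual export before relying on them.

```python
import csv
import io
from typing import Optional, Tuple


def peak_value(csv_text: str, column: str) -> Optional[Tuple[str, float]]:
    """Return (timestamp, value) for the largest value in `column`, or None if it is empty."""
    best = None
    for row in csv.DictReader(io.StringIO(csv_text)):
        raw = row.get(column)
        if raw is None or raw == "":
            continue  # This sample has no value for the requested metric.
        value = float(raw)
        if best is None or value > best[1]:
            best = (row["timestamp"], value)
    return best
```

Skipping empty cells matters when metrics are sampled at different rates, since a row may then carry values for only some of the columns.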

Keyboard Shortcuts

Shortcut	Action
`M`	Cycle through metric categories
`T`	Toggle the topic health table
`C`	Toggle compact mode
`R`	Reset all gauges to auto-scale
`E`	Open the export dialog

Use Cases

CPU monitoring during manipulation

Monitor robot CPU usage during complex manipulation tasks to identify compute bottlenecks that cause control loop delays.

Network latency debugging

Track network latency and packet loss to identify data transmission issues between the robot and remote operator station.

GPU thermal analysis

Correlate GPU temperature spikes with rendering quality drops or inference slowdowns in on-device perception pipelines.

Battery discharge profiling

Profile battery discharge rates across different operational modes to optimize mission duration and plan recharging schedules.

Place the Diagnostics panel next to the Plot panel to overlay system health metrics with sensor signals. This makes it straightforward to correlate a CPU spike with a dropped LiDAR frame or a network latency spike with a delayed camera image.

Next Steps

State Machine Panel

Visualize robot behavior state transitions alongside diagnostics data.

Fleet Dashboard

Monitor diagnostics across your entire fleet from a single view.

Recording Rules

Automatically flag recordings that contain diagnostics threshold violations.
