The Diagnostics panel displays real-time system health metrics inside the MCAP viewer. Monitor CPU, memory, disk, network, GPU, and battery data alongside your sensor recordings to correlate device performance issues with recording artifacts.
The Diagnostics panel is in preview. Features described on this page may change.

Metrics

The panel subscribes to standard diagnostics topics and renders each metric as a live gauge with sparkline history. All metrics synchronize with the shared viewer timeline.
| Metric | Source Topic | Description |
| --- | --- | --- |
| CPU Usage | /diagnostics/cpu | Per-core and aggregate CPU utilization |
| Memory | /diagnostics/memory | RSS, heap, available memory |
| Disk I/O | /diagnostics/disk | Read/write throughput, queue depth |
| Network | /diagnostics/network | Bandwidth, packet loss, latency |
| GPU | /diagnostics/gpu | GPU utilization, VRAM, temperature |
| Battery | /diagnostics/battery | Charge level, discharge rate, health |

Each gauge shows the current value, and its sparkline shows recent history. Hover over the sparkline to inspect values at a specific timestamp.

Metric Details

CPU Usage

Displays per-core utilization bars and an aggregate percentage. The panel parses standard diagnostic_msgs/DiagnosticArray messages where the hardware ID matches cpu. Individual core values are extracted from key-value pairs in the values array.
| Field | Unit | Description |
| --- | --- | --- |
| aggregate_percent | % | Weighted average across all cores |
| core_N_percent | % | Individual core utilization |
| load_average_1m | ratio | 1-minute load average |
| process_count | count | Number of active processes |
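
The key-value extraction described above can be sketched as follows. This is a minimal illustration, not the panel's actual parser: it assumes the message has already been decoded into a plain dict that mirrors diagnostic_msgs/DiagnosticArray (a `status` list whose entries carry `hardware_id` and a `values` list of `key`/`value` string pairs), and that "matches cpu" means an exact match. The function name `extract_cpu_metrics` is hypothetical.

```python
import re

# Matches keys such as core_0_percent, core_1_percent, ...
CORE_KEY = re.compile(r"^core_(\d+)_percent$")


def extract_cpu_metrics(diagnostic_array: dict) -> dict:
    """Collect numeric CPU fields from statuses whose hardware_id is "cpu".

    Assumption: exact hardware_id match; the real matching rule may differ.
    """
    cores = {}   # core index -> utilization percent
    fields = {}  # other numeric fields, e.g. aggregate_percent
    for status in diagnostic_array.get("status", []):
        if status.get("hardware_id") != "cpu":
            continue
        for pair in status.get("values", []):
            key = pair["key"]
            try:
                number = float(pair["value"])  # KeyValue.value is a string
            except ValueError:
                continue  # skip non-numeric entries
            match = CORE_KEY.match(key)
            if match:
                cores[int(match.group(1))] = number
            else:
                fields[key] = number
    # Return cores ordered by index, plus the remaining fields.
    return {"cores": [cores[i] for i in sorted(cores)], **fields}
```

Sorting by the numeric core index keeps the per-core bars in a stable order even when the key-value pairs arrive out of order.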

Memory

Shows current memory usage as a stacked bar (RSS, heap, buffers) and available memory as a separate gauge.
| Field | Unit | Description |
| --- | --- | --- |
| rss_bytes | bytes | Resident set size |
| heap_bytes | bytes | Heap allocation |
| available_bytes | bytes | Memory available for allocation |
| swap_used_bytes | bytes | Swap space in use |

Disk I/O

Renders read and write throughput as dual sparklines and queue depth as a numeric indicator.
| Field | Unit | Description |
| --- | --- | --- |
| read_bytes_per_sec | bytes/s | Read throughput |
| write_bytes_per_sec | bytes/s | Write throughput |
| queue_depth | count | Number of pending I/O operations |
| utilization_percent | % | Disk busy percentage |

Network

Displays bandwidth utilization, packet loss ratio, and round-trip latency.
| Field | Unit | Description |
| --- | --- | --- |
| tx_bytes_per_sec | bytes/s | Transmit bandwidth |
| rx_bytes_per_sec | bytes/s | Receive bandwidth |
| packet_loss_percent | % | Percentage of dropped packets |
| rtt_ms | ms | Round-trip latency |

GPU

Shows GPU compute utilization, VRAM usage, and die temperature. Supports NVIDIA and AMD GPUs.
| Field | Unit | Description |
| --- | --- | --- |
| utilization_percent | % | GPU compute utilization |
| vram_used_bytes | bytes | Video RAM in use |
| vram_total_bytes | bytes | Total video RAM |
| temperature_celsius | °C | GPU die temperature |
| power_watts | W | Current power draw |

Battery

Displays charge level, discharge rate, and estimated remaining runtime.
| Field | Unit | Description |
| --- | --- | --- |
| charge_percent | % | Current charge level |
| discharge_rate_watts | W | Current discharge rate |
| voltage | V | Battery voltage |
| health_percent | % | Battery health relative to design capacity |
| estimated_runtime_seconds | s | Estimated remaining runtime at current draw |

Auto-Detection

A topic is assigned to the Diagnostics panel when:
  • Its topic name starts with /diagnostics/, or
  • Its schema name contains diagnostic_msgs (matching schemas like diagnostic_msgs/DiagnosticArray, diagnostic_msgs/DiagnosticStatus)
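
The two rules above amount to a simple predicate. The sketch below is illustrative only; the function name `is_diagnostics_topic` is hypothetical and is not part of any published API.

```python
def is_diagnostics_topic(topic_name: str, schema_name: str) -> bool:
    """Return True if either auto-detection rule applies."""
    # Rule 1: the topic name starts with the /diagnostics/ prefix.
    # Rule 2: the schema name contains diagnostic_msgs.
    return topic_name.startswith("/diagnostics/") or "diagnostic_msgs" in schema_name
```

Because the rules are joined with "or", a topic outside the /diagnostics/ namespace is still picked up when it carries a diagnostic_msgs schema.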

Topic Health

The panel automatically monitors the health of every topic in the recording by tracking message rates, inter-message latency, and dropped messages.
| Indicator | Meaning |
| --- | --- |
| Green | Healthy: messages arriving at expected rate |
| Yellow | Degraded: message rate below 80% of expected |
| Red | Unhealthy: no messages received for >5s |
| Gray | Inactive: topic has no recent data |

The expected rate for each topic is inferred from the first 100 messages on that topic. You can also set explicit expected rates in the panel configuration.
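
The rate inference and the indicator rules can be sketched as below. Several details are assumptions made for illustration: the rate is estimated as the message count over the elapsed time of the first 100 timestamps, and, because this page does not say how Gray differs from Red, the sketch treats a topic that has not produced any message yet as Gray. The function names are hypothetical.

```python
from typing import Optional, Sequence


def infer_expected_rate(timestamps: Sequence[float],
                        sample_size: int = 100) -> Optional[float]:
    """Estimate a rate in Hz from the first `sample_size` timestamps (seconds)."""
    head = timestamps[:sample_size]
    if len(head) < 2 or head[-1] <= head[0]:
        return None  # not enough data to infer a rate
    # N messages span N - 1 intervals.
    return (len(head) - 1) / (head[-1] - head[0])


def classify_topic(current_rate_hz: float,
                   expected_rate_hz: Optional[float],
                   seconds_since_last: Optional[float]) -> str:
    """Map topic statistics to an indicator color."""
    if seconds_since_last is None:
        return "gray"    # assumption: no message seen yet
    if seconds_since_last > 5.0:
        return "red"     # no messages received for more than 5 s
    if expected_rate_hz is not None and current_rate_hz < 0.8 * expected_rate_hz:
        return "yellow"  # below 80% of the expected rate
    return "green"
```

For example, a topic expected at 10 Hz that currently delivers 7 Hz would be classified as yellow under these assumptions.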

Topic Health Table

The table view lists every topic with its current health status, message rate, average latency, and drop count. Click a column header to sort. Click a row to jump to the most recent message from that topic in the timeline.
| Column | Description |
| --- | --- |
| Topic | Full topic name |
| Status | Health indicator (green/yellow/red/gray) |
| Rate | Current message rate in Hz |
| Expected Rate | Inferred or configured expected rate in Hz |
| Avg Latency | Average inter-message interval in ms |
| Drops | Total number of detected dropped messages |
| Last Message | Timestamp of the most recent message |

Customizing Expected Rates

By default, the panel infers expected rates from the first 100 messages on each topic. For topics with variable rates (such as event-driven triggers), this inference may not be accurate. Override the expected rate for specific topics in the panel settings:
{
  "expected_rates": {
    "/lidar/points": 10.0,
    "/camera/front/image": 30.0,
    "/imu/data": 200.0,
    "/robot/state": null
  }
}
Setting a topic’s expected rate to null disables health monitoring for that topic.
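
One way to read these settings is sketched below: an explicit number replaces the inferred rate, null turns monitoring off, and topics that are not listed fall back to inference. The helper `resolve_expected_rate` is hypothetical and only illustrates the precedence.

```python
import json
from typing import Optional, Tuple

SETTINGS = json.loads("""
{
  "expected_rates": {
    "/lidar/points": 10.0,
    "/camera/front/image": 30.0,
    "/imu/data": 200.0,
    "/robot/state": null
  }
}
""")


def resolve_expected_rate(topic: str,
                          inferred_hz: Optional[float],
                          overrides: dict) -> Tuple[bool, Optional[float]]:
    """Return (monitored, expected_hz) for a topic."""
    if topic in overrides:
        override = overrides[topic]
        if override is None:
            return (False, None)       # JSON null disables health monitoring
        return (True, float(override))  # explicit override wins over inference
    return (True, inferred_hz)          # no override: keep the inferred rate
```

Checking `topic in overrides` before reading the value matters here, because a missing key and an explicit null mean different things.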

Configuration

Customize the Diagnostics panel through the settings pane.

Display Settings

| Setting | Type | Default | Description |
| --- | --- | --- | --- |
| visible_metrics | multi-select | All | Choose which metric categories to display |
| refresh_rate | number | 1000 | How often to update gauges, in milliseconds |
| sparkline_window | number | 30 | Duration of the sparkline history, in seconds |
| show_topic_health | boolean | true | Show the topic health table below the gauges |

Thresholds

Set warning and critical thresholds for each metric. When a value crosses a threshold, the gauge changes color and a marker is placed on the timeline.
| Threshold Level | Gauge Color | Timeline Marker |
| --- | --- | --- |
| Normal | Green | None |
| Warning | Yellow | Yellow diamond |
| Critical | Red | Red diamond |

Example threshold configuration:
{
  "cpu": {
    "warning": 80,
    "critical": 95
  },
  "memory": {
    "warning": 75,
    "critical": 90
  },
  "gpu_temperature": {
    "warning": 80,
    "critical": 95
  }
}
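
As a rough sketch of how such a configuration could be evaluated, the function below maps a value to a threshold level. It assumes that "crossing" a threshold means the value is at or above it, which this page does not state explicitly; the names `threshold_level` and `MARKERS` are hypothetical.

```python
THRESHOLDS = {
    "cpu": {"warning": 80, "critical": 95},
    "memory": {"warning": 75, "critical": 90},
    "gpu_temperature": {"warning": 80, "critical": 95},
}

# Timeline marker per level, following the table above.
MARKERS = {"normal": None, "warning": "yellow diamond", "critical": "red diamond"}


def threshold_level(metric: str, value: float, thresholds: dict = THRESHOLDS) -> str:
    """Return "normal", "warning", or "critical" for a metric value."""
    limits = thresholds.get(metric)
    if limits is None:
        return "normal"  # no thresholds configured for this metric
    # Check critical first so it takes precedence over warning.
    # Assumption: a value equal to the threshold counts as crossing it.
    if value >= limits["critical"]:
        return "critical"
    if value >= limits["warning"]:
        return "warning"
    return "normal"
```

With the example configuration, a CPU reading of 96.3 evaluates to critical and would be marked with a red diamond.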

Layout

The Diagnostics panel can be placed in any slot of the multi-window layout. A common arrangement is to dock it along the bottom of the viewer alongside the Plot and Log panels, providing a full system health overview beneath the primary 3D or image views.

Compact Mode

When the panel is docked in a narrow strip, it automatically switches to compact mode. Compact mode shows only the gauge values without sparklines, fitting more metrics into a smaller area. Expand the panel to restore the full view with sparklines and the topic health table.

Timeline Synchronization

The Diagnostics panel is synchronized with all other panels through the shared timeline:
  • Playback updates gauges and sparklines to show metric values at the current timestamp
  • Seeking jumps all metric displays to the selected point in time
  • Frame stepping advances diagnostics data one sample at a time
  • Threshold markers on the timeline are clickable — click a marker to seek to that event
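
Seeking can be pictured as a lookup of the latest sample at or before the playhead. The sketch below assumes that behavior (this page does not specify it) and assumes samples are stored as parallel, time-sorted lists; `sample_at` is a hypothetical helper.

```python
from bisect import bisect_right
from typing import Optional, Sequence


def sample_at(timestamps: Sequence[float],
              values: Sequence[float],
              t: float) -> Optional[float]:
    """Return the most recent value at or before time t, or None if t precedes all samples.

    `timestamps` must be sorted in ascending order.
    """
    # bisect_right gives the insertion point after any sample equal to t,
    # so index - 1 is the last sample with timestamp <= t.
    index = bisect_right(timestamps, t) - 1
    return values[index] if index >= 0 else None
```

A binary search keeps each seek at O(log n), which matters when scrubbing through long recordings.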

Alert Integration

Connect diagnostics thresholds to fleet-level alert channels. When a metric crosses its configured threshold during a recording, Avala creates a timeline event and optionally triggers a notification.

Configuring Alerts

  1. Open the Diagnostics panel settings and navigate to the Alerts tab.
  2. Select the metric and threshold level that should trigger an alert.
  3. Choose a notification channel (Slack, email, or webhook).
  4. Set the cooldown period to avoid duplicate alerts for sustained threshold violations.
| Parameter | Description | Default |
| --- | --- | --- |
| channel | Notification destination | None |
| cooldown_seconds | Minimum seconds between repeated alerts | 300 |
| include_snapshot | Attach a screenshot of the viewer at the time of the alert | false |

Alert integration requires an active Fleet subscription. Timeline events are created for all users regardless of subscription.
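
The cooldown behavior can be sketched as a small gate that suppresses repeats. This is an illustration under assumptions: the cooldown is tracked per metric and threshold level, and time is passed in as seconds. The class `AlertCooldown` is hypothetical.

```python
class AlertCooldown:
    """Suppress repeated alerts for the same metric and level within a cooldown window."""

    def __init__(self, cooldown_seconds: float = 300.0):
        self.cooldown_seconds = cooldown_seconds
        self._last_fired = {}  # (metric, level) -> time of the last alert sent

    def should_fire(self, metric: str, level: str, now: float) -> bool:
        key = (metric, level)
        last = self._last_fired.get(key)
        if last is not None and now - last < self.cooldown_seconds:
            return False  # still inside the cooldown window
        self._last_fired[key] = now
        return True
```

With the default of 300 seconds, a sustained CPU violation would notify once, then stay quiet until five minutes have passed since the last notification.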

Alert Payload

When an alert fires, the notification includes:
  • Device name and recording ID
  • Metric name, current value, and threshold that was crossed
  • Timestamp of the event
  • Direct link to the recording at the exact timestamp

Example: Slack Alert

{
  "device": "robot-alpha-01",
  "recording": "rec_11223344-5566-7788-99aa-bbccddeeff00",
  "metric": "cpu.aggregate_percent",
  "value": 96.3,
  "threshold": "critical",
  "threshold_value": 95,
  "timestamp": "2026-02-26T10:03:22Z",
  "viewer_url": "https://avala.ai/recordings/rec_11223344?t=1740567802"
}
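
A receiver for this payload might parse it into a typed object before formatting a message. The sketch below assumes the field names shown in the example above and nothing more; `DiagnosticsAlert`, `parse_alert`, and `summarize` are hypothetical names, and the summary format is only an illustration.

```python
import json
from dataclasses import dataclass
from datetime import datetime


@dataclass
class DiagnosticsAlert:
    device: str
    recording: str
    metric: str
    value: float
    threshold: str
    threshold_value: float
    timestamp: datetime
    viewer_url: str


def parse_alert(body: str) -> DiagnosticsAlert:
    """Parse a JSON alert payload; raises if a field is missing or unexpected."""
    data = json.loads(body)
    # Replace the trailing "Z" so fromisoformat also works before Python 3.11.
    data["timestamp"] = datetime.fromisoformat(data["timestamp"].replace("Z", "+00:00"))
    return DiagnosticsAlert(**data)


def summarize(alert: DiagnosticsAlert) -> str:
    """Build a one-line, human-readable summary of the alert."""
    return (f"[{alert.threshold.upper()}] {alert.device}: "
            f"{alert.metric} = {alert.value} (threshold {alert.threshold_value})")


SAMPLE = """{
  "device": "robot-alpha-01",
  "recording": "rec_11223344-5566-7788-99aa-bbccddeeff00",
  "metric": "cpu.aggregate_percent",
  "value": 96.3,
  "threshold": "critical",
  "threshold_value": 95,
  "timestamp": "2026-02-26T10:03:22Z",
  "viewer_url": "https://avala.ai/recordings/rec_11223344?t=1740567802"
}"""
```

Calling `summarize(parse_alert(SAMPLE))` yields a single line suitable for a chat message or log entry.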

Exporting Diagnostics Data

Export diagnostics metrics from a recording as CSV or JSON for offline analysis. Open the panel settings menu and select Export Metrics. Choose the time range, metrics to include, and output format.
| Format | Description |
| --- | --- |
| CSV | One row per sample, columns for each metric field |
| JSON | Array of timestamped metric objects |

Exported files contain all raw metric values at the original sample rate. The panel's display refresh rate does not downsample exported data.

Keyboard Shortcuts

| Shortcut | Action |
| --- | --- |
| M | Cycle through metric categories |
| T | Toggle the topic health table |
| C | Toggle compact mode |
| R | Reset all gauges to auto-scale |
| E | Open the export dialog |

Use Cases

CPU monitoring during manipulation

Monitor robot CPU usage during complex manipulation tasks to identify compute bottlenecks that cause control loop delays.

Network latency debugging

Track network latency and packet loss to identify data transmission issues between the robot and remote operator station.

GPU thermal analysis

Correlate GPU temperature spikes with rendering quality drops or inference slowdowns in on-device perception pipelines.

Battery discharge profiling

Profile battery discharge rates across different operational modes to optimize mission duration and plan recharging schedules.

Place the Diagnostics panel next to the Plot panel to view system health metrics alongside sensor signals. This makes it easier to correlate a CPU spike with a dropped LiDAR frame, or a network latency spike with a delayed camera image.
