The quality of your visualization and annotation experience depends directly on how your data is recorded. This page covers practical recommendations for recording sensor data that uploads reliably, visualizes correctly, and annotates efficiently in Avala.

Use MCAP Format

MCAP is the recommended container format for multi-sensor recordings. It has the lowest container overhead of any robotics recording format and supports all major serialization frameworks.
| Format | Overhead | Serialization Support | Schema Embedded |
|---|---|---|---|
| MCAP | Minimal | Protobuf, JSON, ROS 1, ROS 2, FlatBuffers | Yes |
| ROS 1 Bag | Moderate | ROS 1 only | Yes |
| ROS 2 Bag | Moderate (SQLite) | ROS 2 only | Yes |
Avala accepts ROS 1 .bag and ROS 2 .db3 files and converts them to MCAP during upload. However, recording directly to MCAP avoids the conversion step, preserves all schema metadata, and produces smaller files.
If you are using ROS 2, record with the MCAP storage plugin:
ros2 bag record -s mcap --all
For ROS 1, convert existing bags with the mcap CLI:
mcap convert input.bag output.mcap

Include Transform Messages

Avala resolves coordinate frames from TF messages in your recording. Without transforms, the viewer cannot project LiDAR onto camera images or align sensors in the 3D view. Record both static and dynamic transform topics:
| Topic | Message Type | Purpose |
|---|---|---|
| /tf_static | tf2_msgs/TFMessage | Fixed sensor mounting positions (published once) |
| /tf | tf2_msgs/TFMessage | Dynamic transforms (vehicle odometry, joint states) |
Missing /tf_static messages are the most common cause of LiDAR-to-camera projection failures. If your projection appears misaligned or absent, verify that static transforms are present in the recording.

Verify Your Transform Tree

Before uploading, check that your recording contains a complete transform chain from the LiDAR frame to each camera frame:
# Show topic and schema summary for an MCAP file
mcap info recording.mcap

# Use ROS tools to visualize the transform tree from a live system
ros2 run tf2_tools view_frames
A typical transform tree for an autonomous vehicle:
base_link
├── lidar_top
├── camera_front
├── camera_front_left
├── camera_front_right
├── camera_rear
├── camera_rear_left
├── camera_rear_right
└── imu_link
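Beyond visual inspection, the parent-child pairs recorded in /tf_static can be checked programmatically for a complete chain between the LiDAR frame and each camera frame. A minimal sketch (the frame names and the incomplete edge list are illustrative; TF transforms are invertible, so the tree is treated as an undirected graph):

```python
from collections import defaultdict, deque

def has_chain(edges, frame_a, frame_b):
    """Return True if frame_a and frame_b are connected in the TF tree.

    edges: iterable of (parent, child) frame-name pairs, e.g. collected
    from the transforms in /tf_static messages.
    """
    graph = defaultdict(set)
    for parent, child in edges:
        graph[parent].add(child)
        graph[child].add(parent)  # transforms can be inverted
    seen, queue = {frame_a}, deque([frame_a])
    while queue:
        node = queue.popleft()
        if node == frame_b:
            return True
        for nxt in graph[node] - seen:
            seen.add(nxt)
            queue.append(nxt)
    return False

# Example: a recording that is missing the rear camera's static transform
static_tf = [
    ("base_link", "lidar_top"),
    ("base_link", "camera_front"),
    ("base_link", "imu_link"),
]
print(has_chain(static_tf, "lidar_top", "camera_front"))  # True
print(has_chain(static_tf, "lidar_top", "camera_rear"))   # False
```

A `False` result for any LiDAR-to-camera pair predicts exactly the projection failures described above.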

Record Compressed Images

Uncompressed images dramatically increase file size and upload time. Use compressed image topics whenever possible.
| Format | Typical Size (1920x1080) | Recommended |
|---|---|---|
| sensor_msgs/Image (raw) | ~6 MB per frame | No |
| sensor_msgs/CompressedImage (JPEG) | ~200-400 KB per frame | Yes |
| foxglove.CompressedImage (JPEG) | ~200-400 KB per frame | Yes |
At 30 Hz across 6 cameras, raw images produce ~1 GB per second. JPEG compression reduces this to ~50-70 MB per second, roughly a 15-20x reduction.
JPEG compression at quality 85-95% is visually lossless for most annotation tasks. Lower quality (below 80%) may introduce artifacts that affect annotation accuracy, especially for small objects and fine boundaries.
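The bandwidth arithmetic behind these numbers is straightforward to reproduce (the ~300 KB JPEG frame size below is one point in the 200-400 KB range quoted above):

```python
def data_rate_mb_per_s(frame_bytes, hz, num_cameras):
    """Aggregate image data rate in MB/s across all cameras."""
    return frame_bytes * hz * num_cameras / 1e6

raw = data_rate_mb_per_s(6_000_000, 30, 6)  # raw 1920x1080 frames
jpeg = data_rate_mb_per_s(300_000, 30, 6)   # ~300 KB JPEG frames
print(f"raw: {raw:.0f} MB/s, jpeg: {jpeg:.0f} MB/s, "
      f"ratio: {raw / jpeg:.0f}x")
# raw: 1080 MB/s, jpeg: 54 MB/s, ratio: 20x
```

A one-hour recording session at these rates is the difference between roughly 4 TB of raw images and under 200 GB compressed.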

Use Consistent Topic Naming

Consistent naming makes it easier to manage recordings across vehicles, sessions, and time periods. The viewer’s auto-detection works best with predictable topic names. Recommended naming convention:
/sensors/{sensor_type}/{position}/{data_type}
Examples:
| Topic | Sensor |
|---|---|
| /sensors/camera/front/compressed | Front camera (compressed image) |
| /sensors/camera/front/camera_info | Front camera intrinsics |
| /sensors/lidar/top/points | Top LiDAR point cloud |
| /sensors/imu/data | IMU readings |
| /sensors/gps/fix | GPS position |
For multi-vehicle fleets, prefix topics with a vehicle identifier or store the vehicle ID in MCAP metadata. This makes it easy to filter and organize recordings by vehicle: /vehicle_01/sensors/camera/front/compressed.
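One payoff of a predictable convention is that topic names become machine-parseable, for example when grouping recordings by sensor in tooling. A sketch assuming the `/sensors/{sensor_type}/{position}/{data_type}` layout above (the handling of position-less topics like /sensors/imu/data is an assumption of this sketch):

```python
def parse_topic(topic):
    """Split a topic following /sensors/{sensor_type}/{position}/{data_type}.

    Topics without a position segment (e.g. /sensors/imu/data) map
    position to None. Returns None for topics outside the convention.
    """
    parts = topic.strip("/").split("/")
    if not parts or parts[0] != "sensors":
        return None
    if len(parts) == 4:
        return {"sensor_type": parts[1], "position": parts[2],
                "data_type": parts[3]}
    if len(parts) == 3:
        return {"sensor_type": parts[1], "position": None,
                "data_type": parts[2]}
    return None

print(parse_topic("/sensors/camera/front/compressed"))
# {'sensor_type': 'camera', 'position': 'front', 'data_type': 'compressed'}
print(parse_topic("/rosout"))  # None: outside the convention
```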

Set Accurate Timestamps

Avala uses message timestamps (the header.stamp field in ROS messages) for synchronization across panels. All sensors in a recording must share a common time reference for the viewer to align them correctly.

Timestamp Best Practices

  • Use a hardware time sync (PTP or GPS-disciplined clock) to synchronize sensor clocks before recording
  • Set header.stamp to capture time, not publish time — the timestamp should reflect when the sensor observation occurred, not when the message was sent
  • Avoid using rospy.Time.now() or rclpy.clock for sensor timestamps unless the system clock is synchronized to the sensor clock
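The capture-time rule can be made concrete: populate header.stamp from the sensor's hardware timestamp, not the wall clock at publish. A minimal sketch using the ROS 2 sec/nanosec convention (the dict shape stands in for a builtin_interfaces/Time field):

```python
def to_stamp(capture_time_ns):
    """Convert a hardware capture timestamp (nanoseconds) to a
    ROS-style header.stamp with sec/nanosec fields."""
    return {"sec": capture_time_ns // 1_000_000_000,
            "nanosec": capture_time_ns % 1_000_000_000}

# Use the timestamp reported by the sensor driver at capture,
# not time.time() at publish:
stamp = to_stamp(1_700_000_000_123_456_789)
print(stamp)  # {'sec': 1700000000, 'nanosec': 123456789}
```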

Clock Skew

If sensors have different clocks and timestamps drift apart, the viewer will display data from different real-world moments in the same frame. Common symptoms:
  • Camera images appear to “lag” behind LiDAR data
  • LiDAR projection is offset from where objects appear in the image
  • Object positions jump when stepping frame-by-frame
If you see these issues, check the timestamp alignment in your recording before re-uploading.
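A quick way to quantify skew before re-uploading is to compare each sensor's timestamps against another sensor's nearest timestamps. A sketch, assuming both streams are sorted and nominally aligned (the 5 ms offset in the example is synthetic):

```python
import bisect

def mean_offset_ms(stamps_a, stamps_b):
    """Mean offset (ms) from each timestamp in stamps_a to its nearest
    neighbor in stamps_b. Both inputs are sorted lists of seconds.
    A large or steadily growing value suggests clock skew."""
    offsets = []
    for t in stamps_a:
        i = bisect.bisect_left(stamps_b, t)
        nearby = [stamps_b[j] for j in (i - 1, i) if 0 <= j < len(stamps_b)]
        offsets.append(min(abs(t - c) for c in nearby) * 1000)
    return sum(offsets) / len(offsets)

camera = [0.000, 0.033, 0.066, 0.100]  # 30 Hz camera stamps
lidar  = [0.005, 0.038, 0.071, 0.105]  # same sensor rate, 5 ms offset
print(f"{mean_offset_ms(camera, lidar):.1f} ms")  # 5.0 ms
```

An offset that stays within one frame period is usually harmless; an offset that grows over the recording indicates drifting clocks and calls for hardware time sync.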

Embed Schema Definitions

MCAP files can embed schema definitions alongside the data. When schemas are embedded, Avala can decode messages without any external schema files.

For Protobuf messages, embed the .proto file descriptor set when writing the MCAP. Most MCAP writer libraries (C++, Python, Go) include schema embedding by default; check your recording library's documentation to confirm it is enabled.

For ROS messages, schemas are embedded by default in both .bag and .db3 formats. When converting to MCAP, use the mcap convert command to preserve them.
If schemas are not embedded, the viewer falls back to keyword-based topic detection and may assign incorrect panel types. Embedding schemas ensures reliable decoding and correct panel assignment.
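As a pre-upload sanity check, you can verify that every channel references an embedded schema. A sketch against a simplified summary structure (the dict shape below is an illustrative stand-in for an MCAP file's channel/schema index, not a library API):

```python
def channels_missing_schemas(summary):
    """Return topics whose channel does not reference an embedded schema.

    summary: simplified stand-in for an MCAP summary section:
    {"schemas": {schema_id: name}, "channels": [{"topic", "schema_id"}]}
    """
    schemas = summary["schemas"]
    return [ch["topic"] for ch in summary["channels"]
            if ch["schema_id"] not in schemas]

summary = {
    "schemas": {1: "sensor_msgs/msg/CompressedImage"},
    "channels": [
        {"topic": "/sensors/camera/front/compressed", "schema_id": 1},
        {"topic": "/sensors/lidar/top/points", "schema_id": 2},  # no schema 2
    ],
}
print(channels_missing_schemas(summary))  # ['/sensors/lidar/top/points']
```

Any topic reported here would fall back to keyword-based detection in the viewer.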

File Size Guidelines

| Data Type | Recommendation |
|---|---|
| MCAP | Keep recordings under a few GB per file. Split long recordings into shorter segments. |
| Point Cloud (PCD/PLY) | Downsample dense scans if file sizes exceed a few hundred MB. |
| Video (MP4/MOV) | Split long videos into shorter clips. |
| Image (JPEG/PNG/WebP) | Use JPEG for photos, PNG for synthetic data. |
Large MCAP files may take several minutes to process after upload. Splitting long recordings into shorter segments (2-5 minutes each) improves upload reliability and reduces processing time.
Refer to the mcap CLI documentation for tools to inspect and split MCAP files.
Splitting a recording creates new files with independent timelines. Object tracking annotations do not carry across split boundaries. Choose split points during low-activity moments (e.g., vehicle stopped at an intersection) to minimize tracking disruption.
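Picking quiet split points can be automated if you have a per-second activity signal (for example, tracked-object counts). A sketch, with the target segment length and search window as tunable assumptions:

```python
def choose_split_points(activity, target_len_s, window_s=30):
    """Pick split points near multiples of target_len_s at low activity.

    activity: per-second activity scores (e.g. tracked-object counts).
    Searches +/- window_s around each nominal boundary for the
    quietest second, so tracks are less likely to span a split.
    """
    points, boundary = [], target_len_s
    while boundary < len(activity):
        lo = max(0, boundary - window_s)
        hi = min(len(activity), boundary + window_s)
        quietest = min(range(lo, hi), key=lambda s: activity[s])
        points.append(quietest)
        boundary += target_len_s
    return points

# 600 s recording, 3-minute target segments; the vehicle is stopped
# (activity 0) at seconds 175, 350, and 540
activity = [5] * 600
for quiet_s in (175, 350, 540):
    activity[quiet_s] = 0
print(choose_split_points(activity, target_len_s=180))  # [175, 350, 540]
```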

Naming Conventions for Multi-Vehicle Datasets

When collecting data across a fleet, adopt a naming convention that encodes key metadata:
{vehicle_id}_{date}_{route}_{segment}.mcap
Examples:
av_042_20260215_highway_101_seg_001.mcap
av_042_20260215_highway_101_seg_002.mcap
av_017_20260215_downtown_sf_seg_001.mcap
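A convention like this is easy to validate and parse in ingestion scripts. A sketch matching the `{vehicle_id}_{date}_{route}_{segment}.mcap` pattern above (the exact field patterns, such as `seg_` prefixes, are assumptions based on the examples):

```python
import re

FILENAME_RE = re.compile(
    r"^(?P<vehicle_id>[a-z]+_\d+)_(?P<date>\d{8})_"
    r"(?P<route>.+)_(?P<segment>seg_\d+)\.mcap$"
)

def parse_recording_name(name):
    """Extract vehicle, date, route, and segment from a recording
    filename, or None if it does not follow the convention."""
    m = FILENAME_RE.match(name)
    return m.groupdict() if m else None

print(parse_recording_name("av_042_20260215_highway_101_seg_001.mcap"))
# {'vehicle_id': 'av_042', 'date': '20260215',
#  'route': 'highway_101', 'segment': 'seg_001'}
```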
You can also store metadata (vehicle ID, route, driver, weather conditions) in MCAP file-level metadata fields. Avala reads these and makes them available as filterable metadata on dataset items.

Pre-Upload Checklist

Before uploading recordings to Avala, verify:
  • Recording is in MCAP format (or ROS bag for automatic conversion)
  • /tf_static and /tf topics are present with complete transform chains
  • Camera images use compressed format (CompressedImage)
  • CameraInfo messages are published for each camera topic
  • Message timestamps are accurate and synchronized across sensors
  • Schema definitions are embedded in the MCAP file
  • Large recordings are split into shorter segments for faster processing
  • Topic names follow a consistent naming convention
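Several of the items above can be checked mechanically from a recording's topic list (for example, as reported by `mcap info`). A sketch, where the specific topic-name patterns checked are assumptions based on the conventions in this guide:

```python
def preupload_issues(topics):
    """Flag checklist problems detectable from the topic list alone.

    topics: set of topic names present in the recording.
    """
    issues = []
    if "/tf_static" not in topics:
        issues.append("missing /tf_static (static sensor transforms)")
    if "/tf" not in topics:
        issues.append("missing /tf (dynamic transforms)")
    cameras = {t for t in topics
               if "/camera/" in t and t.endswith("/compressed")}
    for cam in cameras:
        info = cam.rsplit("/", 1)[0] + "/camera_info"
        if info not in topics:
            issues.append(f"no CameraInfo topic for {cam}")
    raw = sorted(t for t in topics if t.endswith("/image_raw"))
    if raw:
        issues.append(f"uncompressed image topics present: {raw}")
    return issues

topics = {"/tf", "/sensors/camera/front/compressed",
          "/sensors/lidar/top/points"}
for issue in preupload_issues(topics):
    print(issue)
```

Checks that need message contents rather than topic names, such as timestamp synchronization and embedded schemas, still require inspecting the file itself.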

Next Steps