Avala supports five data modalities, each with purpose-built annotation workflows and tooling. This page covers supported formats, capabilities, and upload requirements for each type.

Image

Single-frame images are the most common data type for object detection, classification, and segmentation tasks.

Supported formats: JPEG, PNG, WebP

Annotation workflow: Each image is annotated independently as a single frame. All 2D annotation tools are available.

Use cases: Object detection, instance segmentation, semantic segmentation, image classification, keypoint detection.

Video

Video files are automatically converted to frame sequences on upload, enabling frame-by-frame annotation with object tracking across frames.

Supported formats: MP4, MOV

Annotation workflow: Videos are split into individual frames grouped as a sequence. Annotators navigate frame-by-frame and can track objects across the timeline. Object IDs persist across frames for consistent tracking.

Use cases: Object tracking, action recognition, temporal event detection, driving scene labeling.
Video processing happens in the background after upload. Large videos may take several minutes to convert. You can monitor sequence status in Mission Control or via the API.
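Because each video becomes a frame sequence, you may need to map between frame indices and timestamps in the source footage, for example when cross-referencing annotations with the original recording. A small helper, assuming a constant frame rate (illustrative, not an Avala API):

```python
def frame_to_timestamp(frame_index: int, fps: float) -> float:
    """Timestamp in seconds of a frame in the source video, assuming constant fps."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return frame_index / fps

def timestamp_to_frame(seconds: float, fps: float) -> int:
    """Nearest frame index for a source timestamp, assuming constant fps."""
    return round(seconds * fps)
```

Note that videos with variable frame rates would need the per-frame timestamps from the container instead of this constant-fps arithmetic.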

LiDAR / Point Cloud

3D point cloud data from LiDAR sensors, used for 3D object detection and scene understanding.

Supported formats: PCD, PLY

Annotation workflow: Point clouds are rendered in a 3D viewer with bird's-eye view, perspective view, and side views. Annotators place 3D cuboids with full position, dimension, and rotation control.

Use cases: 3D object detection, autonomous driving perception, robotics navigation, scene reconstruction.
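A cuboid parameterized by center position, dimensions, and rotation expands to eight corner points. The sketch below assumes rotation only about the vertical (z) axis, a common simplification in driving datasets; it illustrates the geometry, not Avala's internal representation:

```python
import math

def cuboid_corners(cx, cy, cz, length, width, height, yaw):
    """Return the 8 corners of a cuboid centered at (cx, cy, cz) with the
    given dimensions, rotated by `yaw` radians about the vertical (z) axis."""
    cos_y, sin_y = math.cos(yaw), math.sin(yaw)
    corners = []
    for dx in (length / 2, -length / 2):
        for dy in (width / 2, -width / 2):
            for dz in (height / 2, -height / 2):
                # rotate the local offset about z, then translate to the center
                corners.append((
                    cx + dx * cos_y - dy * sin_y,
                    cy + dx * sin_y + dy * cos_y,
                    cz + dz,
                ))
    return corners
```

With `yaw = 0` the result is an axis-aligned box, which makes the corner positions easy to verify by hand.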

MCAP

MCAP is a multi-sensor container format commonly used in robotics and autonomous vehicle development. It packages camera images, LiDAR scans, IMU data, and other sensor streams into a single recording.

Supported formats: MCAP (with ROS message support)

Annotation workflow: Avala parses MCAP files to extract and synchronize sensor streams. Camera images are displayed alongside projected LiDAR data, enabling multi-camera annotation with 3D context. Annotators can work across camera views with consistent 3D cuboid projections.

Use cases: Multi-sensor fusion, surround-view perception, autonomous vehicle data labeling, robotics sensor calibration.
MCAP support includes automatic extraction of camera intrinsics and extrinsics for accurate LiDAR-to-camera projection. See the MCAP / ROS integration guide for setup details.
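Extrinsics (a rotation R and translation t from the LiDAR frame into the camera frame) and pinhole intrinsics (fx, fy, cx, cy) combine to project a LiDAR point onto the image in the standard way. A minimal sketch of that projection, with illustrative names rather than Avala API calls:

```python
def project_point(p_lidar, R, t, fx, fy, cx, cy):
    """Project a 3D LiDAR point into pixel coordinates.

    R (3x3 nested list) and t (length-3 list) map LiDAR coordinates into
    the camera frame; fx, fy, cx, cy are the pinhole intrinsics. Returns
    (u, v) in pixels, or None if the point is behind the camera."""
    x = sum(R[0][i] * p_lidar[i] for i in range(3)) + t[0]
    y = sum(R[1][i] * p_lidar[i] for i in range(3)) + t[1]
    z = sum(R[2][i] * p_lidar[i] for i in range(3)) + t[2]
    if z <= 0:
        return None  # behind the image plane, no valid projection
    return (fx * x / z + cx, fy * y / z + cy)
```

Real camera models usually also include lens distortion terms, which this sketch omits.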

Splat

3D Gaussian Splat data for annotating reconstructed 3D scenes.

Supported formats: Gaussian Splat

Annotation workflow: Splat scenes are rendered in a 3D viewer where annotators can navigate the reconstructed environment and place 3D annotations directly in the scene.

Use cases: 3D scene understanding, novel view synthesis annotation, spatial AI training data.

Capabilities Comparison

The following table shows which annotation tools are available for each data type:
| Annotation Tool | Image | Video | Point Cloud | MCAP | Splat |
| --------------- | ----- | ----- | ----------- | ---- | ----- |
| Bounding Box    | Yes   | Yes   |             |      |       |
| Polygon         | Yes   | Yes   |             |      |       |
| 3D Cuboid       |       |       | Yes         | Yes  | Yes   |
| Segmentation    | Yes   | Yes   |             |      |       |
| Polyline        | Yes   | Yes   |             |      |       |
| Keypoints       | Yes   | Yes   |             |      |       |
| Classification  | Yes   | Yes   | Yes         | Yes  | Yes   |
| Object Tracking |       | Yes   | Yes         | Yes  |       |

Upload Requirements

| Property                       | Limit            |
| ------------------------------ | ---------------- |
| Max file size (images)         | 20 MB per file   |
| Max file size (video)          | 2 GB per file    |
| Max file size (point cloud)    | 500 MB per file  |
| Max file size (MCAP)           | 5 GB per file    |
| Supported image formats        | JPEG, PNG, WebP  |
| Supported video formats        | MP4, MOV         |
| Supported point cloud formats  | PCD, PLY         |
| Supported multi-sensor formats | MCAP             |
Upload limits may vary depending on your plan. Contact support@avala.ai if you need to upload files that exceed these limits.

Next Steps