Avala can call your ML models to generate pre-annotations, turning a blank canvas into a head start for human annotators. Connect a SageMaker endpoint or any custom HTTP model server, and Avala will send assets to it, receive predictions, and render them as editable annotations in the labeling editor.

Supported Providers

| Provider | Status | Description |
| --- | --- | --- |
| Amazon SageMaker | Available | Managed inference endpoints with IAM-based authentication. Currently supports SAM (Segment Anything Model) and YOLO. |
| Custom HTTP Endpoint | Coming Soon | Any HTTP server that accepts a POST request and returns predictions in Avala's format. |
Custom HTTP endpoint support is under development. Inference is currently available through Amazon SageMaker with built-in SAM and YOLO models. The self-service configuration UI described below is planned; for now, model setup is handled by the Avala team during onboarding.

Amazon SageMaker Setup

IAM Role

Create an IAM role that allows Avala to invoke your SageMaker endpoint:
```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "sagemaker:InvokeEndpoint",
      "Resource": "arn:aws:sagemaker:us-east-1:YOUR_ACCOUNT_ID:endpoint/your-endpoint-name"
    }
  ]
}
```
Set up a trust relationship so Avala’s AWS account can assume this role. The Avala account ID is provided in Mission Control during configuration.
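A trust policy for this typically looks like the following sketch, where `AVALA_ACCOUNT_ID` is a placeholder for the account ID shown in Mission Control:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::AVALA_ACCOUNT_ID:root"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
```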

Endpoint Configuration

Your SageMaker endpoint must accept image or point cloud data and return predictions in Avala’s annotation format (see Prediction Response Format below).

Connect in Mission Control

  1. Go to Mission Control > Settings > Inference.
  2. Click Add Provider and select Amazon SageMaker.
  3. Enter the Endpoint Name and Region.
  4. Provide the IAM Role ARN that Avala should assume.
  5. Click Test Connection to verify Avala can invoke the endpoint.
  6. Save the configuration.

Custom HTTP Endpoint Setup

If you are running your own model server (TorchServe, Triton, BentoML, a plain Flask app, etc.), you can connect it directly.

Request Format

Avala sends a POST request to your endpoint with the following JSON body:
```json
{
  "asset_id": "asset_abc123",
  "asset_url": "https://signed-url-to-the-image-or-pointcloud",
  "asset_type": "image",
  "width": 1920,
  "height": 1080,
  "labels": ["car", "pedestrian", "cyclist", "truck"]
}
```
| Field | Type | Description |
| --- | --- | --- |
| `asset_id` | string | Unique identifier for the asset. |
| `asset_url` | string | Signed URL to download the asset. Valid for 1 hour. |
| `asset_type` | string | One of `image`, `point_cloud`, or `video_frame`. |
| `width` | integer | Width in pixels (images and video frames only). |
| `height` | integer | Height in pixels (images and video frames only). |
| `labels` | string[] | The label set configured for the project. Your model should return predictions using these labels. |

Prediction Response Format

Your endpoint must return a JSON response with an annotations array:
```json
{
  "annotations": [
    {
      "type": "bounding_box",
      "label": "car",
      "confidence": 0.94,
      "coordinates": {
        "x": 120,
        "y": 340,
        "width": 200,
        "height": 150
      }
    },
    {
      "type": "polygon",
      "label": "pedestrian",
      "confidence": 0.87,
      "points": [
        [450, 200],
        [470, 200],
        [480, 350],
        [440, 350]
      ]
    }
  ]
}
```
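To illustrate the request/response contract, here is a minimal prediction function such a server might wrap in its POST handler. This is a hedged sketch: the function name is hypothetical, and the hard-coded detection stands in for real model output.

```python
from typing import Any


def handle_inference_request(payload: dict[str, Any]) -> dict[str, Any]:
    """Toy handler for an Avala pre-annotation request.

    `payload` follows Avala's request format (asset_id, asset_url,
    asset_type, width, height, labels). A real server would download
    the asset from `asset_url` and run a model; this stub returns one
    hard-coded bounding box as a stand-in for model output.
    """
    labels = payload.get("labels", [])
    annotations = []
    if labels:
        # Emit a single detection using the first configured label,
        # so the response uses only labels from the project's label set.
        annotations.append({
            "type": "bounding_box",
            "label": labels[0],
            "confidence": 0.9,
            "coordinates": {"x": 120, "y": 340, "width": 200, "height": 150},
        })
    return {"annotations": annotations}
```

A real handler would replace the stub body with model inference, but the returned shape — a top-level `annotations` array of typed prediction objects — is what Avala expects.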

Connect in Mission Control

  1. Go to Mission Control > Settings > Inference.
  2. Click Add Provider and select Custom HTTP Endpoint.
  3. Enter the Endpoint URL (must be HTTPS).
  4. Optionally configure Authentication (Bearer token or custom header).
  5. Set the Timeout (default: 30 seconds).
  6. Click Test Connection to verify Avala can reach the endpoint.
  7. Save the configuration.

Supported Prediction Types

| Type | Key | Description |
| --- | --- | --- |
| Bounding Box | `bounding_box` | Axis-aligned rectangle defined by `x`, `y`, `width`, `height`. |
| Polygon | `polygon` | Closed polygon defined by an ordered array of `[x, y]` points. |
| Segmentation Mask | `segmentation_mask` | Pixel-level mask as a run-length encoded (RLE) string or a URL to a PNG mask image. |
| Classification | `classification` | Asset-level or frame-level label with a confidence score. |
| 3D Bounding Box | `bounding_box_3d` | Cuboid in 3D space defined by center (x, y, z), dimensions (l, w, h), and rotation (quaternion). |
| Polyline | `polyline` | Open polyline defined by an ordered array of `[x, y]` points. Used for lanes, edges, etc. |
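Before connecting a custom model, it can help to sanity-check its output against these types. The sketch below is an illustrative validator, not an official Avala utility; the required key names for `segmentation_mask` and `bounding_box_3d` are assumptions, since the exact field names for those types are not shown above.

```python
# Required payload key for each supported prediction type.
# The "mask" and "center" entries are assumed key names (not
# confirmed by the docs); the others match the examples above.
REQUIRED_KEY = {
    "bounding_box": "coordinates",
    "polygon": "points",
    "segmentation_mask": "mask",
    "classification": "label",
    "bounding_box_3d": "center",
    "polyline": "points",
}


def validate_annotation(ann: dict) -> list[str]:
    """Return a list of problems found in one annotation dict."""
    problems = []
    ann_type = ann.get("type")
    if ann_type not in REQUIRED_KEY:
        problems.append(f"unsupported type: {ann_type!r}")
    elif REQUIRED_KEY[ann_type] not in ann:
        problems.append(f"missing {REQUIRED_KEY[ann_type]!r} for {ann_type}")
    if "label" not in ann:
        problems.append("missing label")
    conf = ann.get("confidence")
    if not isinstance(conf, (int, float)) or not 0.0 <= conf <= 1.0:
        problems.append("confidence must be a number in [0, 1]")
    return problems
```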

Auto-Labeling Workflow

Once a provider is connected, you can use it to pre-annotate tasks:
  1. Select a project in Mission Control and open Settings > Auto-Label.
  2. Choose the Inference Provider to use.
  3. Configure the Confidence Threshold. Predictions below this threshold are discarded (default: 0.5).
  4. Click Run Auto-Label to send all unlabeled assets in the project to the model.
  5. Avala displays the predictions as pre-annotations in the labeling editor.
  6. Annotators review each prediction — they can accept it as-is, adjust it, or delete it.
  7. Once reviewed, the task is submitted normally through the project workflow.
You can also trigger auto-labeling via the API:
```bash
curl -X POST https://server.avala.ai/api/v1/projects/{project_id}/auto-label \
  -H "X-Avala-Api-Key: your-api-key" \
  -H "Content-Type: application/json" \
  -d '{"provider_id": "inf_abc123", "confidence_threshold": 0.6}'
```
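The same call can be made from Python with the standard library. This sketch only builds the request (the project ID and API key are placeholders); uncommenting the final line would actually send it:

```python
import json
import urllib.request

API_KEY = "your-api-key"   # from Mission Control
PROJECT_ID = "proj_123"    # placeholder project id
URL = f"https://server.avala.ai/api/v1/projects/{PROJECT_ID}/auto-label"

body = {"provider_id": "inf_abc123", "confidence_threshold": 0.6}
req = urllib.request.Request(
    URL,
    data=json.dumps(body).encode("utf-8"),
    headers={
        "X-Avala-Api-Key": API_KEY,
        "Content-Type": "application/json",
    },
    method="POST",
)
# urllib.request.urlopen(req)  # uncomment to actually send the request
```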
Model predictions are always treated as suggestions. Every prediction must be reviewed and either accepted or corrected by a human annotator before it becomes a final annotation. This ensures your labeled data meets quality standards even when using AI assistance.