Batch Auto-Labeling

Run SAM or YOLO inference across all unlabeled items in a project to generate draft annotations. Review and refine the results instead of labeling from scratch.

Overview

Batch auto-labeling takes a project, runs a model against every unlabeled item, and creates draft annotations that annotators can accept, edit, or discard. Predictions below the confidence threshold are skipped. This saves the most manual labeling time on projects where a pre-trained model already provides a reasonable starting point.

Start batch job
      │
      ▼
Fetch unlabeled items
      │
      ▼
Run inference per item ──► Below threshold? → Skip
      │
      ▼
Create draft result
      │
      ▼
Emit webhook event

Quickstart

import time

import httpx

BASE_URL = "https://api.avala.ai/api/v1"
HEADERS = {"X-Avala-Api-Key": "avk_..."}

# Start a batch auto-label job
resp = httpx.post(
    f"{BASE_URL}/projects/proj_abc123/auto-label/",
    headers=HEADERS,
    json={
        "model_type": "yolo",
        "confidence_threshold": 0.6,
        "dry_run": False,
    },
)
resp.raise_for_status()  # stop early on auth or validation errors
job = resp.json()
print(f"Job started: {job['uid']}")

# Poll every 5 seconds while the job is still pending or running
while job["status"] in ("pending", "running"):
    time.sleep(5)
    resp = httpx.get(f"{BASE_URL}/auto-label-jobs/{job['uid']}/", headers=HEADERS)
    resp.raise_for_status()
    job = resp.json()
    print(f"Progress: {job['progress_pct']}% ({job['processed_items']}/{job['total_items']})")

# Report the final status: the job may also have ended as failed or cancelled
print(f"Job {job['status']}: {job['successful_items']} items labeled, {job['skipped_items']} skipped")

API Reference

Create Job

POST /api/v1/projects/{project_uid}/auto-label/

Parameter	Type	Default	Description
`model_type`	`string`	`"yolo"`	Inference model: `"sam3"` or `"yolo"`.
`confidence_threshold`	`float`	`0.5`	Minimum confidence to accept a prediction (0.0-1.0).
`labels`	`string[]`	`[]`	Filter to specific labels. Empty = all labels.
`dry_run`	`boolean`	`false`	Run inference without creating results.

Response: 202 Accepted

{
  "uid": "a1b2c3d4-...",
  "status": "pending",
  "model_type": "yolo",
  "confidence_threshold": 0.6,
  "total_items": 0,
  "processed_items": 0,
  "successful_items": 0,
  "failed_items": 0,
  "skipped_items": 0,
  "progress_pct": 0.0,
  "created_at": "2026-02-23T10:30:00Z"
}
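
The parameter table above can be mirrored in a small client-side check that catches bad values before the request is sent. This is a minimal sketch, not part of any Avala SDK: `build_auto_label_payload` is a hypothetical helper, and the server may apply additional validation that is not documented here.

```python
# Values taken from the Create Job parameter table.
ALLOWED_MODEL_TYPES = {"sam3", "yolo"}


def build_auto_label_payload(model_type="yolo", confidence_threshold=0.5,
                             labels=None, dry_run=False):
    """Return a request body for the create-job endpoint, or raise ValueError.

    Hypothetical helper: it only checks the constraints stated in the table.
    """
    if model_type not in ALLOWED_MODEL_TYPES:
        raise ValueError(f"model_type must be one of {sorted(ALLOWED_MODEL_TYPES)}")
    if not 0.0 <= confidence_threshold <= 1.0:
        raise ValueError("confidence_threshold must be between 0.0 and 1.0")
    return {
        "model_type": model_type,
        "confidence_threshold": float(confidence_threshold),
        "labels": list(labels or []),  # empty list means all labels
        "dry_run": bool(dry_run),
    }
```

The returned dictionary can be passed as the `json=` argument of the POST request shown in the Quickstart.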

Get Job Status

GET /api/v1/auto-label-jobs/{uid}/

Returns the current state of the job including progress counters.
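
For long-running jobs, a polling loop with a timeout is safer than looping indefinitely. The sketch below assumes that `completed`, `failed`, and `cancelled` are the terminal statuses (inferred from the status filter values under List Jobs). `wait_for_job` and its `fetch_job` argument are hypothetical names: `fetch_job` stands for any zero-argument callable that returns the job dictionary, for example a wrapper around this GET endpoint.

```python
import time

# Assumption: these statuses mean the job will not change any further.
TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def wait_for_job(fetch_job, interval_s=5.0, timeout_s=3600.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_job() until the job is terminal or timeout_s has elapsed."""
    deadline = clock() + timeout_s
    while True:
        job = fetch_job()
        if job["status"] in TERMINAL_STATUSES:
            return job
        if clock() >= deadline:
            raise TimeoutError(f"job still {job['status']!r} after {timeout_s}s")
        sleep(interval_s)
```

Passing `sleep` and `clock` as arguments keeps the helper easy to exercise in tests without real waiting.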

List Jobs

GET /api/v1/auto-label-jobs/?project={project_uid}&status={status}

Parameter	Type	Description
`project`	`string?`	Filter by project UID.
`status`	`string?`	Filter by status: `pending`, `running`, `completed`, `failed`, `cancelled`.
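
Both filters are optional, so the query string needs to be assembled conditionally. A minimal sketch using only the standard library follows. The base URL is taken from the Quickstart, and `list_jobs_url` is a hypothetical helper rather than an SDK function.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.avala.ai/api/v1"  # as used in the Quickstart
JOB_STATUSES = {"pending", "running", "completed", "failed", "cancelled"}


def list_jobs_url(project=None, status=None):
    """Build the List Jobs URL, including only the filters that were given."""
    if status is not None and status not in JOB_STATUSES:
        raise ValueError(f"status must be one of {sorted(JOB_STATUSES)}")
    params = {}
    if project is not None:
        params["project"] = project
    if status is not None:
        params["status"] = status
    query = urlencode(params)
    return f"{BASE_URL}/auto-label-jobs/" + (f"?{query}" if query else "")
```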

Cancel Job

DELETE /api/v1/auto-label-jobs/{uid}/

Cancels a running or pending job. Items already processed are kept.

Models

YOLO (Object Detection)

Best for: Detecting and labeling objects with bounding boxes.

Generates bounding box annotations
Works well for common object categories
Fast inference (~50ms per image)

SAM (Segmentation)

Best for: Precise object boundaries and segmentation masks.

Generates segmentation annotations
Better boundary accuracy than bounding boxes
Slower inference (~200ms per image)
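
The per-image latencies above give a rough lower bound on total inference time. The sketch below assumes items are processed one after another and ignores queueing, data transfer, and result creation, none of which are specified here. It also assumes that the `"sam3"` model type corresponds to the SAM figures in this section.

```python
# Approximate per-image latency in milliseconds, from the figures above.
APPROX_LATENCY_MS = {"yolo": 50, "sam3": 200}


def estimate_inference_seconds(model_type, n_items):
    """Rough sequential inference time in seconds (excludes all overhead)."""
    return APPROX_LATENCY_MS[model_type] * n_items / 1000


# A job at the 5,000-item maximum:
print(estimate_inference_seconds("yolo", 5000))  # 250.0 seconds, about 4 minutes
print(estimate_inference_seconds("sam3", 5000))  # 1000.0 seconds, about 17 minutes
```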

Configuration Guide

Confidence Threshold

The confidence_threshold parameter sets the minimum confidence a prediction needs in order to be accepted. Higher values yield fewer but more reliable predictions:

Threshold	Effect
`0.3`	More predictions, more noise. Good for initial exploration.
`0.5`	Balanced. Recommended starting point.
`0.7`	Fewer predictions, higher quality. Good for production.
`0.9`	Only very confident predictions. Minimal false positives.
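
To illustrate the trade-off in the table, the sketch below filters a made-up list of predictions at each threshold. The prediction dictionaries and the `accept_predictions` helper are hypothetical, and the inclusive comparison (`>=`) is an assumption based on the threshold being described as a minimum.

```python
def accept_predictions(predictions, confidence_threshold=0.5):
    """Keep predictions whose confidence is at or above the threshold."""
    return [p for p in predictions if p["confidence"] >= confidence_threshold]


# Illustrative data only, not real model output.
predictions = [
    {"label": "car", "confidence": 0.95},
    {"label": "person", "confidence": 0.72},
    {"label": "bicycle", "confidence": 0.55},
    {"label": "dog", "confidence": 0.31},
]

for threshold in (0.3, 0.5, 0.7, 0.9):
    kept = accept_predictions(predictions, threshold)
    print(threshold, len(kept))  # 0.3 -> 4, 0.5 -> 3, 0.7 -> 2, 0.9 -> 1
```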

Dry Run

Use dry_run: true to preview what the model would label without creating any results:

curl -X POST .../projects/proj_abc123/auto-label/ \
  -H "X-Avala-Api-Key: avk_..." \
  -H "Content-Type: application/json" \
  -d '{"model_type": "yolo", "confidence_threshold": 0.5, "dry_run": true}'

Check the successful_items count to see how many items would be labeled.
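
Once a dry-run job has finished, its counters can be condensed into a short summary to help decide whether the chosen threshold is worth a real run. This sketch assumes a dry run fills in the same counter fields as the job response shown under Create Job. `dry_run_summary` is a hypothetical helper.

```python
def dry_run_summary(job):
    """Summarize a finished dry-run job from its progress counters."""
    total = job["total_items"]
    would_label = job["successful_items"]
    return {
        "would_label": would_label,
        "skipped": job["skipped_items"],
        "failed": job["failed_items"],
        # Share of items that would receive a draft annotation.
        "label_rate": would_label / total if total else 0.0,
    }
```

Comparing `label_rate` across a few dry runs at different thresholds gives a quick picture of how much coverage each setting trades away.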

Webhook Events

Auto-label jobs emit a webhook event when they finish, whether they succeed or fail:

Event	Trigger
`auto_label.completed`	Job finished successfully
`auto_label.failed`	Job failed with an error

{
  "event": "auto_label.completed",
  "payload": {
    "job_uid": "a1b2c3d4-...",
    "project_uid": "proj_abc123",
    "status": "completed",
    "total_items": 500,
    "successful": 423,
    "failed": 2,
    "skipped": 75,
    "dry_run": false
  }
}
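
A receiver can branch on the event name and read the counters from the payload. The sketch below parses a raw request body shaped like the example above. It assumes the `auto_label.failed` payload also carries `job_uid`, which is not shown here, and `summarize_auto_label_event` is a hypothetical helper. Signature verification and HTTP handling are out of scope.

```python
import json


def summarize_auto_label_event(raw_body):
    """Return a one-line summary for auto-label events, or None for others."""
    event = json.loads(raw_body)
    name = event.get("event")
    payload = event.get("payload", {})
    if name == "auto_label.completed":
        return (
            f"Job {payload['job_uid']} completed: "
            f"{payload['successful']}/{payload['total_items']} labeled, "
            f"{payload['skipped']} skipped, {payload['failed']} failed"
        )
    if name == "auto_label.failed":
        # Assumption: the failed payload includes job_uid like the completed one.
        return f"Job {payload.get('job_uid', 'unknown')} failed"
    return None
```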

Limitations

Only one auto-label job can run per project at a time
Maximum 5,000 items per job
Requires a running SageMaker inference endpoint
Draft results are attributed to the Avala bot user
Only items without existing results are processed
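
Because only one job can run per project at a time, it can help to check the project's existing jobs before starting a new one. The sketch below works on a list of job dictionaries such as those returned by List Jobs. Treating `pending` jobs as active is an assumption, and `has_active_job` is a hypothetical helper.

```python
# Assumption: a pending job also blocks a new job for the same project.
ACTIVE_STATUSES = {"pending", "running"}


def has_active_job(jobs):
    """Return True if any job in the list is still pending or running."""
    return any(job["status"] in ACTIVE_STATUSES for job in jobs)
```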

Next Steps

Supported Models for details on SAM and YOLO
Agent Framework for automated QA after auto-labeling
Webhooks to trigger pipelines on job completion
Inference for single-item interactive inference
