Run SAM or YOLO inference across all unlabeled items in a project to generate draft annotations. Review and refine the results instead of labeling from scratch.

Overview

Batch auto-labeling takes a project, runs a model against every unlabeled item, and creates draft annotations that annotators can accept, edit, or discard. This cuts manual labeling time on projects where a pre-trained model gives a reasonable starting point.

Job flow: Start batch job → Fetch unlabeled items → Run inference per item (below threshold: skip) → Create draft result → Emit webhook event
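
The flow above can be sketched in plain Python. This is an illustration of the described steps, not Avala's implementation: `auto_label`, `run_inference`, the prediction dictionaries, and the rule that an item is skipped when none of its predictions reach the threshold are all assumptions made for the example.

```python
from typing import Callable

Prediction = dict  # hypothetical shape, e.g. {"label": "car", "confidence": 0.82}


def auto_label(
    items: list,
    run_inference: Callable[[str], list],
    confidence_threshold: float = 0.5,
    dry_run: bool = False,
) -> dict:
    """Simulate the batch flow: infer, filter by confidence, then draft or skip."""
    drafts = {}
    counters = {"processed_items": 0, "successful_items": 0, "skipped_items": 0}

    for item in items:
        predictions = run_inference(item)
        kept = [p for p in predictions if p["confidence"] >= confidence_threshold]
        counters["processed_items"] += 1
        if not kept:
            # Assumption: no prediction at or above the threshold means the item is skipped.
            counters["skipped_items"] += 1
            continue
        counters["successful_items"] += 1
        if not dry_run:
            drafts[item] = kept  # stands in for "create draft result"

    return {"drafts": drafts, **counters}


# Hypothetical model output keyed by item name.
fake_model = {
    "img_1": [{"label": "car", "confidence": 0.91}],
    "img_2": [{"label": "dog", "confidence": 0.35}],
}
result = auto_label(["img_1", "img_2"], fake_model.__getitem__, confidence_threshold=0.6)
print(result["successful_items"], result["skipped_items"])
```

With `dry_run=True` the counters are still updated but no drafts are stored, which mirrors the dry-run behavior described later on this page.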

Quickstart

import time

import httpx

API_BASE = "https://api.avala.ai/api/v1"
HEADERS = {"X-Avala-Api-Key": "avk_..."}

# Start a batch auto-label job
resp = httpx.post(
    f"{API_BASE}/projects/proj_abc123/auto-label/",
    headers=HEADERS,
    json={
        "model_type": "yolo",
        "confidence_threshold": 0.6,
        "dry_run": False,
    },
)
resp.raise_for_status()  # fail fast on auth or validation errors
job = resp.json()
print(f"Job started: {job['uid']}")

# Poll every 5 seconds until the job leaves the pending/running states
while job["status"] in ("pending", "running"):
    time.sleep(5)
    resp = httpx.get(f"{API_BASE}/auto-label-jobs/{job['uid']}/", headers=HEADERS)
    resp.raise_for_status()
    job = resp.json()
    print(f"Progress: {job['progress_pct']}% ({job['processed_items']}/{job['total_items']})")

# The loop also exits on "failed" or "cancelled", so report the final status
print(f"Job {job['status']}: {job['successful_items']} items labeled, {job['skipped_items']} skipped")

API Reference

Create Job

POST /api/v1/projects/{project_uid}/auto-label/
| Parameter | Type | Default | Description |
|---|---|---|---|
| model_type | string | "yolo" | Inference model: "sam3" or "yolo". |
| confidence_threshold | float | 0.5 | Minimum confidence to accept a prediction (0.0-1.0). |
| labels | string[] | [] | Filter to specific labels. Empty = all labels. |
| dry_run | boolean | false | Run inference without creating results. |
Response: 202 Accepted
{
  "uid": "a1b2c3d4-...",
  "status": "pending",
  "model_type": "yolo",
  "confidence_threshold": 0.6,
  "total_items": 0,
  "processed_items": 0,
  "successful_items": 0,
  "failed_items": 0,
  "skipped_items": 0,
  "progress_pct": 0.0,
  "created_at": "2026-02-23T10:30:00Z"
}
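
Because the job is accepted asynchronously, it can help to catch bad parameters before sending the request. The sketch below builds a request body from the documented parameters and defaults; `build_auto_label_request` is a hypothetical helper, and the client-side checks are an assumption about what the server would reject.

```python
import json
from typing import Optional

VALID_MODEL_TYPES = {"sam3", "yolo"}  # values listed in the parameter table


def build_auto_label_request(
    model_type: str = "yolo",
    confidence_threshold: float = 0.5,
    labels: Optional[list] = None,
    dry_run: bool = False,
) -> dict:
    """Return a JSON-serializable body for the Create Job endpoint."""
    if model_type not in VALID_MODEL_TYPES:
        raise ValueError(f"model_type must be one of {sorted(VALID_MODEL_TYPES)}")
    if not 0.0 <= confidence_threshold <= 1.0:
        raise ValueError("confidence_threshold must be between 0.0 and 1.0")
    return {
        "model_type": model_type,
        "confidence_threshold": confidence_threshold,
        "labels": list(labels or []),  # empty list means all labels
        "dry_run": dry_run,
    }


print(json.dumps(build_auto_label_request("sam3", 0.7, ["car"])))
```

The returned dictionary can be passed as the `json=` argument of an HTTP client call such as the one in the Quickstart.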

Get Job Status

GET /api/v1/auto-label-jobs/{uid}/
Returns the current state of the job including progress counters.
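
A polling loop around this endpoint benefits from a timeout so a stuck job does not block the caller forever. The following is a minimal sketch: `wait_for_job` and its parameters are hypothetical, and `fetch_job` is any callable that returns the job dictionary (for example, a function wrapping the GET request above).

```python
import time
from typing import Callable

ACTIVE_STATUSES = {"pending", "running"}


def wait_for_job(
    fetch_job: Callable[[], dict],
    poll_interval: float = 5.0,
    timeout: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the job leaves the pending/running states or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        job = fetch_job()
        if job["status"] not in ACTIVE_STATUSES:
            return job  # completed, failed, or cancelled
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still {job['status']} after {timeout} seconds")
        sleep(poll_interval)


# Simulated sequence of status responses, in place of real HTTP calls.
responses = iter([{"status": "pending"}, {"status": "running"}, {"status": "completed"}])
final = wait_for_job(lambda: next(responses), sleep=lambda seconds: None)
print(final["status"])
```

Injecting `fetch_job` and `sleep` keeps the loop independent of any HTTP client and makes it easy to exercise without network access.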

List Jobs

GET /api/v1/auto-label-jobs/?project={project_uid}&status={status}
| Parameter | Type | Description |
|---|---|---|
| project | string? | Filter by project UID. |
| status | string? | Filter by status: pending, running, completed, failed, cancelled. |
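
Both filters are optional, so the query string should include only the ones that are set. A small sketch using the standard library; `list_jobs_url` is a hypothetical helper, and the base URL is taken from the Quickstart.

```python
from typing import Optional
from urllib.parse import urlencode

API_BASE = "https://api.avala.ai/api/v1"
JOB_STATUSES = {"pending", "running", "completed", "failed", "cancelled"}


def list_jobs_url(project: Optional[str] = None, status: Optional[str] = None) -> str:
    """Build the List Jobs URL, omitting filters that were not provided."""
    if status is not None and status not in JOB_STATUSES:
        raise ValueError(f"status must be one of {sorted(JOB_STATUSES)}")
    params = {key: value for key, value in (("project", project), ("status", status)) if value is not None}
    base = f"{API_BASE}/auto-label-jobs/"
    return f"{base}?{urlencode(params)}" if params else base


print(list_jobs_url(project="proj_abc123", status="running"))
```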

Cancel Job

DELETE /api/v1/auto-label-jobs/{uid}/
Cancels a running or pending job. Items already processed are kept.

Models

YOLO (Object Detection)

Best for: Detecting and labeling objects with bounding boxes.
  • Generates bounding box annotations
  • Works well for common object categories
  • Fast inference (~50ms per image)

SAM (Segmentation)

Best for: Precise object boundaries and segmentation masks.
  • Generates segmentation annotations
  • Better boundary accuracy than bounding boxes
  • Slower inference (~200ms per image)
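
The per-image latencies above give a rough lower bound on inference time for a job. The arithmetic below assumes items are processed one after another with no queueing, network, or startup overhead, which the source does not state; real jobs may be faster (if parallelized) or slower.

```python
# Approximate per-image latencies quoted above, in milliseconds.
LATENCY_MS = {"yolo": 50, "sam3": 200}


def estimate_inference_seconds(model_type: str, item_count: int) -> float:
    """Estimate pure inference time, assuming sequential processing."""
    return LATENCY_MS[model_type] * item_count / 1000


for model in ("yolo", "sam3"):
    print(model, estimate_inference_seconds(model, 5000))
# yolo 250.0
# sam3 1000.0
```

At the 5,000-item job limit, that is roughly 4 minutes for YOLO versus about 17 minutes for SAM under these assumptions.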

Configuration Guide

Confidence Threshold

The confidence_threshold controls how many predictions are accepted:
| Threshold | Effect |
|---|---|
| 0.3 | More predictions, more noise. Good for initial exploration. |
| 0.5 | Balanced. Recommended starting point. |
| 0.7 | Fewer predictions, higher quality. Good for production. |
| 0.9 | Only very confident predictions. Minimal false positives. |
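
To make the trade-off concrete, the sketch below applies each threshold from the table to a made-up list of confidence scores. The scores are hypothetical, and the inclusive comparison (`>=`) is an assumption based on the parameter being described as a minimum.

```python
# Hypothetical confidence scores for eight predictions.
scores = [0.95, 0.88, 0.74, 0.61, 0.52, 0.47, 0.33, 0.12]


def accepted(scores: list, threshold: float) -> list:
    """Keep predictions whose confidence meets the threshold."""
    return [score for score in scores if score >= threshold]


for threshold in (0.3, 0.5, 0.7, 0.9):
    print(threshold, len(accepted(scores, threshold)))
# 0.3 7
# 0.5 5
# 0.7 3
# 0.9 1
```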

Dry Run

Use dry_run: true to preview what the model would label without creating any results:
curl -X POST .../projects/proj_abc123/auto-label/ \
  -H "X-Avala-Api-Key: avk_..." \
  -H "Content-Type: application/json" \
  -d '{"model_type": "yolo", "confidence_threshold": 0.5, "dry_run": true}'
Check the successful_items count to see how many items would be labeled.
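
Once a dry-run job finishes, its counters can be turned into a quick summary. A minimal sketch, assuming the dry-run job exposes the same `total_items` and `successful_items` fields as the job response shown earlier; `dry_run_summary` is a hypothetical helper and the sample numbers are illustrative.

```python
def dry_run_summary(job: dict) -> str:
    """Summarize how many items a dry run would have labeled."""
    total = job["total_items"]
    would_label = job["successful_items"]
    share = would_label / total if total else 0.0
    return f"{would_label}/{total} items ({share:.0%}) would be labeled"


# Illustrative counters, not real results.
print(dry_run_summary({"total_items": 500, "successful_items": 423}))
# 423/500 items (85%) would be labeled
```

Comparing summaries from dry runs at different thresholds is one way to pick a `confidence_threshold` before creating any drafts.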

Webhook Events

Auto-label jobs emit a webhook event when they finish, whether they succeed or fail:
| Event | Trigger |
|---|---|
| auto_label.completed | Job finished successfully |
| auto_label.failed | Job failed with an error |
{
  "event": "auto_label.completed",
  "payload": {
    "job_uid": "a1b2c3d4-...",
    "project_uid": "proj_abc123",
    "status": "completed",
    "total_items": 500,
    "successful": 423,
    "failed": 2,
    "skipped": 75,
    "dry_run": false
  }
}
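
Note that the payload uses `successful`, `failed`, and `skipped`, while the job status response uses `successful_items`, `failed_items`, and `skipped_items`. The handler below is a sketch of how a receiver might branch on the event name. `handle_auto_label_event` is hypothetical, the shape of the `auto_label.failed` payload is assumed to match the completed one, and signature verification is omitted because the source does not describe it.

```python
import json


def handle_auto_label_event(raw_body: bytes) -> str:
    """Parse a webhook body and return a one-line summary of the event."""
    event = json.loads(raw_body)
    name = event["event"]
    payload = event.get("payload", {})
    job_uid = payload.get("job_uid", "unknown")

    if name == "auto_label.completed":
        return (
            f"Job {job_uid} completed: {payload['successful']} labeled, "
            f"{payload['skipped']} skipped, {payload['failed']} failed"
        )
    if name == "auto_label.failed":
        # Assumption: the failed payload also carries job_uid.
        return f"Job {job_uid} failed"
    return f"Ignoring unrelated event: {name}"


body = json.dumps({
    "event": "auto_label.completed",
    "payload": {
        "job_uid": "a1b2c3d4-...",
        "project_uid": "proj_abc123",
        "status": "completed",
        "total_items": 500,
        "successful": 423,
        "failed": 2,
        "skipped": 75,
        "dry_run": False,
    },
}).encode()
print(handle_auto_label_event(body))
```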

Limitations

  • Only one auto-label job can run per project at a time
  • Maximum 5,000 items per job
  • Requires a running SageMaker inference endpoint
  • Draft results are attributed to the Avala bot user
  • Only items without existing results are processed
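
Two of these limits can be checked before starting a job. The sketch below is a hypothetical pre-flight check: `preflight` is not part of the API, treating a pending job as blocking is an assumption, and the source does not say whether a project with more than 5,000 unlabeled items is rejected or only partially processed.

```python
MAX_ITEMS_PER_JOB = 5000
ACTIVE_STATUSES = {"pending", "running"}


def preflight(existing_jobs: list, unlabeled_count: int) -> list:
    """Return a list of human-readable problems; empty means safe to start."""
    problems = []
    if any(job["status"] in ACTIVE_STATUSES for job in existing_jobs):
        # Assumption: a pending job blocks a new one just like a running job.
        problems.append("another auto-label job is already active for this project")
    if unlabeled_count > MAX_ITEMS_PER_JOB:
        problems.append(
            f"{unlabeled_count} unlabeled items exceeds the {MAX_ITEMS_PER_JOB}-item job limit"
        )
    return problems


print(preflight([{"status": "completed"}], 1200))
print(preflight([{"status": "running"}], 6000))
```

The `existing_jobs` list could come from the List Jobs endpoint filtered by project.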
