A well-designed label taxonomy is the foundation of any annotation project. This page covers how to structure your object classes, configure attributes, and build hierarchies that produce consistent, high-quality training data.

What Is an Ontology?

In the context of data annotation, an ontology (or label taxonomy) is the complete schema of classes, attributes, and relationships that annotators use to label data. It defines:
  • What objects to label (object classes)
  • How to describe them (attributes and properties)
  • How classes relate to each other (hierarchy and grouping)
A clear ontology reduces annotator confusion, improves inter-annotator agreement, and produces cleaner training data for your models.

Object Classes

Object classes are the core building blocks of your taxonomy. Each class represents a category of object that annotators will identify and label in the data.

Defining Classes

When creating a project in Avala, you define your label config as a list of classes:
{
  "labels": [
    { "name": "car", "color": "#FF0000" },
    { "name": "pedestrian", "color": "#00FF00" },
    { "name": "cyclist", "color": "#0000FF" },
    { "name": "truck", "color": "#FFA500" },
    { "name": "bus", "color": "#800080" }
  ]
}
Each class has a unique name and a display color used in the annotation editor. Choose colors that are visually distinct from each other and from common background colors in your data.
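The two requirements above (unique names, visually distinct colors) can be checked before a project is created. The following is a minimal sketch, not part of Avala's tooling: `check_label_config` is a hypothetical helper, Euclidean distance in RGB space is only a rough proxy for how distinct two colors look, and the `min_distance` threshold of 100 is an arbitrary assumption.

```python
import math
import re
from itertools import combinations

HEX_COLOR = re.compile(r"^#[0-9A-Fa-f]{6}$")

def hex_to_rgb(color):
    """Convert '#RRGGBB' to an (r, g, b) tuple of ints."""
    return tuple(int(color[i:i + 2], 16) for i in (1, 3, 5))

def check_label_config(config, min_distance=100.0):
    """Return a list of problems found in a label config (hypothetical helper)."""
    problems = []
    labels = config["labels"]
    names = [label["name"] for label in labels]
    # Class names must be unique.
    for name in sorted(set(names)):
        if names.count(name) > 1:
            problems.append(f"duplicate class name: {name}")
    # Colors must be well-formed hex strings.
    for label in labels:
        if not HEX_COLOR.match(label["color"]):
            problems.append(f"invalid color for {label['name']}: {label['color']}")
    # Flag pairs of colors that sit close together in RGB space (assumed threshold).
    valid = [label for label in labels if HEX_COLOR.match(label["color"])]
    for a, b in combinations(valid, 2):
        if math.dist(hex_to_rgb(a["color"]), hex_to_rgb(b["color"])) < min_distance:
            problems.append(f"colors too similar: {a['name']} and {b['name']}")
    return problems

config = {
    "labels": [
        {"name": "car", "color": "#FF0000"},
        {"name": "pedestrian", "color": "#00FF00"},
        {"name": "cyclist", "color": "#0000FF"},
        {"name": "truck", "color": "#FFA500"},
        {"name": "bus", "color": "#800080"},
    ]
}
print(check_label_config(config))
```

For the five-class config shown earlier, the list of problems is empty. A perceptual color space would give a better similarity measure if color distinctness matters a lot for your data.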

Class Naming Best Practices

| Practice | Example | Why |
| --- | --- | --- |
| Use lowercase, specific names | sedan, pickup_truck | Reduces ambiguity |
| Avoid overlapping definitions | Do not have both car and vehicle at the same level | Prevents annotator confusion |
| Be consistent with separators | traffic_light not traffic-light or trafficLight | Consistent parsing in training pipelines |
| Include negative/background classes only if needed | unknown, ignore_region | Some models require explicit background labels |
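The lowercase and separator conventions can be enforced mechanically. This sketch assumes you settle on lowercase names with underscores as the only separator. The pattern and the helper name `non_conforming_names` are illustrative, not an Avala requirement.

```python
import re

# Lowercase letters and digits, with single underscores between words (assumed convention).
SNAKE_CASE = re.compile(r"[a-z][a-z0-9]*(_[a-z0-9]+)*")

def non_conforming_names(names):
    """Return the class names that do not follow the snake_case convention."""
    return [name for name in names if not SNAKE_CASE.fullmatch(name)]

names = ["traffic_light", "traffic-light", "trafficLight", "pickup_truck", "Sedan"]
print(non_conforming_names(names))
```

Running a check like this when the taxonomy changes catches mixed separators before they reach a training pipeline.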

Attributes

Attributes add structured metadata to each annotation beyond the object class. They let annotators describe properties like visibility, pose, or condition.

Attribute Types

Avala supports several attribute types that you can attach to any object class:
| Type | Description | When to Use | Example |
| --- | --- | --- | --- |
| Dropdown | Single selection from a predefined list | Mutually exclusive options | Occlusion: none, partial, heavy |
| Checkbox | Boolean toggle | Simple yes/no flags | is_parked: true/false |
| Text | Free-form string input | Unique identifiers or descriptions | License plate number |
| Number | Numeric value | Measurements or counts | Estimated distance in meters |
| Multi-select | Multiple selections from a list | Concurrent, non-exclusive states | Visible: headlights, taillights, turn_signal |

Configuring Attributes

Attributes are defined in the project’s classification config alongside the label config:
{
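  "labels": [
    { "name": "car", "color": "#FF0000" },
    { "name": "pedestrian", "color": "#00FF00" },
    { "name": "cyclist", "color": "#0000FF" },
    { "name": "truck", "color": "#FFA500" },
    { "name": "bus", "color": "#800080" }
  ],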
  "classification": {
    "attributes": [
      {
        "name": "occlusion",
        "type": "dropdown",
        "options": ["none", "partial", "heavy"],
        "required": true,
        "applies_to": ["car", "pedestrian", "cyclist"]
      },
      {
        "name": "is_parked",
        "type": "checkbox",
        "required": false,
        "applies_to": ["car", "truck", "bus"]
      },
      {
        "name": "truncation",
        "type": "dropdown",
        "options": ["none", "partial", "heavy"],
        "required": true,
        "applies_to": ["car", "pedestrian", "cyclist"]
      }
    ]
  }
}

Conditional Attributes

Use the applies_to field to show an attribute only for the classes it is relevant to. This keeps the annotation interface uncluttered: a pedestrian does not need an is_parked attribute, and a traffic_light does not need truncation.
Mark an attribute as required when your model training pipeline depends on it. Leave an attribute optional when it holds supplementary metadata that is useful but not critical.
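To see what an annotator would be asked for on a given class, you can resolve the applies_to and required fields yourself. The sketch below mirrors the attribute config shown earlier as a Python dict. `attributes_for` is a hypothetical helper, and it assumes an attribute applies only to the classes it lists explicitly.

```python
def attributes_for(config, class_name):
    """Return (required, optional) attribute names that list class_name in applies_to."""
    required, optional = [], []
    for attr in config["classification"]["attributes"]:
        if class_name in attr.get("applies_to", []):
            target = required if attr.get("required", False) else optional
            target.append(attr["name"])
    return required, optional

config = {
    "classification": {
        "attributes": [
            {"name": "occlusion", "type": "dropdown",
             "options": ["none", "partial", "heavy"], "required": True,
             "applies_to": ["car", "pedestrian", "cyclist"]},
            {"name": "is_parked", "type": "checkbox", "required": False,
             "applies_to": ["car", "truck", "bus"]},
            {"name": "truncation", "type": "dropdown",
             "options": ["none", "partial", "heavy"], "required": True,
             "applies_to": ["car", "pedestrian", "cyclist"]},
        ]
    }
}

for cls in ("car", "pedestrian", "bus"):
    print(cls, attributes_for(config, cls))
```

With this config, a car gets two required attributes and one optional one, a pedestrian gets only the two required ones, and a bus gets only the optional is_parked flag.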

Hierarchical Taxonomies

For complex domains, flat class lists become unwieldy. Hierarchical taxonomies group related classes under parent categories.

Example: Vehicle Taxonomy

Vehicle
├── Car
│   ├── Sedan
│   ├── SUV
│   └── Hatchback
├── Truck
│   ├── Pickup
│   └── Semi
├── Bus
│   ├── City Bus
│   └── School Bus
└── Motorcycle

When to Use Hierarchies

| Scenario | Recommendation |
| --- | --- |
| Fewer than 15 classes | Flat list is simpler and faster |
| 15-50 classes | Group into 3-5 top-level categories |
| 50+ classes | Use multi-level hierarchy with search |
| Classes share attributes | Group under parent so attributes inherit |

Designing Hierarchies

  1. Start broad, then refine. Begin with top-level categories (vehicle, pedestrian, infrastructure) and add specificity only where your model needs it.
  2. Every leaf class should be unambiguous. If annotators cannot reliably distinguish between two subclasses, merge them.
  3. Balance depth and breadth. Deep hierarchies (4+ levels) slow annotators down. Prefer wider trees with 2-3 levels.
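The depth guideline above is easy to check once the hierarchy is written down as data. This sketch represents the vehicle taxonomy from the earlier example as a nested dict. That representation is an assumption made for illustration and is not Avala's config format.

```python
# Each key is a class; its value holds the subclasses (empty dict = leaf).
TAXONOMY = {
    "Vehicle": {
        "Car": {"Sedan": {}, "SUV": {}, "Hatchback": {}},
        "Truck": {"Pickup": {}, "Semi": {}},
        "Bus": {"City Bus": {}, "School Bus": {}},
        "Motorcycle": {},
    }
}

def depth(tree):
    """Number of levels in the tree (an empty tree has depth 0)."""
    if not tree:
        return 0
    return 1 + max(depth(children) for children in tree.values())

def leaf_paths(tree, prefix=()):
    """Yield the full path from the root to every leaf class."""
    for name, children in tree.items():
        path = prefix + (name,)
        if children:
            yield from leaf_paths(children, path)
        else:
            yield path

print("levels:", depth(TAXONOMY))
for path in leaf_paths(TAXONOMY):
    print(" > ".join(path))
```

The example tree has three levels and eight leaf classes, which stays within the 2-3 level guideline. A result of four or more would be a prompt to flatten the tree.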

Single-Label vs Multi-Label Classification

Avala supports both classification modes depending on your project needs.

Single-Label

Each object or scene receives exactly one class label. This is the default for most annotation types.
  • Object detection: Each bounding box gets one class
  • Scene classification: Each image gets one category

Multi-Label

An object or scene can receive multiple labels simultaneously. Use this when categories are not mutually exclusive.
  • An image can be both rainy and nighttime
  • A vehicle can be both damaged and parked
Configure multi-label classification in your project’s classification config by setting the task-level classification type:
{
  "classification": {
    "type": "multi-label",
    "categories": [
      { "name": "weather", "options": ["clear", "rainy", "foggy", "snowy"] },
      { "name": "time_of_day", "options": ["daytime", "nighttime", "dawn", "dusk"] },
      { "name": "road_condition", "options": ["dry", "wet", "icy"] }
    ]
  }
}
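Labels produced under a config like this can be validated against the declared categories. The sketch below assumes a hypothetical record shape in which each scene maps a category name to one selected option. The format of Avala's actual annotation output is not described here, so treat both the record shape and the helper `invalid_scene_labels` as illustrative.

```python
def invalid_scene_labels(config, scene_labels):
    """Return problems for labels that reference unknown categories or options."""
    allowed = {
        category["name"]: set(category["options"])
        for category in config["classification"]["categories"]
    }
    problems = []
    for category, option in scene_labels.items():
        if category not in allowed:
            problems.append(f"unknown category: {category}")
        elif option not in allowed[category]:
            problems.append(f"unknown option for {category}: {option}")
    return problems

config = {
    "classification": {
        "type": "multi-label",
        "categories": [
            {"name": "weather", "options": ["clear", "rainy", "foggy", "snowy"]},
            {"name": "time_of_day", "options": ["daytime", "nighttime", "dawn", "dusk"]},
            {"name": "road_condition", "options": ["dry", "wet", "icy"]},
        ],
    }
}

# A scene that is both rainy and nighttime (hypothetical record shape).
print(invalid_scene_labels(config, {"weather": "rainy", "time_of_day": "nighttime"}))
```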

Ontology Design Checklist

Before starting your annotation project, verify your ontology against this checklist:
| Check | Question |
| --- | --- |
| Completeness | Does every object your model needs to detect have a class? |
| Mutual exclusivity | Can an annotator always assign exactly one class without ambiguity? |
| Attribute coverage | Are all properties needed for training captured as attributes? |
| Consistent granularity | Are classes at the same level equally specific? |
| Annotator clarity | Can a new annotator understand each class from its name alone? |
| Model alignment | Does the taxonomy match what your model architecture expects? |
| Scalability | Can you add new classes later without restructuring? |

Common Pitfalls

Over-Specifying Classes

Creating too many fine-grained classes leads to low inter-annotator agreement and sparse training data per class.
Problem: 50 vehicle subclasses where most have fewer than 100 examples each.
Solution: Start with 5-10 broad classes. Add subclasses only when you have enough data and your model benefits from the distinction.
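Sparse classes are straightforward to spot from the labels you have already exported. The sketch below assumes the class labels are available as a flat list of names. The cutoff of 100 echoes the example above and should be adjusted to what your model needs.

```python
from collections import Counter

def sparse_classes(labels, min_examples=100):
    """Return {class_name: count} for classes with fewer than min_examples labels."""
    counts = Counter(labels)
    return {name: n for name, n in counts.items() if n < min_examples}

# Toy data: one well-covered class and one sparse subclass.
labels = ["car"] * 150 + ["limousine"] * 3
print(sparse_classes(labels))
```

Classes that show up in the result are candidates for merging into a broader parent class.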

Ambiguous Boundaries

When two classes overlap conceptually, annotators will disagree on which to use.
Problem: Both van and minivan exist, but annotators cannot reliably distinguish them.
Solution: Either merge them into a single class or provide explicit visual guidelines with reference images showing the boundary.
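One way to find ambiguous boundaries is to have two annotators label the same objects and count which class pairs they disagree on. This is a minimal sketch under the assumption that both label lists are aligned, so that position i refers to the same object in each. `confused_pairs` is a hypothetical helper.

```python
from collections import Counter

def confused_pairs(labels_a, labels_b):
    """Count unordered class pairs on which two annotators disagreed, most frequent first."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both annotators must label the same objects")
    pairs = Counter()
    for a, b in zip(labels_a, labels_b):
        if a != b:
            # Sort so (van, minivan) and (minivan, van) count as the same pair.
            pairs[tuple(sorted((a, b)))] += 1
    return pairs.most_common()

annotator_a = ["van", "minivan", "car", "van"]
annotator_b = ["minivan", "van", "car", "van"]
print(confused_pairs(annotator_a, annotator_b))
```

Pairs near the top of the list are the ones to merge or to document with reference images.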

Missing Edge Cases

Real-world data contains objects that do not fit neatly into your taxonomy.
Problem: An annotator encounters a golf cart but the taxonomy only has car, truck, and motorcycle.
Solution: Include a catch-all class like other_vehicle and review items labeled with it periodically to identify classes you need to add.
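For the periodic review, tracking what share of labels falls into the catch-all class gives a simple signal. The sketch below assumes a flat list of class names. The helper name is hypothetical, and what counts as too high a share is left to your judgment.

```python
def catch_all_share(labels, catch_all="other_vehicle"):
    """Return the fraction of labels assigned to the catch-all class."""
    if not labels:
        return 0.0
    return sum(1 for label in labels if label == catch_all) / len(labels)

labels = ["car", "other_vehicle", "truck", "other_vehicle"]
print(f"{catch_all_share(labels):.0%} of labels use the catch-all class")
```

A share that keeps rising between reviews suggests the taxonomy is missing a class that appears regularly in the data.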

Next Steps