A well-designed label taxonomy is the foundation of any annotation project. This page covers how to structure your object classes, configure attributes, and build hierarchies that produce consistent, high-quality training data.
What Is an Ontology?
In the context of data annotation, an ontology (or label taxonomy) is the complete schema of classes, attributes, and relationships that annotators use to label data. It defines:
- What objects to label (object classes)
- How to describe them (attributes and properties)
- How classes relate to each other (hierarchy and grouping)
A clear ontology reduces annotator confusion, improves inter-annotator agreement, and produces cleaner training data for your models.
Object Classes
Object classes are the core building blocks of your taxonomy. Each class represents a category of object that annotators will identify and label in the data.
Defining Classes
When creating a project in Avala, you define your label config as a list of classes:
{
  "labels": [
    { "name": "car", "color": "#FF0000" },
    { "name": "pedestrian", "color": "#00FF00" },
    { "name": "cyclist", "color": "#0000FF" },
    { "name": "truck", "color": "#FFA500" },
    { "name": "bus", "color": "#800080" }
  ]
}
Each class has a unique name and a display color used in the annotation editor. Choose colors that are visually distinct from each other and from common background colors in your data.
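
The uniqueness and color-distinctness rules above can be checked before a config is uploaded. The following is a minimal sketch, not part of Avala's tooling: `check_label_config` and its `min_distance` threshold are hypothetical, and Euclidean distance in RGB space is only a rough stand-in for perceived color difference.

```python
import re

HEX_COLOR = re.compile(r"^#[0-9A-Fa-f]{6}$")

def hex_to_rgb(color):
    """Convert '#RRGGBB' to an (r, g, b) tuple of ints."""
    return tuple(int(color[i:i + 2], 16) for i in (1, 3, 5))

def check_label_config(config, min_distance=100.0):
    """Return a list of human-readable problems found in a label config.

    min_distance is an illustrative threshold, not an Avala setting.
    """
    problems = []
    labels = config.get("labels", [])
    seen = set()
    for label in labels:
        name, color = label.get("name"), label.get("color", "")
        if name in seen:
            problems.append(f"duplicate class name: {name}")
        seen.add(name)
        if not HEX_COLOR.match(color):
            problems.append(f"{name}: invalid color {color!r}")
    # Compare every pair of valid colors by Euclidean distance in RGB space.
    valid = [(l.get("name"), hex_to_rgb(l["color"])) for l in labels
             if HEX_COLOR.match(l.get("color", ""))]
    for i, (name_a, rgb_a) in enumerate(valid):
        for name_b, rgb_b in valid[i + 1:]:
            distance = sum((a - b) ** 2 for a, b in zip(rgb_a, rgb_b)) ** 0.5
            if distance < min_distance:
                problems.append(f"{name_a} and {name_b}: colors are too similar")
    return problems

config = {"labels": [
    {"name": "car", "color": "#FF0000"},
    {"name": "pedestrian", "color": "#00FF00"},
    {"name": "cyclist", "color": "#0000FF"},
    {"name": "truck", "color": "#FFA500"},
    {"name": "bus", "color": "#800080"},
]}
print(check_label_config(config))  # the five example classes pass: []
```

A check like this does not cover contrast against typical backgrounds in your data, which still needs a visual review in the editor.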
Class Naming Best Practices
| Practice | Example | Why |
|---|---|---|
| Use lowercase, specific names | sedan, pickup_truck | Reduces ambiguity |
| Avoid overlapping definitions | Do not have both car and vehicle at the same level | Prevents annotator confusion |
| Be consistent with separators | traffic_light not traffic-light or trafficLight | Consistent parsing in training pipelines |
| Include negative/background classes only if needed | unknown, ignore_region | Some models require explicit background labels |
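
The separator and casing conventions in the table can be enforced with a small lint step. This sketch assumes lowercase snake_case is the chosen convention; `lint_class_names` is a hypothetical helper, not an Avala function.

```python
import re

# Lowercase words made of letters and digits, joined by single underscores.
SNAKE_CASE = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)*$")

def lint_class_names(names):
    """Map each non-conforming class name to a suggested snake_case form."""
    suggestions = {}
    for name in names:
        if SNAKE_CASE.match(name):
            continue
        # Split camelCase, then normalise hyphens and spaces to underscores.
        fixed = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name)
        fixed = re.sub(r"[-\s]+", "_", fixed).lower()
        suggestions[name] = fixed
    return suggestions

print(lint_class_names(["traffic_light", "traffic-light", "trafficLight", "Pickup Truck"]))
# {'traffic-light': 'traffic_light', 'trafficLight': 'traffic_light', 'Pickup Truck': 'pickup_truck'}
```

Two differently spelled names that normalise to the same suggestion, as `traffic-light` and `trafficLight` do here, are a sign of a duplicate class.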
Attributes
Attributes add structured metadata to each annotation beyond the object class. They let annotators describe properties like visibility, pose, or condition.
Attribute Types
Avala supports several attribute types that you can attach to any object class:
| Type | Description | When to Use | Example |
|---|---|---|---|
| Dropdown | Single selection from a predefined list | Mutually exclusive options | Occlusion: none, partial, heavy |
| Checkbox | Boolean toggle | Simple yes/no flags | is_parked: true/false |
| Text | Free-form string input | Unique identifiers or descriptions | License plate number |
| Number | Numeric value | Measurements or counts | Estimated distance in meters |
| Multi-select | Multiple selections from a list | Concurrent, non-exclusive states | Visible: headlights, taillights, turn_signal |
Configuring Attributes
Attributes are defined in the project’s classification config alongside the label config:
{
  "labels": [
    { "name": "car", "color": "#FF0000" },
    { "name": "pedestrian", "color": "#00FF00" },
    { "name": "cyclist", "color": "#0000FF" },
    { "name": "truck", "color": "#FFA500" },
    { "name": "bus", "color": "#800080" }
  ],
  "classification": {
    "attributes": [
      {
        "name": "occlusion",
        "type": "dropdown",
        "options": ["none", "partial", "heavy"],
        "required": true,
        "applies_to": ["car", "pedestrian", "cyclist"]
      },
      {
        "name": "is_parked",
        "type": "checkbox",
        "required": false,
        "applies_to": ["car", "truck", "bus"]
      },
      {
        "name": "truncation",
        "type": "dropdown",
        "options": ["none", "partial", "heavy"],
        "required": true,
        "applies_to": ["car", "pedestrian", "cyclist"]
      }
    ]
  }
}
Conditional Attributes
Use the applies_to field to show each attribute only for the classes it is relevant to. This keeps the annotator's interface clean: a pedestrian does not need an is_parked attribute, and a traffic_light does not need truncation.
Mark an attribute as required when your model training pipeline depends on it. Leave an attribute optional when it captures supplementary metadata that is useful but not critical.
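
A quick consistency check can catch attributes whose applies_to list names a class that is not in the label config. The sketch below reads the JSON structure shown earlier as a Python dict; `check_attributes` and `attributes_for` are hypothetical helpers, and the rule that a dropdown needs a non-empty options list is an assumption rather than documented Avala behavior.

```python
def check_attributes(config):
    """Return problems in the attribute definitions of a project config."""
    class_names = {label["name"] for label in config.get("labels", [])}
    attributes = config.get("classification", {}).get("attributes", [])
    problems = []
    for attr in attributes:
        # Every class in applies_to should exist in the label config.
        unknown = sorted(set(attr.get("applies_to", [])) - class_names)
        if unknown:
            problems.append(f"{attr['name']}: unknown classes {unknown}")
        # Assumption: a dropdown without options is a configuration mistake.
        if attr.get("type") == "dropdown" and not attr.get("options"):
            problems.append(f"{attr['name']}: needs a non-empty options list")
    return problems

def attributes_for(config, class_name):
    """List (attribute name, required) pairs that apply to one class."""
    attributes = config.get("classification", {}).get("attributes", [])
    return [(a["name"], a.get("required", False))
            for a in attributes if class_name in a.get("applies_to", [])]

config = {
    "labels": [
        {"name": "car", "color": "#FF0000"},
        {"name": "pedestrian", "color": "#00FF00"},
    ],
    "classification": {"attributes": [
        {"name": "occlusion", "type": "dropdown",
         "options": ["none", "partial", "heavy"],
         "required": True, "applies_to": ["car", "pedestrian"]},
        {"name": "is_parked", "type": "checkbox",
         "required": False, "applies_to": ["car", "truck"]},
    ]},
}
print(check_attributes(config))              # ["is_parked: unknown classes ['truck']"]
print(attributes_for(config, "pedestrian"))  # [('occlusion', True)]
```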
Hierarchical Taxonomies
For complex domains, flat class lists become unwieldy. Hierarchical taxonomies group related classes under parent categories.
Example: Vehicle Taxonomy
Vehicle
├── Car
│   ├── Sedan
│   ├── SUV
│   └── Hatchback
├── Truck
│   ├── Pickup
│   └── Semi
├── Bus
│   ├── City Bus
│   └── School Bus
└── Motorcycle
When to Use Hierarchies
| Scenario | Recommendation |
|---|---|
| Fewer than 15 classes | Flat list is simpler and faster |
| 15-50 classes | Group into 3-5 top-level categories |
| More than 50 classes | Use a multi-level hierarchy with search |
| Classes share attributes | Group under parent so attributes inherit |
Designing Hierarchies
- Start broad, then refine. Begin with top-level categories (vehicle, pedestrian, infrastructure) and add specificity only where your model needs it.
- Every leaf class should be unambiguous. If annotators cannot reliably distinguish between two subclasses, merge them.
- Balance depth and breadth. Deep hierarchies (4+ levels) slow annotators down. Prefer wider trees with 2-3 levels.
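
The depth guideline above can be checked mechanically if the taxonomy is kept as a nested structure. This sketch represents the vehicle example as a nested Python dict, which is an illustrative choice and not Avala's storage format; `depth` and `leaf_paths` are hypothetical helpers.

```python
# The vehicle taxonomy from the example, with snake_case class names.
VEHICLE_TAXONOMY = {
    "vehicle": {
        "car": {"sedan": {}, "suv": {}, "hatchback": {}},
        "truck": {"pickup": {}, "semi": {}},
        "bus": {"city_bus": {}, "school_bus": {}},
        "motorcycle": {},
    }
}

def depth(tree):
    """Number of levels in a nested-dict taxonomy (an empty dict has depth 0)."""
    return 0 if not tree else 1 + max(depth(child) for child in tree.values())

def leaf_paths(tree, prefix=()):
    """Yield the full path to every leaf class, e.g. ('vehicle', 'car', 'sedan')."""
    for name, children in tree.items():
        path = prefix + (name,)
        if children:
            yield from leaf_paths(children, path)
        else:
            yield path

print(depth(VEHICLE_TAXONOMY))                  # 3
print(len(list(leaf_paths(VEHICLE_TAXONOMY))))  # 8
```

Here the tree has three levels and eight leaf classes, which sits inside the 2-3 level range recommended above. Note that motorcycle is a leaf at level two while the other leaves are at level three, which is worth a look under the consistent-granularity guideline.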
Single-Label vs Multi-Label Classification
Avala supports both classification modes depending on your project needs.
Single-Label
Each object or scene receives exactly one class label. This is the default for most annotation types.
- Object detection: Each bounding box gets one class
- Scene classification: Each image gets one category
Multi-Label
An object or scene can receive multiple labels simultaneously. Use this when categories are not mutually exclusive.
- An image can be both rainy and nighttime
- A vehicle can be both damaged and parked
Configure multi-label classification in your project’s classification config by setting the task-level classification type:
{
  "classification": {
    "type": "multi-label",
    "categories": [
      { "name": "weather", "options": ["clear", "rainy", "foggy", "snowy"] },
      { "name": "time_of_day", "options": ["daytime", "nighttime", "dawn", "dusk"] },
      { "name": "road_condition", "options": ["dry", "wet", "icy"] }
    ]
  }
}
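
Exported scene labels can be checked against this config before they reach a training pipeline. The sketch below assumes each scene is exported as a mapping from category name to one chosen option; that export shape is an assumption for illustration, and `validate_scene_labels` is a hypothetical helper.

```python
def validate_scene_labels(classification, scene_labels):
    """Check a {category: option} mapping against the configured categories."""
    allowed = {c["name"]: set(c["options"]) for c in classification["categories"]}
    errors = []
    for category, option in scene_labels.items():
        if category not in allowed:
            errors.append(f"unknown category: {category}")
        elif option not in allowed[category]:
            errors.append(f"{category}: {option!r} is not a valid option")
    return errors

classification = {
    "type": "multi-label",
    "categories": [
        {"name": "weather", "options": ["clear", "rainy", "foggy", "snowy"]},
        {"name": "time_of_day", "options": ["daytime", "nighttime", "dawn", "dusk"]},
        {"name": "road_condition", "options": ["dry", "wet", "icy"]},
    ],
}

# A scene that is both rainy and nighttime carries one label per category.
print(validate_scene_labels(classification, {"weather": "rainy", "time_of_day": "nighttime"}))  # []
print(validate_scene_labels(classification, {"weather": "hail"}))
# ["weather: 'hail' is not a valid option"]
```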
Ontology Design Checklist
Before starting your annotation project, verify your ontology against this checklist:
| Check | Question |
|---|---|
| Completeness | Does every object your model needs to detect have a class? |
| Mutual exclusivity | Can an annotator always assign exactly one class without ambiguity? |
| Attribute coverage | Are all properties needed for training captured as attributes? |
| Consistent granularity | Are classes at the same level equally specific? |
| Annotator clarity | Can a new annotator understand each class from its name alone? |
| Model alignment | Does the taxonomy match what your model architecture expects? |
| Scalability | Can you add new classes later without restructuring? |
Common Pitfalls
Over-Specifying Classes
Creating too many fine-grained classes leads to low inter-annotator agreement and sparse training data per class.
Problem: 50 vehicle subclasses where most have fewer than 100 examples each.
Solution: Start with 5-10 broad classes. Add subclasses only when you have enough data and your model benefits from the distinction.
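
One way to spot over-specified classes is to count examples per class in an export and flag the sparse ones. This is a minimal sketch: `sparse_classes` is a hypothetical helper, the label list stands in for whatever your export contains, and the 100-example threshold simply mirrors the figure used in the problem statement above.

```python
from collections import Counter

def sparse_classes(labels, classes, min_examples=100):
    """Return {class: count} for every class with fewer than min_examples labels.

    Classes that never appear in `labels` are reported with a count of 0.
    """
    counts = Counter(labels)
    return {name: counts[name] for name in classes if counts[name] < min_examples}

# Illustrative data only.
labels = ["sedan"] * 250 + ["limousine"] * 12 + ["tow_truck"] * 40
classes = ["sedan", "limousine", "tow_truck", "hearse"]
print(sparse_classes(labels, classes))
# {'limousine': 12, 'tow_truck': 40, 'hearse': 0}
```

Classes that stay sparse after more data is collected are candidates for merging into a broader parent class.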
Ambiguous Boundaries
When two classes overlap conceptually, annotators will disagree on which to use.
Problem: Both van and minivan exist, but annotators cannot reliably distinguish them.
Solution: Either merge them into a single class or provide explicit visual guidelines with reference images showing the boundary.
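
Ambiguous class pairs usually show up as repeated disagreements between annotators who labeled the same objects. The sketch below assumes two aligned lists of class labels, one per annotator; `confused_pairs` is a hypothetical helper and the data is illustrative.

```python
from collections import Counter

def confused_pairs(labels_a, labels_b):
    """Count annotator disagreements, grouped by unordered class pair."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both annotators must label the same objects")
    pairs = Counter()
    for a, b in zip(labels_a, labels_b):
        if a != b:
            # Sort so that (van, minivan) and (minivan, van) count together.
            pairs[tuple(sorted((a, b)))] += 1
    return pairs.most_common()

annotator_a = ["van", "minivan", "car", "van", "minivan"]
annotator_b = ["minivan", "minivan", "car", "minivan", "van"]
print(confused_pairs(annotator_a, annotator_b))
# [(('minivan', 'van'), 3)]
```

Pairs at the top of this list are the first candidates for merging or for clearer guidelines with reference images.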
Missing Edge Cases
Real-world data contains objects that do not fit neatly into your taxonomy.
Problem: An annotator encounters a golf cart but the taxonomy only has car, truck, and motorcycle.
Solution: Include a catch-all class like other_vehicle and review items labeled with it periodically to identify classes you need to add.
Next Steps