Understanding rate limits and pagination helps you build robust integrations with the Avala API.
## Rate Limits

### Default Limits

| Endpoint Type | Limit | Window |
|---|---|---|
| Standard requests | 100 | per minute |
| Upload requests | 10 | concurrent |
| Export requests | 5 | concurrent |
| Streaming API | 20 | concurrent connections |
The streaming API supports real-time event delivery for annotation updates and export progress. Streaming connections count against the concurrent connection limit, not the per-minute request limit.
All responses include rate limit headers so you can track your usage programmatically.
| Header | Description |
|---|---|
| `X-RateLimit-Limit` | Maximum requests allowed in the current window |
| `X-RateLimit-Remaining` | Requests remaining in the current window |
| `X-RateLimit-Reset` | Unix timestamp when the window resets |

```
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 95
X-RateLimit-Reset: 1705312260
```
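These headers can also drive proactive throttling: pause until the window resets when only a few requests remain instead of waiting for a 429. A minimal sketch; `wait_if_near_limit` and its `threshold` parameter are illustrative names, not part of the API:

```python
import time


def wait_if_near_limit(headers, threshold=5):
    """Sleep until the rate-limit window resets when few requests remain.

    `headers` is the response-headers mapping from any Avala API call.
    Returns the number of seconds slept (0.0 when no pause was needed).
    """
    remaining = int(headers.get("X-RateLimit-Remaining", threshold + 1))
    if remaining > threshold:
        return 0.0
    reset_at = int(headers.get("X-RateLimit-Reset", 0))
    wait = max(0.0, reset_at - time.time())
    time.sleep(wait)
    return wait
```

Call it after each response (`wait_if_near_limit(response.headers)`) before issuing the next request in a burst.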
### Handling Rate Limits

When rate limited, the API returns `429 Too Many Requests`:

```json
{
  "detail": "Request was throttled. Expected available in 30 seconds."
}
```
Implement exponential backoff to handle rate limits gracefully:
```python
import time

import requests


def fetch_with_retry(url, headers, max_retries=5):
    for attempt in range(max_retries):
        response = requests.get(url, headers=headers)
        if response.status_code == 429:
            # Prefer the server's Retry-After hint; fall back to exponential backoff.
            retry_after = response.headers.get("Retry-After")
            wait_time = int(retry_after) if retry_after else 2 ** attempt
            print(f"Rate limited. Retrying in {wait_time}s...")
            time.sleep(wait_time)
            continue
        response.raise_for_status()
        return response
    raise Exception("Max retries exceeded")
```
Always respect `Retry-After` headers when present. Ignoring rate limits may result in your API key being temporarily suspended.
## Pagination

List endpoints use cursor-based pagination. Each response includes a `next` URL that you follow to retrieve the next page of results.

### Query Parameters

| Parameter | Type | Description |
|---|---|---|
| `cursor` | string | Pagination cursor from a previous response |
| `limit` | integer | Number of results per page (default varies by endpoint) |
```json
{
  "count": 150,
  "next": "https://server.avala.ai/api/v1/datasets/johndoe/list/?cursor=cD0yMDI0...",
  "previous": null,
  "results": [...]
}
```
| Field | Type | Description |
|---|---|---|
| `count` | integer | Total number of results across all pages |
| `next` | string | URL for the next page, or `null` if this is the last page |
| `previous` | string | URL for the previous page, or `null` if this is the first page |
| `results` | array | Array of resource objects for the current page |
```python
import requests

BASE_URL = "https://server.avala.ai/api/v1"
headers = {"X-Avala-Api-Key": "YOUR_API_KEY"}


def fetch_all_datasets(owner):
    all_datasets = []
    url = f"{BASE_URL}/datasets/{owner}/list/"
    while url:
        response = requests.get(url, headers=headers)
        response.raise_for_status()
        data = response.json()
        all_datasets.extend(data["results"])
        url = data.get("next")  # null on the last page
    return all_datasets
```
## Best Practices

### Respect Rate Limits
- Check `X-RateLimit-Remaining` before making bursts of requests
- Implement exponential backoff with jitter when limits are reached
- Spread requests over time for bulk operations instead of sending them all at once
### Paginate Efficiently

- Use the `next` URL directly rather than constructing cursor values manually
- Process results as you paginate instead of loading everything into memory
- Set a reasonable `limit` parameter to balance between fewer requests and smaller payloads
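One way to process results as you paginate is a generator that yields each page's items before requesting the next page, so memory use stays bounded. A sketch built on the list endpoint shown earlier; `iter_datasets` and the injected `session` are illustrative, not SDK names:

```python
import requests

BASE_URL = "https://server.avala.ai/api/v1"


def iter_datasets(owner, session, limit=50):
    """Yield datasets one at a time instead of buffering every page."""
    url = f"{BASE_URL}/datasets/{owner}/list/"
    params = {"limit": limit}  # only sent with the first request
    while url:
        response = session.get(url, params=params)
        response.raise_for_status()
        data = response.json()
        yield from data["results"]
        url = data.get("next")  # the next URL already carries the cursor
        params = None
```

Passing a `requests.Session` also reuses the underlying TCP connection across pages.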
### Caching

- Cache responses for resources that change infrequently (e.g., dataset metadata, project configurations)
- Use the `updated_at` timestamp to determine when cached data is stale
- Avoid caching paginated list responses since the underlying data may change between requests
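The `updated_at` check can be as simple as a dictionary keyed by URL that is refreshed whenever the caller's known timestamp no longer matches the cached payload. `MetadataCache` below is an illustrative wrapper, not part of any SDK; it assumes each payload carries an `updated_at` field:

```python
class MetadataCache:
    """Naive per-URL cache invalidated via each payload's updated_at.

    The refresh strategy (re-fetch when the known timestamp differs) is a
    sketch; a real client might use a cheaper metadata call to learn the
    latest updated_at before deciding to re-fetch.
    """

    def __init__(self, session):
        self.session = session
        self._store = {}  # url -> cached JSON payload

    def get(self, url, known_updated_at=None):
        cached = self._store.get(url)
        if cached is not None and cached.get("updated_at") == known_updated_at:
            return cached  # still fresh per the caller's knowledge
        response = self.session.get(url)
        response.raise_for_status()
        payload = response.json()
        self._store[url] = payload
        return payload
```

List responses (which often expose `updated_at` per item) are a natural source for the `known_updated_at` argument.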
### Concurrent Requests
- Stay within concurrent connection limits for uploads (10) and exports (5)
- Use a semaphore or connection pool to manage concurrent requests in your application
- Queue requests that exceed concurrency limits rather than dropping them