Rate Limits
Rate limits protect API stability for all users. Limits are enforced per API key and, on public endpoints, per IP address.
Limits by Plan
| Plan | Requests / min | Monthly Quota |
|---|---|---|
| Free | 10 | 100 |
| Starter | 50 | 2,000 |
| Growth | 200 | 10,000 |
| Enterprise | 1,000 | Custom |
IP-Based Limits (Public Endpoints)
Auth endpoints (/auth/signup, /auth/login) have separate IP-based rate limits to prevent abuse:
| Endpoint | Limit | Window |
|---|---|---|
| /auth/signup | 5 requests | 5 minutes per IP |
| /auth/login | 10 requests | 1 minute per IP |
Rate Limit Headers
When a rate limit is exceeded, the API returns HTTP 429 with a Retry-After header giving the number of seconds to wait before retrying. The same value appears as retry_after in the JSON body.
HTTP/1.1 429 Too Many Requests
Retry-After: 60
Content-Type: application/json

{
  "detail": {
    "error": "rate_limit_exceeded",
    "message": "Too many requests from your IP. Try again in 60 seconds.",
    "retry_after": 60
  }
}
Best Practices
Implement exponential backoff
On a 429 response, wait at least the Retry-After duration before retrying. If the retry is also rejected, increase the delay on each subsequent attempt.
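The retry policy above could be sketched roughly as follows. This is a minimal illustration, not an official client: `retry_delay`, `call_with_backoff`, the `send` callable, the 1-second base, the 300-second cap, and the jitter range are all assumptions made for the example. `send` stands in for whatever HTTP call your client makes and is expected to return a `(status, headers)` pair.

```python
import random
import time
from typing import Callable, Mapping, Optional, Tuple

Response = Tuple[int, Mapping[str, str]]  # (status code, response headers)


def retry_delay(attempt: int, retry_after: Optional[str],
                base: float = 1.0, cap: float = 300.0) -> float:
    """Seconds to wait before retry number `attempt` (0-based).

    Uses the larger of the server's Retry-After value and an
    exponentially growing delay (base * 2**attempt, capped at `cap`).
    """
    backoff = min(cap, base * (2 ** attempt))
    try:
        server_wait = float(retry_after) if retry_after is not None else 0.0
    except ValueError:
        server_wait = 0.0  # non-numeric Retry-After is ignored in this sketch
    return max(server_wait, backoff)


def call_with_backoff(send: Callable[[], Response],
                      max_retries: int = 5,
                      sleep: Callable[[float], None] = time.sleep) -> Response:
    """Call `send` until it returns something other than 429 or retries run out."""
    attempt = 0
    while True:
        status, headers = send()
        if status != 429 or attempt >= max_retries:
            return status, headers
        # Header lookup is case-sensitive here; real HTTP libraries
        # usually provide case-insensitive header mappings.
        delay = retry_delay(attempt, headers.get("Retry-After"))
        # A little random jitter keeps many clients from retrying in lockstep.
        sleep(delay + random.uniform(0, 0.5))
        attempt += 1
```

Passing `sleep` as a parameter keeps the function easy to test without real waiting. The last response is returned even when it is still a 429, so the caller decides how to report the failure.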
Use the batch endpoint
Instead of sending 50 individual requests, use /v1/batch to process them in one call. The call counts as 50 quota units but only 1 rate-limit unit.
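To make the accounting concrete, the sketch below splits a list of items into batches and estimates the units consumed, assuming each batch call costs 1 rate-limit unit and each item costs 1 quota unit as described above. The helper names `chunk` and `estimate_units` are hypothetical, and the batch size of 50 is taken only from the example; the maximum batch size and the request body format of /v1/batch are not specified here.

```python
from typing import Dict, Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def chunk(items: Sequence[T], size: int) -> Iterator[List[T]]:
    """Yield consecutive slices of `items` with at most `size` elements each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def estimate_units(n_items: int, batch_size: int) -> Dict[str, int]:
    """Estimate cost of sending `n_items` through batch calls.

    Assumes 1 rate-limit unit per batch call and 1 quota unit per item.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    calls = -(-n_items // batch_size)  # ceiling division
    return {"rate_limit_units": calls, "quota_units": n_items}
```

Under these assumptions, 120 items in batches of 50 would use 3 rate-limit units and 120 quota units, compared with 120 rate-limit units when sent individually.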
Cache results when possible
Identical documents return the same output. Cache extraction results keyed by file hash to avoid redundant API calls.
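One way to key a cache by file hash is sketched below. It is an in-memory illustration only: `ExtractionCache`, `file_hash`, and the `extract` callable are hypothetical names, SHA-256 is one reasonable choice of hash, and `extract` stands in for whatever function actually calls the API.

```python
import hashlib
from typing import Callable, Dict


def file_hash(data: bytes) -> str:
    """Return the SHA-256 hex digest of the file contents."""
    return hashlib.sha256(data).hexdigest()


class ExtractionCache:
    """Caches extraction results by content hash (in memory only)."""

    def __init__(self, extract: Callable[[bytes], dict]) -> None:
        self._extract = extract  # hypothetical function that calls the API
        self._results: Dict[str, dict] = {}

    def get(self, data: bytes) -> dict:
        """Return the cached result, calling `extract` only on a cache miss."""
        key = file_hash(data)
        if key not in self._results:
            self._results[key] = self._extract(data)
        return self._results[key]
```

Because the key is derived from the file contents rather than the file name, a renamed copy of the same document still hits the cache. A persistent store would be needed to keep results across process restarts.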
Monitor your usage
Poll /v1/usage periodically and alert before reaching 80% of your monthly quota.
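The threshold check itself can be very small, as in the sketch below. The response format of /v1/usage is not documented here, so the example takes the used and total counts as plain integers that you would extract from the response yourself. The function names and the 0.75 default threshold (an arbitrary value below the 80% mark) are assumptions.

```python
def quota_fraction(used: int, quota: int) -> float:
    """Fraction of the monthly quota already consumed."""
    if quota <= 0:
        raise ValueError("quota must be positive")
    return used / quota


def should_alert(used: int, quota: int, threshold: float = 0.75) -> bool:
    """True once usage reaches `threshold` (assumed default: 75% of quota)."""
    return quota_fraction(used, quota) >= threshold
```

On the Growth plan's 10,000-unit monthly quota, this default would start alerting at 7,500 units, leaving some margin before the 80% mark.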