What are Usage Tiers?
Usage tiers determine your rate limits and access to beta features based on your cumulative API spending. As you spend more on API credits over time, you automatically advance to higher tiers with increased rate limits. You can check your current usage tier by visiting your API settings page.
Tier Progression
Tier | Total Credits Purchased | Status |
---|---|---|
Tier 0 | $0 | New accounts, limited access |
Tier 1 | $50+ | Light usage, basic limits |
Tier 2 | $250+ | Regular usage |
Tier 3 | $500+ | Heavy usage |
Tier 4 | $1,000+ | Production usage |
Tier 5 | $5,000+ | Enterprise usage |
Tiers are based on cumulative purchases across your account lifetime, not current balance.
How Tiers Work
- Automatic advancement - Tiers increase based on your total lifetime credit purchases
- Rate limit increases - Higher tiers get significantly more requests per minute
- Permanent status - Once you reach a tier, you keep it (no downgrade)
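The progression rules above can be sketched as a small lookup. This is an illustration only, not an official API: the names `TIER_THRESHOLDS`, `tier_for`, and `tier_after_purchase` are hypothetical, and the sketch assumes that a threshold such as "$50+" means a cumulative total of at least $50.

```python
# Thresholds copied from the Tier Progression table
# (USD of lifetime credit purchases), highest tier first.
TIER_THRESHOLDS = [(5, 5000), (4, 1000), (3, 500), (2, 250), (1, 50), (0, 0)]


def tier_for(total_purchased_usd: float) -> int:
    """Return the tier implied by cumulative credit purchases.

    Assumption: "$50+" is read as "at least $50".
    """
    if total_purchased_usd < 0:
        raise ValueError("total purchases cannot be negative")
    for tier, minimum in TIER_THRESHOLDS:
        if total_purchased_usd >= minimum:
            return tier
    return 0


def tier_after_purchase(current_tier: int, total_purchased_usd: float) -> int:
    """Tiers are permanent, so never return less than the current tier."""
    return max(current_tier, tier_for(total_purchased_usd))
```

Because the input is the lifetime total rather than the current balance, spending credits down never changes the result.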
Rate Limits by Model
- Tier 0
- Tier 1
- Tier 2
- Tier 3
- Tier 4
- Tier 5
Model | Requests per minute (RPM) |
---|---|
sonar-deep-research | 5 |
sonar-reasoning-pro | 50 |
sonar-reasoning | 50 |
sonar-pro | 50 |
sonar | 50 |
POST /async/chat/completions | 5 |
GET /async/chat/completions | 3000 |
GET /async/chat/completions/{request_id} | 6000 |
Search Rate Limits
The Search API has separate rate limits that apply to all usage tiers:
Endpoint | Rate Limit | Burst Capacity |
---|---|---|
POST /search | 3 requests per second | 3 requests |
- Burst: Can handle 3 requests instantly
- Refill: 1 token every 333ms
- Sustained: Exactly 3 QPS average over time
Search rate limits are independent of your usage tier and apply consistently across all accounts using the same leaky bucket algorithm.
Need Higher Search Rate Limits?
If you require increased rate limits for the Search API beyond the standard 3 requests per second, please fill out our rate limit increase request form. We’ll review your use case and work with you to accommodate your needs.
How Rate Limiting Works
Our rate limiting system uses a leaky bucket algorithm that allows for burst traffic while maintaining strict long-term rate control.
Technical Implementation
Leaky Bucket Algorithm Explained
The leaky bucket algorithm works like a bucket with a small hole in the bottom:
- Bucket Capacity: Maximum number of requests you can make instantly (burst capacity)
- Leak Rate: How quickly tokens refill over time (your rate limit)
- Token Refill: New requests become available at regular intervals
- ✅ Allows legitimate burst traffic
- ✅ Prevents sustained abuse
- ✅ Predictable and fair rate enforcement
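The bucket described above (a fixed capacity whose tokens refill at a steady rate, a pattern often called a token bucket) can be modelled on the client side with a few lines of Python. This is a minimal sketch for reasoning about the behavior, not the service's actual implementation; the class name `TokenBucket`, its parameters, and the injectable `clock` are all assumptions made for illustration.

```python
import time


class TokenBucket:
    """Client-side model of a bucket with burst capacity and steady refill."""

    def __init__(self, capacity: float, refill_per_second: float,
                 clock=time.monotonic):
        self.capacity = capacity                    # burst capacity
        self.refill_per_second = refill_per_second  # sustained rate limit
        self._clock = clock                         # injectable for testing
        self._tokens = float(capacity)              # start with a full bucket
        self._last = clock()

    def _refill(self) -> None:
        # Add tokens for the time elapsed since the last check,
        # never exceeding the bucket capacity.
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        self._tokens = min(self.capacity,
                           self._tokens + elapsed * self.refill_per_second)

    def try_acquire(self) -> bool:
        """Consume one token if available; return False otherwise."""
        self._refill()
        if self._tokens >= 1:
            self._tokens -= 1
            return True
        return False
```

Calling `try_acquire()` before each request lets a client predict when the server would respond with a 429, under the assumption that the server's bucket uses the same capacity and rate.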
Rate Limiter Behavior Example
Let’s examine how 3 requests per second works in practice, including a steady 3 QPS load and a load slightly over 3 QPS.
Parameters:
- Capacity: 3 tokens
- Leak rate: 3 tokens/second
- Refill: 1 token every 333ms
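With these parameters, the behavior under different request patterns can be simulated without any network calls. The sketch below uses exact fractions and a virtual clock so the results are deterministic; the function name `simulate` and its arguments are hypothetical, and it assumes the bucket starts full and that a rejected request consumes no token.

```python
from fractions import Fraction


def simulate(interval_s: Fraction, count: int,
             capacity: int = 3, rate_per_s: int = 3) -> list[bool]:
    """Send `count` requests spaced `interval_s` seconds apart.

    Returns one bool per request: True if accepted, False if it would
    be rejected (HTTP 429). Assumes the bucket starts full.
    """
    tokens = Fraction(capacity)
    results = []
    for i in range(count):
        if i > 0:
            # Refill for the time since the previous request, capped at capacity.
            tokens = min(Fraction(capacity), tokens + interval_s * rate_per_s)
        if tokens >= 1:
            tokens -= 1
            results.append(True)
        else:
            results.append(False)
    return results


burst = simulate(Fraction(0), 5)          # 5 requests at the same instant
steady = simulate(Fraction(1, 3), 30)     # exactly 3 QPS
over = simulate(Fraction(3, 10), 32)      # one request every 300 ms (~3.33 QPS)
```

In this model the burst accepts the first three requests and rejects the rest, the steady stream is never rejected, and the slightly-over stream first fails on its 22nd request, once the initial burst allowance has been used up.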
Real-World Implications
What this means for your applications:
✅ Burst Tolerance:
- Can handle your full rate limit instantly
- Perfect for batch operations or sudden traffic spikes
- No need to artificially spread requests
- Strict average rate enforcement over time
- Quick recovery after burst usage
- Consistent performance across different usage patterns
- Prevents sustained over-limit usage
- Blocks excessive burst attempts
- Maintains fair resource allocation
- Take advantage of burst capacity for batch operations
- Monitor your usage patterns to optimize request timing
- Implement proper error handling for 429 responses
What Happens When You Hit Rate Limits?
When you exceed your rate limits:
- 429 Error - Your request gets rejected with “Too Many Requests”
- Continuous Refill - Tokens refill continuously based on your rate limit
- Immediate Recovery - New requests become available as soon as tokens refill
- 3 QPS limit: 1 token refills every 333ms
- 50 QPS limit: 1 token refills every 20ms
- 500 QPS limit: 1 token refills every 2ms
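The refill intervals listed above follow from dividing one second by the rate limit. A short sketch of that arithmetic, using a hypothetical helper name `refill_interval_ms`:

```python
def refill_interval_ms(requests_per_second: float) -> float:
    """Milliseconds between token refills for a given per-second limit."""
    if requests_per_second <= 0:
        raise ValueError("rate limit must be positive")
    return 1000.0 / requests_per_second


# 3 QPS gives about 333 ms, 50 QPS gives 20 ms, 500 QPS gives 2 ms.
intervals = {qps: refill_interval_ms(qps) for qps in (3, 50, 500)}
```

After a 429, waiting at least one refill interval is the earliest point at which a new request can succeed in this model.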
Best Practices:
- Monitor your usage to predict when you’ll need higher tiers
- Consider upgrading your tier proactively for production applications
- Implement exponential backoff with jitter in your code
- Take advantage of burst capacity for batch operations
- Don’t artificially spread requests if you have available burst capacity
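Exponential backoff with jitter, as recommended above, might look like the following. This is a sketch under stated assumptions rather than official client code: `send` is a hypothetical zero-argument callable that performs one request and returns a `(status_code, body)` tuple, and the defaults for `base`, `cap`, and `max_retries` are illustrative values, not documented ones. The jitter strategy shown is "full jitter", where each delay is drawn uniformly between zero and the exponential ceiling.

```python
import random
import time


def backoff_delays(max_retries=5, base=0.5, cap=30.0, rng=random.random):
    """Yield full-jitter delays: uniform in [0, min(cap, base * 2**attempt)]."""
    for attempt in range(max_retries):
        yield rng() * min(cap, base * (2 ** attempt))


def call_with_retry(send, max_retries=5, base=0.5, cap=30.0, sleep=time.sleep):
    """Call send() until it returns a non-429 status or retries run out.

    `send` is a hypothetical callable returning (status_code, body).
    Returns the last (status_code, body), which may still be a 429.
    """
    status, body = send()
    for delay in backoff_delays(max_retries, base, cap):
        if status != 429:
            break
        sleep(delay)  # wait before retrying the rate-limited request
        status, body = send()
    return status, body
```

Randomizing the delay keeps many clients that were rejected at the same moment from retrying in lockstep, and the `sleep` parameter makes the helper easy to test without real waiting.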
Upgrading Your Tier
1. Check Current Tier - Visit your API settings page to see your current tier and total spending.
2. Purchase More Credits - Add credits to your account through the billing section. Your tier will automatically upgrade once you reach the spending threshold.
3. Verify Upgrade - Your new rate limits take effect immediately after the tier upgrade. Check your settings page to confirm.
Higher tiers significantly improve your API experience through increased rate limits, which is especially important for production applications.