Check your current limits
View your account’s current quotas and limits:
Spending tiers
Your account tier determines the maximum budget you can set:

| Tier | Criteria | Max Monthly Budget |
|---|---|---|
| Tier 1 | Valid payment method | $50 |
| Tier 2 | Spend or add $50 in credits | $500 |
| Tier 3 | Spend or add $500 in credits | $5,000 |
| Tier 4 | Spend or add $5,000 in credits | $50,000 |
| Unlimited | Contact us | Unlimited |
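The tier table can be read as a simple lookup. The sketch below is illustrative only: `tier_for_credits` and `can_set_budget` are hypothetical helper names, not part of any official SDK, and it assumes the thresholds are inclusive (exactly $50 qualifies for Tier 2) and that a valid payment method is already on file. The Unlimited tier is left out because it is granted by contacting support.

```python
from decimal import Decimal

# Thresholds copied from the table above:
# (credits spent or added, tier name, max monthly budget), highest first.
TIER_THRESHOLDS = [
    (Decimal("5000"), "Tier 4", Decimal("50000")),
    (Decimal("500"), "Tier 3", Decimal("5000")),
    (Decimal("50"), "Tier 2", Decimal("500")),
    (Decimal("0"), "Tier 1", Decimal("50")),
]


def tier_for_credits(credits_usd):
    """Return (tier, max_monthly_budget) for an amount spent or added.

    Hypothetical helper. Assumes inclusive thresholds and a valid
    payment method; the Unlimited tier is not modeled.
    """
    amount = Decimal(str(credits_usd))
    if amount < 0:
        raise ValueError("credits_usd must be non-negative")
    for threshold, tier, max_budget in TIER_THRESHOLDS:
        if amount >= threshold:
            return tier, max_budget
    raise AssertionError("unreachable: the last threshold is zero")


def can_set_budget(credits_usd, desired_budget_usd):
    """Check whether a desired monthly budget fits the tier's maximum."""
    _, max_budget = tier_for_credits(credits_usd)
    return Decimal(str(desired_budget_usd)) <= max_budget
```

Under these assumptions, a $200 monthly budget needs at least Tier 2: `can_set_budget(50, 200)` is true, while `can_set_budget(0, 200)` is false.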
Manage your quotas
Budget control
Control your monthly spending with flexible budget limits. Set a limit that fits your needs and adjust it anytime.
Set a custom monthly budget:
For example, to set a $200 monthly budget:
View and adjust your spend limit
Check your current spend limit:
When you reach your budget
When you reach your spending limit, all API requests pause automatically across serverless inference, deployments, and fine-tuning. To resume, add credits to increase your tier and set a higher budget.
On-demand deployment quotas
On-demand deployments have GPU quotas instead of rate limits:
| Resource | Default Quota |
|---|---|
| Nvidia A100 | 8 GPUs |
| Nvidia H100 | 8 GPUs |
| Nvidia H200 | 8 GPUs |
| GPU hours/month | 2,000 |
| LoRAs (on-demand and serverless) | 100 |
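To see how these quotas interact, a quick arithmetic check helps. The sketch below assumes GPU hours are counted as GPU count multiplied by wall-clock hours and that the 2,000-hour figure applies to the whole account; the source does not spell out either point, and the function names are hypothetical.

```python
# Default quotas from the table above.
GPU_QUOTA_PER_TYPE = 8
GPU_HOURS_PER_MONTH = 2000


def hours_until_quota(gpu_count, gpu_hours_quota=GPU_HOURS_PER_MONTH):
    """Wall-clock hours a deployment can run before using up the
    monthly GPU-hour quota (assumes GPU hours = GPUs x hours)."""
    if gpu_count <= 0:
        raise ValueError("gpu_count must be positive")
    return gpu_hours_quota / gpu_count


def fits_quota(gpu_count, hours_per_day, days):
    """True if the plan stays within both the per-type GPU quota and
    the monthly GPU-hour quota, under the assumptions stated above."""
    if gpu_count > GPU_QUOTA_PER_TYPE:
        return False
    return gpu_count * hours_per_day * days <= GPU_HOURS_PER_MONTH
```

Under these assumptions, a deployment using all 8 GPUs of one type would reach 2,000 GPU hours after 250 hours, or a little over 10 days of continuous running, while 2 GPUs running around the clock for 30 days (1,440 GPU hours) would stay within the quota.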
Serverless rate limits
Default limits
All accounts with a payment method get these limits:

| Limit | Value |
|---|---|
| Requests per minute (RPM) | 6,000 |
| Audio minutes per minute, Whisper-v3-large | 200 |
| Audio minutes per minute, Whisper-v3-turbo | 400 |
| Concurrent connections, streaming speech | 10 |
| LoRAs (on-demand and serverless) | 100 |
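A client can keep itself under the default RPM limit with a small local throttle. The sketch below is a generic sliding-window counter, not a description of how the service enforces its limits; the class name and parameters are hypothetical, and the clock is injectable so the behavior can be checked without waiting.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Client-side throttle: allow at most max_requests per window.

    Illustrative only. The defaults mirror the 6,000 RPM figure in the
    table above; the server-side limiter may behave differently.
    """

    def __init__(self, max_requests=6000, window_seconds=60.0,
                 clock=time.monotonic):
        self._max = max_requests
        self._window = window_seconds
        self._clock = clock
        self._sent = deque()  # timestamps of requests inside the window

    def try_acquire(self):
        """Return True and record the request if there is room,
        otherwise return False without recording anything."""
        now = self._clock()
        # Drop timestamps that have aged out of the window.
        while self._sent and now - self._sent[0] >= self._window:
            self._sent.popleft()
        if len(self._sent) < self._max:
            self._sent.append(now)
            return True
        return False
```

A caller would check `try_acquire()` before each request and wait briefly when it returns false.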
How rate limiting works
Dynamic rate limits support high RPM fairly while keeping spiky traffic from affecting other users:
- Gradual scaling: Your minimum limits increase as you sustain consistent high usage
- Typical scale-up: Traffic can usually double within an hour without issues
- Burst handling: Short traffic spikes are accommodated during autoscaling
- Check response headers to see your current limits and remaining capacity:
  - `x-ratelimit-limit-requests`: Your current minimum limit
  - `x-ratelimit-remaining-requests`: Remaining capacity
  - `x-ratelimit-over-limit: yes`: Your request was processed but you’re near capacity
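The headers listed above can be read from any HTTP response. The sketch below takes a plain mapping of header names to values, so it works with whichever HTTP client you use. It assumes the two count headers carry integer strings, which the source does not confirm; the function names and the 10% threshold are hypothetical.

```python
def read_rate_limit_headers(headers):
    """Extract rate-limit information from a mapping of response headers.

    Assumes the limit and remaining headers hold integer strings;
    missing or non-numeric values come back as None.
    """
    lowered = {name.lower(): value for name, value in headers.items()}

    def to_int(name):
        raw = lowered.get(name)
        try:
            return int(raw) if raw is not None else None
        except ValueError:
            return None

    over = lowered.get("x-ratelimit-over-limit", "")
    return {
        "limit": to_int("x-ratelimit-limit-requests"),
        "remaining": to_int("x-ratelimit-remaining-requests"),
        "over_limit": over.strip().lower() == "yes",
    }


def should_slow_down(status, min_fraction=0.1):
    """Heuristic: back off when flagged over limit, or when less than
    min_fraction of the current limit remains (threshold is arbitrary)."""
    if status["over_limit"]:
        return True
    limit, remaining = status["limit"], status["remaining"]
    if not limit or remaining is None:
        return False
    return remaining / limit < min_fraction
```

Lowercasing the header names first keeps the lookup independent of how a given HTTP client capitalizes them.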
Account recovery
If your account is suspended due to failed payment:
- Go to Billing → Invoices
- Pay any outstanding invoices
- Your account reactivates automatically within an hour