Flat monthly access
Unlimited requests, with honest technical limits.
The paid plan has no monthly request or token quota. Concurrency, rate, context, and acceptable-use protections keep the shared service healthy.
What unlimited means
InferencePass does not decrement credits or stop service after a monthly token allowance. Eligible requests continue to be served for as long as the subscription is active.
Service protections
Each paid key can have up to five requests in flight at once and can send up to 120 requests per minute. Each individual request is limited to 128K input tokens and 64K output tokens.
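A client that stays inside these limits avoids rejected requests. The following is a minimal client-side sketch, not InferencePass code: the KeyLimiter class and its method names are hypothetical, and it assumes a sliding 60-second window, which this page does not specify.

```python
from collections import deque

MAX_CONCURRENT = 5    # per-key concurrency limit stated above
MAX_PER_MINUTE = 120  # per-key rate limit stated above


class KeyLimiter:
    """Hypothetical client-side guard for one API key.

    Assumes a sliding 60-second window; the service may count
    requests differently (for example, in fixed windows).
    """

    def __init__(self, max_concurrent=MAX_CONCURRENT,
                 max_per_minute=MAX_PER_MINUTE, window_seconds=60.0):
        self.max_concurrent = max_concurrent
        self.max_per_minute = max_per_minute
        self.window_seconds = window_seconds
        self.in_flight = 0
        self.starts = deque()  # start times of requests inside the window

    def try_start(self, now: float) -> bool:
        """Return True and record the request if it may start at time `now`."""
        # Forget start times that have aged out of the window.
        while self.starts and now - self.starts[0] >= self.window_seconds:
            self.starts.popleft()
        if self.in_flight >= self.max_concurrent:
            return False
        if len(self.starts) >= self.max_per_minute:
            return False
        self.in_flight += 1
        self.starts.append(now)
        return True

    def finish(self) -> None:
        """Mark one in-flight request as completed."""
        if self.in_flight > 0:
            self.in_flight -= 1
```

A caller would pass a monotonic clock reading such as time.monotonic() to try_start, wait and retry when it returns False, and call finish when each response completes.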
Automated scraping, credential sharing, resale, denial-of-service behavior, and attempts to bypass protections are not acceptable use.
Routing and failover
When the selected provider fails before streaming begins, the gateway can try one healthy alternate route. Once output has begun, it will not replay the request through another provider.
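The rule above can be sketched as a small generator. This is an illustration under stated assumptions, not the gateway's implementation: ProviderError, Route, and stream_with_failover are hypothetical names, and the caller is assumed to have already picked an alternate route that is healthy.

```python
from typing import Callable, Iterator, Optional


class ProviderError(Exception):
    """Hypothetical error raised when a provider route fails."""


# A route takes a prompt and yields output chunks (hypothetical shape).
Route = Callable[[str], Iterator[str]]


def stream_with_failover(prompt: str, primary: Route,
                         alternate: Optional[Route]) -> Iterator[str]:
    """Stream from `primary`; fall back to `alternate` at most once,
    and only if the failure happens before any output is produced."""
    routes = [primary] + ([alternate] if alternate is not None else [])
    for index, route in enumerate(routes):
        started = False
        try:
            for chunk in route(prompt):
                started = True  # output has begun; no replay after this
                yield chunk
            return
        except ProviderError:
            # Surface the error if output already began or no route is left.
            if started or index == len(routes) - 1:
                raise
```

Refusing to switch once output has begun means the caller never receives a response stitched together from two providers; a mid-stream failure is surfaced as an error instead.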