OpenAI-compatible inference for developers

One API pass.
Unlimited requests.
Five dollars.

Route requests across healthy AI providers, use a built-in web chat, and stop calculating token bills before every experiment.

OpenAI-compatible · Live routing · Prompt-free API logs
Live routing receipt · Connecting
1 Request
POST /v1/chat/completions
Model: auto
Stream: true
2 Route evaluation
Waiting for the live provider catalog.
Auto: Best healthy route
Fast: Lowest latency route
Reasoning: Highest quality route
3 Selected provider
Selection begins when providers report healthy.
4 Streamed response
Gateway health is being established.
Live API example · cURL
curl https://api.inferencepass.com/v1/chat/completions \
  -H "Authorization: Bearer $INFERENCEPASS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "auto",
    "messages": [{"role":"user","content":"Explain routing simply."}],
    "stream": true
  }'
Response · streamed
data: {"id":"chatcmpl_...","choices":[{"delta":{"role":"assistant"}}]}
data: {"id":"chatcmpl_...","choices":[{"delta":{"content":"A routed API "}}]}
data: {"id":"chatcmpl_...","choices":[{"delta":{"content":"selects the healthiest provider."}}]}
data: {"id":"chatcmpl_...","choices":[{"delta":{},"finish_reason":"stop"}]}
data: [DONE]
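
The streamed response above arrives as server-sent `data:` lines, each carrying a JSON chunk with a `delta`. A minimal sketch of how a client might stitch those deltas back into one string is shown below. The helper name `collect_stream_text` is hypothetical, and the sample lines are copied from the example response rather than fetched from the live gateway.

```python
import json


def collect_stream_text(lines):
    """Join the `delta.content` fragments from streamed `data:` lines."""
    parts = []
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and anything unexpected
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                parts.append(text)
    return "".join(parts)


# Sample lines copied from the streamed response shown above.
sample = [
    'data: {"id":"chatcmpl_...","choices":[{"delta":{"role":"assistant"}}]}',
    'data: {"id":"chatcmpl_...","choices":[{"delta":{"content":"A routed API "}}]}',
    'data: {"id":"chatcmpl_...","choices":[{"delta":{"content":"selects the healthiest provider."}}]}',
    'data: {"id":"chatcmpl_...","choices":[{"delta":{},"finish_reason":"stop"}]}',
    "data: [DONE]",
]

print(collect_stream_text(sample))
# → A routed API selects the healthiest provider.
```

The first chunk carries only the role and the last carries only `finish_reason`, so both contribute no text.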

Stable public selectors

Three modes. One simple choice.

01

Auto

The best healthy route for reliability and simplicity.

  • Balanced latency
  • Provider failover
  • Free-tier default
02

Fast

The lowest-latency compatible provider for realtime work.

  • Latency-prioritized
  • Streaming friendly
  • Paid plan
03

Reasoning

Higher-quality routes for analysis, coding, and complex tasks.

  • Quality-prioritized
  • Tool-aware routing
  • Paid plan
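
As a rough illustration of the three selectors, the sketch below builds a chat completions request body and rejects modes that the cards above mark as paid-only. It rests on assumptions: only `"auto"` appears as a `model` value in the examples on this page, so the `"fast"` and `"reasoning"` strings are guesses, and `build_chat_request` and the plan names are hypothetical helpers, not part of any published SDK.

```python
# Mode availability as described on this page. The "fast" and "reasoning"
# selector strings are assumptions; only "auto" is shown in the examples.
MODES_BY_PLAN = {
    "free": {"auto"},
    "paid": {"auto", "fast", "reasoning"},
}


def build_chat_request(prompt, mode="auto", plan="free", stream=False):
    """Return a /v1/chat/completions JSON body for the chosen routing mode."""
    allowed = MODES_BY_PLAN[plan]
    if mode not in allowed:
        raise ValueError(f"mode {mode!r} is not available on the {plan} plan")
    return {
        "model": mode,  # the routing mode goes in the usual `model` field
        "messages": [{"role": "user", "content": prompt}],
        "stream": stream,
    }


body = build_chat_request("Explain routing simply.", mode="auto", stream=True)
```

Checking the mode on the client is only a convenience here; the gateway remains the authority on what a key may use.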

One account

API and chat. All in one workspace.

Create keys, inspect request metadata, manage your subscription, and explore the routing engine in the included web chat.

New chat
Mode: Auto · Live routing

API access and chat, together.

Test the same routing modes your application uses. Start a conversation, inspect the selected provider, then move the prompt into your code.

  • Explain an unfamiliar codebase
  • Design a JSON schema
  • Plan an API migration

Transparent pricing

No calculator. No surprise invoice.

Free

$0 / month
  • 25 verified requests per day
  • Auto routing mode
  • API access and web chat
  • No credit card required
Create free key

Verified June 15, 2026

InferencePass vs metered routing.

Feature        | InferencePass         | LLM7                  | OpenRouter
Paid price     | $5/month fixed        | $12/month Pro         | Metered by model
Request quota  | Unlimited             | Published rate limits | Varies
Token quota    | Unlimited             | 5B/day on Pro         | Paid token usage
API and chat   | One account           | Separate products     | Included
Routing modes  | Auto, Fast, Reasoning | Default, Fast, Pro    | Provider/model routing

Competitor details are sourced from LLM7 pricing and OpenRouter pricing. Limits and prices can change; check their current pages before purchasing.

Drop-in setup

Integrate in minutes.

cURL
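curl https://api.inferencepass.com/v1/chat/completions \
  -H "Authorization: Bearer $INFERENCEPASS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"auto","messages":[{"role":"user","content":"Explain routing simply."}]}'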
Python
import os
from openai import OpenAI
client = OpenAI(
  api_key=os.environ["INFERENCEPASS_API_KEY"],
  base_url="https://api.inferencepass.com/v1"
)
JavaScript
import OpenAI from "openai";
const client = new OpenAI({
  apiKey: process.env.INFERENCEPASS_API_KEY,
  baseURL: "https://api.inferencepass.com/v1"
});
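
Once a Python client is configured as in the setup above, a streamed request could look roughly like the sketch below. `stream_text` is a hypothetical helper. It assumes the gateway returns chunks in the shape the OpenAI Python SDK exposes (`chunk.choices[0].delta.content`), which matches the streamed example on this page but is not confirmed beyond it.

```python
def stream_text(client, prompt, mode="auto"):
    """Yield text fragments from a streamed chat completion.

    `client` is an OpenAI SDK client pointed at the gateway's base URL.
    """
    stream = client.chat.completions.create(
        model=mode,
        messages=[{"role": "user", "content": prompt}],
        stream=True,
    )
    for chunk in stream:
        if not chunk.choices:
            continue  # some chunks may carry no choices
        piece = chunk.choices[0].delta.content
        if piece:  # role-only and final chunks have no content
            yield piece


# Usage, with `client` built as in the Python setup snippet:
# for piece in stream_text(client, "Explain routing simply."):
#     print(piece, end="", flush=True)
```

Passing the client in as an argument keeps the helper easy to exercise with a stub before pointing it at a real key.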

Common questions

The important details, plainly.

Is InferencePass OpenAI-compatible?

Yes. The v1 launch supports chat completions, streaming, JSON response modes, system messages, and OpenAI-style tools and function calls.

What does Unlimited include?

Unlimited includes API requests, routed model modes, and browser chat without a monthly request or token allowance. Service-level concurrency, burst, context, and acceptable-use protections still apply.
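
Because concurrency and burst protections still apply, a client may want to retry politely when it is told to slow down. This page does not say how the gateway signals that a limit was hit. The sketch below assumes an HTTP 429 status, as many OpenAI-compatible APIs use, and `call_with_backoff` and its `send` callback are hypothetical names.

```python
import random
import time


def call_with_backoff(send, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call send() until it returns a non-429 status or attempts run out.

    `send` must return a (status_code, body) tuple. HTTP 429 as the
    rate-limit signal is an assumption, not documented gateway behavior.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = send()
        if status != 429 or attempt == max_attempts - 1:
            return status, body
        # Exponential backoff with jitter: about 0.5s, 1s, 2s, ...
        delay = base_delay * (2 ** attempt)
        sleep(delay + random.uniform(0, delay / 2))
```

The `sleep` parameter exists so the wait can be replaced in tests; in normal use the default `time.sleep` applies.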

Is web chat included?

Yes. Free and Unlimited accounts can use the browser chat. It calls the same gateway and routing modes available through the API.

Does the API store prompts?

No. The API gateway records operational metadata such as provider, latency, and token counts, but it does not persist API prompts or generated responses.

ONE API PASS · $5 MONTHLY

Start for free. Ship with confidence.

Get a free API key in seconds. No credit card required.