Developer documentation
A small API surface on purpose.
InferencePass v1 focuses on OpenAI-compatible chat completions, model discovery, streaming, JSON output, and tools.
Base URL and authentication
Send requests to https://api.inferencepass.com/v1 and pass your API key as a bearer token in the Authorization header. API keys begin with ip_live_ and are displayed only once, at creation.
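As a rough illustration of the authentication rule above, the following Python sketch builds (but does not send) a request that carries the bearer token. The helper name build_request and the placeholder key ip_live_example are hypothetical and used only for illustration; only the base URL and the ip_live_ prefix come from this page.

```python
import urllib.request

BASE_URL = "https://api.inferencepass.com/v1"


def build_request(path, api_key, body=None):
    """Return an unsent urllib Request that carries the bearer token.

    `body` is optional raw JSON bytes; when present, urllib sends a POST.
    """
    # Cheap sanity check: this page says keys begin with ip_live_.
    if not api_key.startswith("ip_live_"):
        raise ValueError("expected an API key beginning with ip_live_")
    headers = {"Authorization": f"Bearer {api_key}"}
    if body is not None:
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(BASE_URL + path, data=body, headers=headers)


# Placeholder key for illustration only; it is not a real credential.
req = build_request("/models", "ip_live_example")
print(req.get_method(), req.full_url)
```

In real use the key would more likely be read from an environment variable such as INFERENCEPASS_API_KEY (the name used in the curl example) than written into source code.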
First request
curl https://api.inferencepass.com/v1/chat/completions \
  -H "Authorization: Bearer $INFERENCEPASS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "auto",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
Supported in v1
The launch API supports system messages, temperature, max_tokens, streaming, JSON response formats, and OpenAI-style tool definitions and tool calls.
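The supported request options might be combined as in the sketch below. It assumes InferencePass accepts the same field names as the OpenAI chat completions API (response_format, tools, stream); this page says the API is OpenAI-compatible but does not list the fields. The helper build_chat_payload and the get_weather tool are hypothetical.

```python
import json


def build_chat_payload(user_text, system_text=None, temperature=None,
                       max_tokens=None, json_mode=False, tools=None,
                       stream=False):
    """Assemble a chat completions body, including only the options set."""
    messages = []
    if system_text is not None:
        messages.append({"role": "system", "content": system_text})
    messages.append({"role": "user", "content": user_text})

    payload = {"model": "auto", "messages": messages}
    if temperature is not None:
        payload["temperature"] = temperature
    if max_tokens is not None:
        payload["max_tokens"] = max_tokens
    if json_mode:
        # Assumed OpenAI-style JSON mode field.
        payload["response_format"] = {"type": "json_object"}
    if tools:
        payload["tools"] = tools
    if stream:
        payload["stream"] = True
    return payload


# Hypothetical tool definition in the OpenAI function-tool shape.
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

payload = build_chat_payload(
    "What is the weather in Oslo?",
    system_text="You are a concise assistant.",
    temperature=0.2,
    max_tokens=200,
    tools=[weather_tool],
)
print(json.dumps(payload, indent=2))
```

The resulting dictionary, serialized with json.dumps, would be the body of a POST to /v1/chat/completions.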
- GET /v1/models
- POST /v1/chat/completions
- Server-sent event streaming
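A streamed response could be consumed along the lines of the sketch below. It assumes the stream follows the OpenAI convention: each event is a single "data:" line holding a JSON chunk with choices[].delta.content, and the stream ends with "data: [DONE]". This page does not spell out the chunk format, so treat these details as assumptions. The sample lines are hand-written for illustration, not captured from the service.

```python
import json


def iter_sse_data(lines):
    """Yield the decoded JSON payload of each data line until [DONE]."""
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.rstrip("\r\n")
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            return
        yield json.loads(data)


def collect_text(lines):
    """Concatenate the content deltas from a streamed completion."""
    parts = []
    for chunk in iter_sse_data(lines):
        for choice in chunk.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if content:
                parts.append(content)
    return "".join(parts)


# Hand-written sample stream (illustrative only).
sample = [
    b'data: {"choices": [{"delta": {"role": "assistant"}}]}\n',
    b"\n",
    b'data: {"choices": [{"delta": {"content": "Hel"}}]}\n',
    b"\n",
    b'data: {"choices": [{"delta": {"content": "lo"}}]}\n',
    b"\n",
    b"data: [DONE]\n",
]
print(collect_text(sample))  # prints "Hello"
```

Because the response object returned by urllib.request.urlopen can be iterated line by line, it could be passed to collect_text in place of the sample list. This simplified parser does not handle events whose data spans multiple lines.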
Deliberately out of scope
Images, audio, embeddings, fine-tuning, and the Responses API are not part of v1. Unsupported paths return a clear OpenAI-shaped error rather than silently changing behavior.
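A client might surface these errors as sketched below, assuming "OpenAI-shaped" means a JSON body of the form {"error": {"message": ..., "type": ..., "code": ...}}. The helper describe_error is hypothetical, and the sample body, including its message, type, and status code, is invented for illustration rather than taken from the service.

```python
import json


def describe_error(status, body_text):
    """Turn an error response into a one-line summary.

    Falls back to the raw body when it is not an OpenAI-shaped error.
    """
    try:
        err = json.loads(body_text).get("error")
    except (ValueError, AttributeError):
        err = None  # not JSON, or JSON that is not an object
    if not isinstance(err, dict):
        return f"HTTP {status}: {body_text[:200]}"
    return f"HTTP {status} [{err.get('type', 'unknown')}]: {err.get('message', '')}"


# Invented example body for a request to an out-of-scope path.
sample_body = json.dumps({
    "error": {
        "message": "Embeddings are not supported in v1.",
        "type": "invalid_request_error",
        "param": None,
        "code": None,
    }
})
print(describe_error(404, sample_body))
```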