POST /v1/chat/completions

An OpenAI Chat Completions-compatible endpoint. The gateway translates between the Chat Completions wire format and its canonical internal representation: modules see the canonical form, and your client sees the OpenAI shape.

Endpoint

POST https://api.prxy.monster/v1/chat/completions

Headers

Header	Required	Notes
`Authorization: Bearer <key>`	yes	Your `prxy_live_xxx` key.
`Content-Type: application/json`	yes
`x-prxy-pipe`	no	Per-request pipeline override.

Request body

{
  model: string;                    // any provider's model name
  messages: Array<{
    role: 'system' | 'user' | 'assistant' | 'tool';
    content: string | ContentPart[];
    name?: string;
    tool_calls?: ToolCall[];
    tool_call_id?: string;
  }>;
  max_tokens?: number;
  temperature?: number;
  top_p?: number;
  stream?: boolean;
  stop?: string | string[];
  tools?: Array<{
    type: 'function';
    function: { name: string; description?: string; parameters: object };
  }>;
  tool_choice?: 'none' | 'auto' | { type: 'function'; function: { name: string } };
  response_format?: { type: 'text' | 'json_object' };
  seed?: number;
}
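
As an illustration of the schema above, a request body that offers one function tool might be assembled as follows. This is a minimal sketch: the tool name `get_weather`, its parameter schema, and the helper `build_tool_request` are hypothetical; only the field names come from the schema.

```python
import json


def build_tool_request(model: str, prompt: str) -> dict:
    """Build a Chat Completions request body that offers one function tool."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 256,
        "tools": [
            {
                "type": "function",
                "function": {
                    "name": "get_weather",  # hypothetical tool name
                    "description": "Look up the current weather for a city.",
                    "parameters": {  # JSON Schema describing the arguments
                        "type": "object",
                        "properties": {"city": {"type": "string"}},
                        "required": ["city"],
                    },
                },
            }
        ],
        "tool_choice": "auto",  # let the model decide whether to call the tool
    }


body = build_tool_request("gpt-4o-mini", "What is the weather in Oslo?")
payload = json.dumps(body)  # string to send as the POST body
```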

System message hoisting: the first message with role: 'system' is hoisted to the canonical system field. Subsequent system messages are inlined into the surrounding user/assistant turn, which preserves intent across translators.
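
The hoisting rule can be sketched roughly as below. This is not the gateway's implementation: the function name `hoist_system` is hypothetical, and the exact inlining strategy is an assumption (a later system message is prefixed to the next non-system turn, or appended to the last turn if none follows). It also assumes every `content` is a plain string.

```python
def hoist_system(messages):
    """Split OpenAI-style messages into (system, turns).

    Assumed inlining rule: a later system message is folded into the next
    non-system turn as a text prefix, or into the last turn if it trails.
    """
    system = None
    turns = []
    pending = []  # later system messages waiting for a turn to attach to
    for msg in messages:
        if msg["role"] == "system":
            if system is None:
                system = msg["content"]  # first system message is hoisted
            else:
                pending.append(msg["content"])
            continue
        turn = dict(msg)  # copy so the caller's list is not mutated
        if pending:
            turn["content"] = "\n\n".join(pending + [turn["content"]])
            pending = []
        turns.append(turn)
    if pending:
        if turns:
            # Trailing system messages attach to the final turn.
            last = turns[-1]
            last["content"] = "\n\n".join([last["content"]] + pending)
        else:
            # No turns at all: keep the text in the system field.
            system = "\n\n".join([system] + pending)
    return system, turns
```

For example, a conversation with a second system message between two user turns yields one system string plus two user turns, the second of which carries the inlined text.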

Response (non-streaming)

{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "created": 1714200000,
  "model": "gpt-4o-mini",
  "choices": [
    {
      "index": 0,
      "message": { "role": "assistant", "content": "Hello!" },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 5,
    "total_tokens": 17
  }
}

Response (streaming)

data: {"id":"chatcmpl-...","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"role":"assistant"}}]}

data: {"id":"chatcmpl-...","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"Hello"}}]}

data: {"id":"chatcmpl-...","object":"chat.completion.chunk","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}

data: [DONE]
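
A client that reads the stream by hand could reassemble the message along these lines. This is a sketch under the assumption that each event arrives as a single `data:` line, as in the sample above; the helper name `collect_stream` is hypothetical.

```python
import json


def collect_stream(lines):
    """Accumulate assistant text and the finish_reason from SSE lines."""
    text = []
    finish_reason = None
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data lines
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            delta = choice.get("delta", {})
            text.append(delta.get("content") or "")
            finish_reason = choice.get("finish_reason") or finish_reason
    return "".join(text), finish_reason
```

The function takes any iterable of text lines, so it works on a list of strings in tests as well as on a line iterator from an HTTP client.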

Examples

curl

curl https://api.prxy.monster/v1/chat/completions \
  -H "Authorization: Bearer prxy_live_xxx" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [{ "role": "user", "content": "hi" }]
  }'

OpenAI SDK (Node)

import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'https://api.prxy.monster/v1',
  apiKey: process.env.PRXY_KEY,
});

const res = await client.chat.completions.create({
  model: 'gpt-4o-mini',
  messages: [{ role: 'user', content: 'hi' }],
});

OpenAI SDK (Python)

import os

from openai import OpenAI

# Point the SDK at the gateway and authenticate with your prxy key.
client = OpenAI(
    base_url="https://api.prxy.monster/v1",
    api_key=os.environ["PRXY_KEY"],
)

res = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "hi"}],
)

Cross-provider routing

The model field is the routing signal. The gateway maps supported model names or prefixes to configured providers:

{ "model": "claude-sonnet-4-6", "messages": [...] }   // → Anthropic
{ "model": "gpt-4o", "messages": [...] }              // → OpenAI
{ "model": "gemini-2.0-pro", "messages": [...] }      // → Google, when configured

Translation is automatic. A request to /v1/chat/completions (OpenAI shape) with model: 'claude-sonnet-4-6' is converted to the canonical form and sent to Anthropic, and the response is converted back to OpenAI shape on the way out.
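
Conceptually, prefix-based routing of this kind can be pictured as a small lookup. The table below is purely illustrative and is not the gateway's actual mapping: `PROVIDER_PREFIXES`, the provider labels, and the `route` function are all hypothetical names chosen to mirror the three examples above.

```python
# Hypothetical prefix table, inferred only from the three examples above.
PROVIDER_PREFIXES = {
    "claude-": "anthropic",
    "gpt-": "openai",
    "gemini-": "google",
}


def route(model, configured):
    """Return the provider for a model name, or raise LookupError."""
    for prefix, provider in PROVIDER_PREFIXES.items():
        if model.startswith(prefix):
            if provider not in configured:
                raise LookupError(f"provider {provider!r} is not configured")
            return provider
    raise LookupError(f"no provider matches model {model!r}")
```

With only Anthropic and OpenAI configured, `route("claude-sonnet-4-6", {"anthropic", "openai"})` returns `"anthropic"`, while a `gemini-` model raises `LookupError`, matching the "when configured" caveat.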

Errors

Errors carry the same codes as POST /v1/messages, but the body is wrapped in OpenAI's error format:

{
  "error": {
    "type": "cost_limit_per_day",
    "message": "Daily cost cap exceeded",
    "code": "cost_limit_per_day"
  }
}
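
A client that handles raw responses might surface this envelope as an exception. The sketch below assumes only the three fields shown above; `GatewayError` and `parse_response` are hypothetical names, not part of any SDK.

```python
import json


class GatewayError(Exception):
    """Raised when the response body contains an OpenAI-style error envelope."""

    def __init__(self, code, message):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message


def parse_response(body: str) -> dict:
    """Return the decoded body, or raise GatewayError if it is an error."""
    payload = json.loads(body)
    err = payload.get("error")
    if err is None:
        return payload
    # Prefer "code"; fall back to "type" if "code" is absent.
    raise GatewayError(err.get("code") or err.get("type"), err.get("message", ""))
```

Callers can then branch on `exc.code`, for example to stop retrying when the code is `cost_limit_per_day`.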