POST /v1/messages

Anthropic Messages-compatible API. The route accepts the common /v1/messages request shape and returns Anthropic-style responses and SSE. Provider-specific account APIs are not proxied.

Endpoint

POST https://api.prxy.monster/v1/messages

Local mode: http://localhost:3099/v1/messages.

Headers

| Header | Required | Notes |
| --- | --- | --- |
| Authorization: Bearer <key> | yes | Your prxy_live_xxx key. Local mode: any string. |
| Content-Type: application/json | yes | |
| x-prxy-pipe | no | Overrides the pipeline for this request only. Comma-separated list of module names. |
| anthropic-version | no | Forwarded to provider. |
| anthropic-beta | no | Forwarded to provider. |
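
As a rough illustration of the header rules above, the following sketch assembles the headers for one request. `build_headers` is a hypothetical helper name, not part of any official client, and the module names in the usage note are taken from the override example elsewhere on this page.

```python
from typing import Optional, Sequence


def build_headers(
    api_key: str,
    pipe: Optional[Sequence[str]] = None,
    anthropic_version: Optional[str] = None,
    anthropic_beta: Optional[str] = None,
) -> dict:
    """Assemble headers for POST /v1/messages (hypothetical helper)."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    if pipe:
        # x-prxy-pipe takes a comma-separated list of module names.
        headers["x-prxy-pipe"] = ",".join(pipe)
    # The two anthropic-* headers are optional and are only set when given.
    if anthropic_version is not None:
        headers["anthropic-version"] = anthropic_version
    if anthropic_beta is not None:
        headers["anthropic-beta"] = anthropic_beta
    return headers
```

For example, `build_headers("prxy_live_xxx", pipe=["exact-cache", "patterns"])` produces an `x-prxy-pipe` value of `exact-cache,patterns`.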

Request body

The body follows the full Anthropic Messages schema. The gateway validates it with Zod and forwards it to the provider.

{
  model: string;                  // e.g. "claude-sonnet-4-6"
  max_tokens: number;             // positive integer
  messages: Array<{
    role: 'user' | 'assistant';
    content: string | ContentBlock[];
  }>;
  system?: string | SystemBlock[];
  temperature?: number;           // 0–2
  top_p?: number;                 // 0–1
  top_k?: number;                 // positive integer
  stop_sequences?: string[];
  stream?: boolean;
  tools?: Tool[];
  metadata?: Record<string, unknown>;
}

ContentBlock is one of: text, image, tool_use, tool_result. SystemBlock supports cache_control for prompt caching.
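
The gateway performs the authoritative validation, but a client can catch obvious mistakes before sending. The sketch below mirrors only the constraints listed in the schema above. `validate_request` is a hypothetical helper, and treating the numeric ranges as inclusive at both ends is an assumption.

```python
from numbers import Real


def _is_positive_int(value) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool) and value > 0


def _in_range(value, low, high) -> bool:
    # Assumption: both bounds are inclusive.
    return (
        isinstance(value, Real)
        and not isinstance(value, bool)
        and low <= value <= high
    )


def validate_request(body: dict) -> list:
    """Return a list of problems; an empty list means the basic checks passed."""
    errors = []
    if not isinstance(body.get("model"), str):
        errors.append("model must be a string")
    if not _is_positive_int(body.get("max_tokens")):
        errors.append("max_tokens must be a positive integer")
    messages = body.get("messages")
    if not isinstance(messages, list):
        errors.append("messages must be an array")
    else:
        for i, msg in enumerate(messages):
            if not isinstance(msg, dict) or msg.get("role") not in ("user", "assistant"):
                errors.append(f"messages[{i}].role must be 'user' or 'assistant'")
            elif not isinstance(msg.get("content"), (str, list)):
                errors.append(f"messages[{i}].content must be a string or an array")
    if "temperature" in body and not _in_range(body["temperature"], 0, 2):
        errors.append("temperature must be between 0 and 2")
    if "top_p" in body and not _in_range(body["top_p"], 0, 1):
        errors.append("top_p must be between 0 and 1")
    if "top_k" in body and not _is_positive_int(body["top_k"]):
        errors.append("top_k must be a positive integer")
    return errors
```

A body that passes these checks can still be rejected with a 400 by the gateway, since this sketch does not inspect content blocks, tools, or metadata.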

Response (non-streaming)

{
  "id": "msg_01abc...",
  "type": "message",
  "role": "assistant",
  "model": "claude-sonnet-4-6",
  "content": [
    { "type": "text", "text": "Hello!" }
  ],
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 12,
    "output_tokens": 5,
    "cache_read_input_tokens": 0,
    "cache_creation_input_tokens": 0
  }
}
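
A minimal sketch of reading this response shape: it joins the text blocks and totals the two basic token counters. `extract_text` and `billed_tokens` are hypothetical helper names, and the sum deliberately ignores the cache counters because how they relate to cost is not specified here.

```python
def extract_text(message: dict) -> str:
    """Concatenate the text of every block whose type is "text"."""
    return "".join(
        block.get("text", "")
        for block in message.get("content", [])
        if block.get("type") == "text"
    )


def billed_tokens(message: dict) -> int:
    """Return input_tokens + output_tokens from the usage object."""
    usage = message.get("usage", {})
    return usage.get("input_tokens", 0) + usage.get("output_tokens", 0)
```

Other block types such as `tool_use` are skipped by `extract_text`, so a caller that uses tools would need to handle those blocks separately.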

Response (streaming, stream: true)

Server-Sent Events with Anthropic’s event-typed envelope:

event: message_start
data: { "type": "message_start", "message": { ... } }

event: content_block_start
data: { "type": "content_block_start", "index": 0, "content_block": { ... } }

event: content_block_delta
data: { "type": "content_block_delta", "index": 0, "delta": { "type": "text_delta", "text": "Hello" } }

event: content_block_stop
data: { "type": "content_block_stop", "index": 0 }

event: message_delta
data: { "type": "message_delta", "delta": { "stop_reason": "end_turn" } }

event: message_stop
data: { "type": "message_stop" }

Cache hits on streaming requests are replayed as a synthetic stream in this exact format. A client cannot distinguish a cache replay from a live stream: the events and field shapes are identical.

Examples

curl https://api.prxy.monster/v1/messages \
  -H "Authorization: Bearer prxy_live_xxx" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-6",
    "max_tokens": 1024,
    "messages": [
      { "role": "user", "content": "Write a haiku about distributed systems." }
    ]
  }'
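
For readers who prefer Python, here is a rough standard-library equivalent of the curl call above. `build_request` and `send` are hypothetical helper names, and the 60-second timeout is an arbitrary choice rather than a documented limit.

```python
import json
import urllib.request

API_URL = "https://api.prxy.monster/v1/messages"


def build_request(
    api_key: str,
    prompt: str,
    model: str = "claude-sonnet-4-6",
    max_tokens: int = 1024,
) -> urllib.request.Request:
    """Build (but do not send) a POST /v1/messages request."""
    body = {
        "model": model,
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 60.0) -> dict:
    """Send the request and decode the JSON response body."""
    # urlopen raises urllib.error.HTTPError for 4xx and 5xx responses.
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)
```

Calling `send(build_request("prxy_live_xxx", "Write a haiku about distributed systems."))` would issue the same request as the curl example, given a valid key.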

Per-request pipeline override

curl https://api.prxy.monster/v1/messages \
  -H "Authorization: Bearer prxy_live_xxx" \
  -H "x-prxy-pipe: exact-cache,patterns" \
  -H "Content-Type: application/json" \
  -d '{ ... }'

The override applies to this single call only. Useful for A/B testing.

Error codes

| Status | error.type | When |
| --- | --- | --- |
| 400 | invalid_request | Body fails schema validation. |
| 401 | authentication_error | Missing, malformed, or revoked key. |
| 402 | payment_required | Out of credits (cloud paid tier). |
| 403 | permission_error | Action not allowed in current mode (e.g. local-mode billing). |
| 404 | not_found | Endpoint or resource not found. |
| 429 | cost_limit_per_request / cost_limit_per_day / cost_limit_per_month | cost-guard enforced a cap. |
| 429 | rate_limit_error | Per-key request rate limit exceeded. |
| 502 | upstream_error | Provider returned 5xx after all retries. |
| 503 | service_unavailable | Storage backend or critical dependency down. |

Error body shape:

{
  "type": "error",
  "error": {
    "type": "cost_limit_per_day",
    "message": "Daily cost cap exceeded",
    "limit": 5.00,
    "spent": 4.87,
    "estimated": 0.21,
    "resets_at": "2026-04-28T00:00:00.000Z"
  }
}
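
One way a client might act on these errors is sketched below. The retry policy is an assumption, not documented gateway guidance: rate limits and 502/503 are treated as retryable, daily and monthly cost caps as "wait until reset", and everything else as a hard failure. `classify_error` and `seconds_until_reset` are hypothetical names, and `resets_at` is only read when the error body includes it.

```python
from datetime import datetime, timezone
from typing import Optional

COST_LIMIT_TYPES = {
    "cost_limit_per_request",
    "cost_limit_per_day",
    "cost_limit_per_month",
}


def classify_error(status: int, body: dict) -> str:
    """Return "retry", "wait_for_reset", or "fail" (assumed policy)."""
    err_type = body.get("error", {}).get("type", "")
    if status == 429 and err_type in COST_LIMIT_TYPES:
        # A per-request cap will not clear by waiting, so treat it as fatal.
        if err_type == "cost_limit_per_request":
            return "fail"
        return "wait_for_reset"
    if status in (429, 502, 503):
        return "retry"
    return "fail"


def seconds_until_reset(body: dict, now: Optional[datetime] = None) -> Optional[float]:
    """Seconds until error.resets_at, or None when the field is absent."""
    resets_at = body.get("error", {}).get("resets_at")
    if resets_at is None:
        return None
    # datetime.fromisoformat rejects a trailing "Z" before Python 3.11.
    reset = datetime.fromisoformat(resets_at.replace("Z", "+00:00"))
    current = now if now is not None else datetime.now(timezone.utc)
    return max(0.0, (reset - current).total_seconds())
```

With the sample body above and a current time one hour before `resets_at`, `seconds_until_reset` returns `3600.0`.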