Using prxy.monster with the OpenAI SDK
prxy.monster exposes an OpenAI Chat Completions-compatible API at https://api.prxy.monster/v1. Clients using chat.completions.create can point at prxy.monster with a base URL change.
Install
npm install openai
# or
pip install openai
Configure
The official OpenAI client respects OPENAI_BASE_URL for Chat Completions calls in Node and Python.
export OPENAI_BASE_URL=https://api.prxy.monster/v1
export OPENAI_API_KEY=prxy_live_xxxxxxxxxxxxxxxxxxxxxxxx
Code change
None. Both openai (Node) and openai (Python) read OPENAI_BASE_URL and OPENAI_API_KEY from the environment automatically.
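To make the lookup order concrete, here is a minimal sketch of how a client might choose its base URL: an explicit option wins, then OPENAI_BASE_URL, then OpenAI's default endpoint. resolveBaseURL is a hypothetical helper written for illustration; it is not the SDK's actual code, and the precedence shown is an assumption about its behavior.

```javascript
// Hypothetical helper (not part of the openai package). It illustrates the
// assumed precedence: explicit option > OPENAI_BASE_URL > default endpoint.
const DEFAULT_BASE_URL = 'https://api.openai.com/v1';

function resolveBaseURL(options = {}, env = process.env) {
  // An explicitly passed baseURL always wins.
  if (options.baseURL) return options.baseURL;
  // Otherwise use the environment variable when it is set and non-empty.
  const fromEnv = (env.OPENAI_BASE_URL ?? '').trim();
  return fromEnv || DEFAULT_BASE_URL;
}
```

With OPENAI_BASE_URL exported as in the Configure step, resolveBaseURL() would return the prxy.monster URL, which is why the calling code can stay unchanged.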
Node
// Before AND after — no diff
import OpenAI from 'openai';
const client = new OpenAI();
const r = await client.chat.completions.create({
  model: 'gpt-4o-mini',
  messages: [{ role: 'user', content: 'hi' }],
});
If you prefer explicit configuration:
const client = new OpenAI({
  baseURL: 'https://api.prxy.monster/v1',
  apiKey: process.env.OPENAI_API_KEY,
});
Verify
curl https://api.prxy.monster/health
Or, with the CLI:
prxy doctor
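The same check can be run from Node, for example at application startup. The sketch below assumes only that the health endpoint answers with a 2xx status when the proxy is up; the response body is not documented here, so it is ignored. checkHealth and its parameters are hypothetical names, and the global fetch requires Node 18 or later.

```javascript
// Hypothetical helper: resolves to true when GET <origin>/health returns a 2xx status.
// fetchImpl is injectable so the function can be exercised without network access.
async function checkHealth(origin = 'https://api.prxy.monster', fetchImpl = fetch) {
  try {
    const res = await fetchImpl(`${origin}/health`);
    return res.ok;
  } catch {
    // Network failures (DNS, TLS, timeouts) are treated as unhealthy.
    return false;
  }
}

// Usage:
// checkHealth().then((ok) => console.log(ok ? 'proxy reachable' : 'proxy unreachable'));
```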
What you get
- Infinite context: chat.completions.create calls compress old turns instead of dropping them.
- Semantic cache: similar prompts hit cache, return in 15-30 ms.
- Pattern memory: successful answers get learned and re-injected.
- Cost guards: hard per-request budget caps before the OpenAI bill arrives.
Recommended pipeline
PRXY_PIPE=mcp-optimizer,semantic-cache,patterns,ipc
For batch / cost-sensitive workloads, add exact-cache first:
PRXY_PIPE=exact-cache,semantic-cache,cost-guard,patterns
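If the pipeline is assembled in code (say, per environment), a small helper can keep the ordering consistent. This is a sketch under the assumption that PRXY_PIPE is a plain comma-separated, order-sensitive list as shown above; buildPipe is a hypothetical helper, and it does not check that the module names are ones the proxy recognizes.

```javascript
// Hypothetical helper: joins module names into a PRXY_PIPE value.
// Assumes the comma-separated format shown above; names are not validated.
function buildPipe(modules, { exactCacheFirst = false } = {}) {
  // Trim names, drop empty entries, and de-duplicate while keeping order.
  const cleaned = [...new Set(modules.map((m) => m.trim()).filter(Boolean))];
  if (!exactCacheFirst) return cleaned.join(',');
  // Move exact-cache to the front (adding it if it is missing).
  const rest = cleaned.filter((m) => m !== 'exact-cache');
  return ['exact-cache', ...rest].join(',');
}

console.log(buildPipe(['semantic-cache', 'cost-guard', 'patterns'], { exactCacheFirst: true }));
// prints "exact-cache,semantic-cache,cost-guard,patterns"
```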
Streaming
const stream = await client.chat.completions.create({
  model: 'gpt-4o-mini',
  messages: [{ role: 'user', content: 'tell a story' }],
  stream: true,
});
for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? '');
}
Streaming works identically through the proxy. Cache hits are replayed as synthetic SSE (server-sent events) chunks.
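When the full text is needed rather than incremental output, the chunks can be accumulated. The sketch below assumes only the chunk shape used in the loop above (choices[0].delta.content); collectStream is a hypothetical helper, and it accepts any async iterable, so it makes no distinction between live and replayed streams.

```javascript
// Hypothetical helper: concatenates the delta content of every chunk in a stream.
async function collectStream(stream) {
  let text = '';
  for await (const chunk of stream) {
    // Chunks without content (role-only or final chunks) contribute nothing.
    text += chunk.choices[0]?.delta?.content ?? '';
  }
  return text;
}

// Usage with the stream created above:
// const story = await collectStream(stream);
```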
Common issues
- Function calling / tools: pass-through. mcp-optimizer prunes irrelevant tool defs automatically if you ship many.
- JSON mode (response_format: { type: 'json_object' }): pass-through.
- Responses API (/v1/responses): planned, not proxied today.
- Assistants API (/v1/assistants, threads, runs): not proxied. Use Chat Completions instead.
- Realtime API: not proxied.
- Embeddings (/v1/embeddings): not a public proxy route today; embeddings are used internally by cache modules.
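Since tools are passed through, a tool-calling request should look the same as it would against OpenAI directly. The sketch below builds the request parameters in the standard Chat Completions tools format; buildToolRequest, the get_weather tool, and its schema are made up for illustration.

```javascript
// Hypothetical helper: builds Chat Completions parameters with one tool definition.
// get_weather and its JSON schema are illustrative, not a real service.
function buildToolRequest(userMessage) {
  return {
    model: 'gpt-4o-mini',
    messages: [{ role: 'user', content: userMessage }],
    tools: [
      {
        type: 'function',
        function: {
          name: 'get_weather',
          description: 'Look up the current weather for a city.',
          parameters: {
            type: 'object',
            properties: { city: { type: 'string' } },
            required: ['city'],
          },
        },
      },
    ],
  };
}

// Usage with the client from the Node example:
// const r = await client.chat.completions.create(buildToolRequest('Weather in Oslo?'));
// const calls = r.choices[0].message.tool_calls ?? [];
```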
Full example
Plain Node script: github.com/Ekkos-Technologies-Inc/prxy-monster-examples/tree/main/examples/openai-quickstart
prxy.monster speaks the OpenAI Chat Completions wire format. Newer OpenAI features (Responses API, Realtime API) are not yet proxied — track /changelog for support.