
Using prxy.monster with LlamaIndex (Python)

LlamaIndex Python’s OpenAI LLM class accepts an api_base constructor argument. Set it once on Settings.llm, and every query engine, retriever, and agent that uses the global default inherits the routing.

Install

pip install llama-index llama-index-llms-openai llama-index-embeddings-openai

Configure

from llama_index.llms.openai import OpenAI
from llama_index.core import Settings
 
Settings.llm = OpenAI(
    model="gpt-4o",
    api_base="https://api.prxy.monster/v1",
    api_key="prxy_live_xxxxxxxxxxxxxxxxxxxxxxxx",
)

Or per-LLM (no global default):

llm = OpenAI(
    model="gpt-4o",
    api_base="https://api.prxy.monster/v1",
    api_key="prxy_live_xxxxxxxxxxxxxxxxxxxxxxxx",
)
 
query_engine = index.as_query_engine(llm=llm)

Or via env var (no constructor change):

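# Shell: OPENAI_API_BASE is the name resolved by llama-index-llms-openai;
# OPENAI_BASE_URL is the name read by the openai SDK itself. Setting both
# covers either lookup (check which one your installed version reads).
export OPENAI_API_BASE=https://api.prxy.monster/v1
export OPENAI_BASE_URL=https://api.prxy.monster/v1
export OPENAI_API_KEY=prxy_live_xxxxxxxxxxxxxxxxxxxxxxxx

# Python:
from llama_index.core import Settings
from llama_index.llms.openai import OpenAI

Settings.llm = OpenAI(model="gpt-4o")  # picks up env vars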
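
If you would rather not depend on which environment variable a given LlamaIndex version reads, one option is to resolve the values yourself and pass them to the constructor explicitly. The helper below is a minimal sketch, not part of LlamaIndex or prxy.monster: the function name proxy_settings_from_env is hypothetical, and it assumes the proxy URL may be stored under either OPENAI_API_BASE or OPENAI_BASE_URL.

```python
import os

# Assumption: either variable may carry the proxy URL; the first one set wins.
BASE_URL_VARS = ("OPENAI_API_BASE", "OPENAI_BASE_URL")


def proxy_settings_from_env(environ=None):
    """Return (api_base, api_key) read from the environment.

    Raises RuntimeError if the base URL or the key is missing, so a
    misconfigured shell fails loudly instead of silently calling the
    default OpenAI endpoint.
    """
    env = os.environ if environ is None else environ
    api_base = next((env[name] for name in BASE_URL_VARS if env.get(name)), None)
    api_key = env.get("OPENAI_API_KEY")
    if not api_base:
        raise RuntimeError("set one of: " + ", ".join(BASE_URL_VARS))
    if not api_key:
        raise RuntimeError("OPENAI_API_KEY is not set")
    # Strip a trailing slash so the value matches the form used in this guide.
    return api_base.rstrip("/"), api_key


# Usage (requires llama-index-llms-openai):
#   api_base, api_key = proxy_settings_from_env()
#   Settings.llm = OpenAI(model="gpt-4o", api_base=api_base, api_key=api_key)
```

Passing the resolved values explicitly keeps the environment as the single source of configuration while removing any doubt about which variable name was honored.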

Verify

curl https://api.prxy.monster/health

Then run a query; a successful response confirms that requests are routed through prxy.monster.
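
The same health check can be run from Python, for example at application startup. This is a minimal sketch using only the standard library. It assumes that an HTTP 200 from the /health endpoint means the proxy is reachable; the response body is not inspected because its format is not documented here. The function name check_health and the injectable opener parameter are illustrative choices.

```python
import urllib.error
import urllib.request

HEALTH_URL = "https://api.prxy.monster/health"  # same URL as the curl command


def check_health(url=HEALTH_URL, opener=urllib.request.urlopen, timeout=5.0):
    """Return True if the health endpoint answers with HTTP 200.

    Assumption: status 200 means "healthy". `opener` is injectable so the
    function can be exercised without network access.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Covers DNS failures, refused connections, timeouts, and HTTP
        # error statuses (urlopen raises HTTPError for 4xx/5xx).
        return False


# Usage:
#   if not check_health():
#       raise SystemExit("prxy.monster is not reachable")
```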

What you get

Embeddings

The same api_base arg works for OpenAIEmbedding:

from llama_index.embeddings.openai import OpenAIEmbedding
 
Settings.embed_model = OpenAIEmbedding(
    model="text-embedding-3-small",
    api_base="https://api.prxy.monster/v1",
    api_key="prxy_live_xxxxxxxxxxxxxxxxxxxxxxxx",
)

Both your LLM calls AND your embedding calls route through prxy.monster.
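
Because the LLM and the embedding model take the same api_base and api_key, it can help to build those arguments in one place so the two never drift apart. The following is a sketch under that assumption: proxy_kwargs and configure_llama_index are hypothetical helper names, and the LlamaIndex imports are done lazily so the module can be imported even where LlamaIndex is not installed.

```python
PRXY_OPENAI_BASE = "https://api.prxy.monster/v1"  # OpenAI-compatible base URL from this guide


def proxy_kwargs(api_key, api_base=PRXY_OPENAI_BASE):
    """Connection arguments shared by the LLM and the embedding model."""
    if not api_key:
        raise ValueError("api_key must be a non-empty string")
    return {"api_base": api_base, "api_key": api_key}


def configure_llama_index(api_key, llm_model="gpt-4o",
                          embed_model="text-embedding-3-small"):
    """Point both Settings.llm and Settings.embed_model at the proxy."""
    # Lazy imports: only needed when this function is actually called.
    from llama_index.core import Settings
    from llama_index.embeddings.openai import OpenAIEmbedding
    from llama_index.llms.openai import OpenAI

    shared = proxy_kwargs(api_key)
    Settings.llm = OpenAI(model=llm_model, **shared)
    Settings.embed_model = OpenAIEmbedding(model=embed_model, **shared)
    return Settings


# Usage:
#   configure_llama_index("prxy_live_xxxxxxxxxxxxxxxxxxxxxxxx")
```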

Anthropic provider

If you’re using llama-index-llms-anthropic, the same pattern applies: pass the prxy base URL. Note that the argument here is base_url rather than api_base, and the URL has no /v1 suffix:

from llama_index.llms.anthropic import Anthropic
 
Settings.llm = Anthropic(
    model="claude-sonnet-4-6",
    base_url="https://api.prxy.monster",
    api_key="prxy_live_xxxxxxxxxxxxxxxxxxxxxxxx",
)

For RAG:

PRXY_PIPE=exact-cache,semantic-cache,patterns,cost-guard

For agents with tool use:

PRXY_PIPE=mcp-optimizer,semantic-cache,patterns,ipc

Common issues

Full example

Adapt examples/openai-quickstart: replace the OpenAI client with Settings.llm = OpenAI(...) as shown above.

Verify the exact constructor argument name against the LlamaIndex Python docs for your installed version. api_base has been stable across recent versions.