router

Category: routing · Cloud + Local · Status: v1.0 — production

Picks the right model for each request. Falls back through a configured chain. The cloud edition records per-(query bucket, model) outcomes and learns over time which model produces high-quality results most cheaply.

What it does

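Three strategies, swappable at any time:

  • cheapest-first: picks the candidate with the lowest pricing.input + pricing.output.
  • fallback: picks the first candidate in the configured order.
  • q-learning: picks the candidate with the highest historical success rate for the request's (bucket, model) pair, falling back to cheapest-first on cold start.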

The originally-requested model is preserved in metadata['router.requested_model'] so downstream modules and observability dashboards can see what the client asked for vs. what shipped.
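As a rough illustration of how a downstream module might use this key, the sketch below compares the requested model with the one that shipped. The RoutedRequest shape and the describeRouting helper are hypothetical names introduced here; only the metadata key 'router.requested_model' comes from the text above.

```typescript
// Hypothetical request shape; the real type in the codebase may differ.
interface RoutedRequest {
  model: string; // the model the router selected (what shipped)
  metadata: Record<string, unknown>;
}

// Hypothetical helper: reports what the client asked for vs. what shipped.
function describeRouting(req: RoutedRequest): {
  requested: string;
  shipped: string;
  rerouted: boolean;
} {
  const raw = req.metadata["router.requested_model"];
  // If the key is missing (router not enabled), treat the shipped model as the requested one.
  const requested = typeof raw === "string" ? raw : req.model;
  return { requested, shipped: req.model, rerouted: requested !== req.model };
}
```

A dashboard could, for example, count requests where rerouted is true to see how often the router overrides the client's choice.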

When to use it

Configuration

router:
  strategy: 'cheapest-first'  # 'q-learning' | 'fallback' | 'cheapest-first'
  fallback_chain:
    - claude-sonnet-4-6
    - gpt-4o
    - gemini-2.0-pro
  prefer:                     # try these first if confidence high
    - claude-haiku-4-5
  budget_per_request: 0.10    # never pick a model whose estimate exceeds this

Metrics emitted

How it works

  1. Pre hook:

    • Record the requested model in metadata.
    • Build the candidate list: prefer first, then fallback_chain, with duplicates removed. Always include the requested model.
    • Filter out any model whose estimated cost for this request exceeds budget_per_request.
    • Apply the strategy:
      • cheapest-first → sort by pricing.input + pricing.output, take the cheapest.
      • fallback → take candidates[0].
      • q-learning → look up historical success rate per (bucket, model) and take the highest-rated. Cold start falls back to cheapest-first.
    • Mutate request.model to the selection.
  2. Post hook (q-learning only):

    • Update the (bucket, model) counters: always add 1 to n (attempts), and add 1 to s (successes) only if the response was successful.
    • Stat rows expire after a 30-day TTL.
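
The two hooks above can be sketched roughly as follows. This is a minimal in-memory illustration, not the code in src/modules/router.ts: the type names, the estimate callback, the "bucket:model" key format, and the stats Map are all assumptions. It also assumes the requested model is appended last in the candidate list and is returned unchanged when nothing fits the budget, neither of which the description specifies. The 30-day TTL is not modeled.

```typescript
// Hypothetical types; the real shapes may differ.
interface Pricing { input: number; output: number }
interface RouterConfig {
  strategy: "q-learning" | "fallback" | "cheapest-first";
  fallback_chain: string[];
  prefer: string[];
  budget_per_request: number;
}
interface Stat { n: number; s: number } // n = attempts, s = successes
type Estimator = (model: string) => number; // estimated cost of this request on a model

// prefer first, then fallback_chain, then the requested model (assumed position).
// A Set drops duplicates while keeping first-seen order.
function buildCandidates(requested: string, cfg: RouterConfig): string[] {
  return [...new Set([...cfg.prefer, ...cfg.fallback_chain, requested])];
}

// Pre hook: returns the model that request.model would be mutated to.
function selectModel(
  requested: string,
  cfg: RouterConfig,
  pricing: Record<string, Pricing>,
  estimate: Estimator,
  stats: Map<string, Stat>,
  bucket: string,
): string {
  const affordable = buildCandidates(requested, cfg).filter(
    (m) => estimate(m) <= cfg.budget_per_request,
  );
  const first = affordable[0];
  if (first === undefined) return requested; // assumption: nothing fits, keep the request as-is

  const price = (m: string): number => {
    const p = pricing[m];
    return p ? p.input + p.output : Infinity; // unknown pricing is never the cheapest
  };
  const cheapest = (): string =>
    affordable.reduce((a, b) => (price(b) < price(a) ? b : a));

  if (cfg.strategy === "fallback") return first;
  if (cfg.strategy === "cheapest-first") return cheapest();

  // q-learning: highest observed success rate s / n for this bucket.
  let best: string | undefined;
  let bestRate = -1;
  for (const m of affordable) {
    const st = stats.get(`${bucket}:${m}`);
    if (!st || st.n === 0) continue; // no history for this pair yet
    const rate = st.s / st.n;
    if (rate > bestRate) {
      best = m;
      bestRate = rate;
    }
  }
  return best ?? cheapest(); // cold start falls back to cheapest-first
}

// Post hook (q-learning only): add 1 to n always, add 1 to s on success.
function recordOutcome(
  stats: Map<string, Stat>,
  bucket: string,
  model: string,
  success: boolean,
): void {
  const key = `${bucket}:${model}`;
  const st = stats.get(key) ?? { n: 0, s: 0 };
  st.n += 1;
  if (success) st.s += 1;
  stats.set(key, st);
}
```

With the configuration shown earlier, buildCandidates would yield claude-haiku-4-5 first, then the three fallback_chain entries, then the requested model if it is not already listed. Picking purely by success rate, as described, means a model with one lucky success outranks one with a long but imperfect record; a production version would likely want a minimum sample size, but that goes beyond what the description states.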

Migration note

If you’re coming from OpenRouter and using their fallback feature, this is the module that replicates it.

Source

src/modules/router.ts