Research agent
For agents that read 100+ sources, plan, draft, and iterate over a single output. Quality matters more than per-token cost. Sessions can run for hours.
What this pipeline is good at
- Sessions that span thousands of turns without losing structure (ipc).
- Persistent learning across research projects (patterns).
- Optionally: pulling earlier-archived sources back when relevant (rehydrator).
The pipeline
Env var
PRXY_PIPE='ipc,patterns'
Optional production additions:
PRXY_PIPE='ipc,patterns,rehydrator,router'
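To make the format of the variable concrete, here is a minimal sketch of how a comma-separated pipeline string could be split into an ordered module list. The `parse_pipe` helper and the `KNOWN_MODULES` set are hypothetical illustrations, not part of the product; the module names are the ones mentioned on this page.

```python
import os

# Module names mentioned on this page; the set itself is an assumption for illustration.
KNOWN_MODULES = {"ipc", "patterns", "rehydrator", "router", "semantic-cache", "cost-guard"}


def parse_pipe(value: str) -> list[str]:
    """Split a comma-separated pipeline string into an ordered list of module names."""
    modules = [name.strip() for name in value.split(",") if name.strip()]
    unknown = [name for name in modules if name not in KNOWN_MODULES]
    if unknown:
        raise ValueError(f"unknown pipeline modules: {unknown}")
    return modules


if __name__ == "__main__":
    # Falls back to the base research pipeline when the variable is unset.
    print(parse_pipe(os.environ.get("PRXY_PIPE", "ipc,patterns")))
```

The list preserves the order written in the variable, which matters because modules run in sequence.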
Why this order
- ipc first: measures and compresses before any other module touches the prompt.
- patterns: injects priors after compression, on top of the (now smaller) prompt.
- rehydrator: pulls archived turns back when the user references them.
- router: picks the model based on the type of step (planning vs drafting vs verifying).
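A toy model can show why the order matters. The sketch below assumes a compressor that summarizes everything except the most recent turns and an injector that prepends a prior. Both `toy_ipc` and `toy_patterns` are hypothetical stand-ins, not the real modules.

```python
from typing import Callable

Turns = list[str]                 # one string per turn; a deliberately simplified prompt
Stage = Callable[[Turns], Turns]


def toy_ipc(turns: Turns, keep_last: int = 2) -> Turns:
    """Collapse everything except the last `keep_last` turns into one summary line."""
    if len(turns) <= keep_last:
        return list(turns)
    older, recent = turns[:-keep_last], turns[-keep_last:]
    return [f"[summary of {len(older)} turns]"] + recent


def toy_patterns(turns: Turns) -> Turns:
    """Prepend one learned prior to the prompt."""
    return ["[prior: cite primary sources]"] + list(turns)


def run(stages: list[Stage], turns: Turns) -> Turns:
    """Apply each stage in order, feeding the output of one into the next."""
    for stage in stages:
        turns = stage(turns)
    return turns


history = ["t1", "t2", "t3", "t4"]
recommended = run([toy_ipc, toy_patterns], history)     # prior survives intact
reversed_order = run([toy_patterns, toy_ipc], history)  # prior is folded into the summary
```

In this toy setup the recommended order keeps the injected prior verbatim at the front of the prompt, while the reversed order compresses it away along with the older turns.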
Why no caching?
Research is inherently exploratory: semantically similar prompts often deserve different answers (different angles, different framing). Caching would lock the agent into yesterday’s interpretation.
If you’re building a research chatbot rather than a research agent, add semantic-cache with a high threshold (0.95+) at the front of the pipeline.
Cost vs quality dial
| Use case | Adjust |
|---|---|
| Quality is paramount (research papers, legal analysis) | Use Opus exclusively. Add cost-guard with a generous monthly cap. Skip router. |
| Volume + quality (research at scale) | Wait for router. It’ll route easy steps to Sonnet and hard steps to Opus. |
| Prototype (just exploring) | Add cost-guard with perRequest: 0.30 to keep accidental Opus blowouts bounded. |
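As a rough illustration of a per-request cap, the sketch below estimates the worst-case cost of a request before it is sent and rejects it when the estimate exceeds the cap. It assumes `perRequest` is a currency amount per request. The `Rates` values are caller-supplied placeholders, not real model prices, and `guard` is a hypothetical helper rather than the cost-guard implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """Price per million tokens. Supplied by the caller; not real prices."""
    input_per_mtok: float
    output_per_mtok: float


class OverBudget(Exception):
    """Raised when a request's estimated cost exceeds the per-request cap."""


def estimate_cost(input_tokens: int, max_output_tokens: int, rates: Rates) -> float:
    """Worst-case cost: all input tokens plus the full output allowance."""
    return (input_tokens * rates.input_per_mtok
            + max_output_tokens * rates.output_per_mtok) / 1_000_000


def guard(input_tokens: int, max_output_tokens: int, rates: Rates,
          per_request_cap: float = 0.30) -> float:
    """Return the estimated cost, or raise OverBudget if it exceeds the cap."""
    cost = estimate_cost(input_tokens, max_output_tokens, rates)
    if cost > per_request_cap:
        raise OverBudget(f"estimated {cost:.2f} exceeds cap {per_request_cap:.2f}")
    return cost
```

Estimating against the maximum output allowance keeps the check conservative: a request that passes cannot exceed the cap even if the model uses its full output budget.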
Variants
Long-form drafting (essays, reports, codebases):
pipeline:
  - ipc:
      targetUtilization: 0.9   # squeeze more context
      preserveLastTurns: 15    # keep recent drafting context verbatim
  - patterns:
      maxInjected: 10
      minSuccessRate: 0.5
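The interaction between the two ipc settings can be sketched with simple arithmetic. This assumes `targetUtilization` is the fraction of the context window the prompt may occupy and `preserveLastTurns` is the number of trailing turns exempt from compression. Both readings are inferred from the comments above, and `compression_plan` is a hypothetical helper.

```python
def compression_plan(turn_tokens: list[int], context_window: int,
                     target_utilization: float = 0.9,
                     preserve_last_turns: int = 15) -> dict[str, int]:
    """Work out how many tokens must be freed and how many are eligible for compression."""
    budget = int(context_window * target_utilization)
    # Slicing with -0 would select every turn, so handle zero explicitly.
    protected = turn_tokens[-preserve_last_turns:] if preserve_last_turns > 0 else []
    older = turn_tokens[:len(turn_tokens) - len(protected)]
    excess = max(0, sum(turn_tokens) - budget)
    return {
        "budget": budget,
        "compressible_tokens": sum(older),
        # Only the older turns can shrink, so they bound what can be freed.
        "tokens_to_free": min(excess, sum(older)),
    }


plan = compression_plan([1000] * 20, context_window=20_000)
```

Under these assumptions, raising `preserveLastTurns` shrinks the pool of compressible tokens, so a very high value paired with a low `targetUtilization` may leave the target unreachable.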
Multi-source research with attribution:
pipeline:
  - ipc:
      targetUtilization: 0.8
      archiveToBlob: true   # critical: sources can be rehydrated
  - patterns: { maxInjected: 5 }
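The archive-then-rehydrate idea can be sketched as a keyed store that returns full text when a later message mentions the key. The `Archive` class, its in-memory dictionary, and the substring matching are all hypothetical simplifications; how the real modules store blobs and detect references is not specified here.

```python
class Archive:
    """Toy in-memory stand-in for a blob archive keyed by source identifier."""

    def __init__(self) -> None:
        self._blobs: dict[str, str] = {}

    def put(self, key: str, text: str) -> None:
        """Store the full text of a source that was compressed out of the prompt."""
        self._blobs[key] = text

    def rehydrate(self, user_message: str) -> list[str]:
        """Return archived texts whose key is mentioned in the message."""
        return [text for key, text in self._blobs.items() if key in user_message]


archive = Archive()
archive.put("source-12", "Full text of source 12")
archive.put("source-40", "Full text of source 40")
restored = archive.rehydrate("Quote the passage from source-12 again.")
```

Without the archive step there is nothing to restore, which is why attribution-heavy work depends on `archiveToBlob` being enabled.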