structured-crusher

Category: context · Cloud + Local · Status: v1 — production

Large tool_result payloads (10k-row API responses, log dumps, search results) can dominate the context window. structured-crusher head/tail-samples arrays, truncates long strings, and drops empty fields. By default it touches only tool_result blocks.

What it does

Inspired by Headroom’s SmartCrusher and reimplemented in TypeScript. It needs no model download and no network access, so it is safe for air-gapped local mode.

Before: 12,400 tokens of JSON in a tool_result
After:    ~800 tokens with __prxy_crushed marker + sampled data

Every crushed payload carries a visible __prxy_crushed marker. Nothing is silently dropped.
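
The three transformations can be illustrated with a minimal sketch. This is not the module's actual source: the function names `crush` and `crushWithMarker`, the even head/tail split, and the shape of the wrapper object are assumptions made for illustration. Only the option names and the `__prxy_crushed` marker name come from this page.

```typescript
// Illustrative sketch only. Option names mirror the config keys on this page;
// the function names and the marker's wrapper shape are hypothetical.
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

interface CrushOptions {
  maxArrayItems: number;
  maxStringChars: number;
  dropEmpty: boolean;
}

// Treats null, "", [] and {} as empty (an assumption about what "empty" means).
const isEmpty = (v: Json): boolean =>
  v === null ||
  v === "" ||
  (Array.isArray(v) && v.length === 0) ||
  (typeof v === "object" && v !== null && !Array.isArray(v) && Object.keys(v).length === 0);

function crush(value: Json, opts: CrushOptions): Json {
  if (typeof value === "string") {
    // Truncate long strings to maxStringChars.
    return value.length > opts.maxStringChars ? value.slice(0, opts.maxStringChars) : value;
  }
  if (Array.isArray(value)) {
    if (value.length <= opts.maxArrayItems) return value.map((v) => crush(v, opts));
    // Head/tail sampling: keep the first half and the last half of the budget.
    const head = Math.ceil(opts.maxArrayItems / 2);
    const tail = opts.maxArrayItems - head;
    const kept = [...value.slice(0, head), ...(tail > 0 ? value.slice(-tail) : [])];
    return kept.map((v) => crush(v, opts));
  }
  if (value !== null && typeof value === "object") {
    const out: { [key: string]: Json } = {};
    for (const [k, v] of Object.entries(value)) {
      const crushed = crush(v, opts);
      if (opts.dropEmpty && isEmpty(crushed)) continue; // drop empty fields
      out[k] = crushed;
    }
    return out;
  }
  return value; // numbers, booleans, null pass through unchanged
}

// Hypothetical wrapper: the real marker layout may differ.
function crushWithMarker(value: Json, opts: CrushOptions): Json {
  return { __prxy_crushed: true, data: crush(value, opts) };
}
```

With `maxArrayItems: 8`, a 100-element array keeps its first four and last four items under this sketch.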

When to use it

✅ Agents that read large JSON APIs (databases, search, analytics)
✅ MCP tools returning bulky structured data
✅ Default pipeline (runs after cache lookup, before ipc)

❌ One-shot prompts with no tool results
❌ Workloads where every byte of tool output must reach the model verbatim (disable or skip)

Configuration

structured-crusher:
  minBlockTokens: 200
  maxArrayItems: 8
  maxStringChars: 500
  dropEmpty: true
  targets: ['tool_result']
  crushUserSystem: false
  marker: true
  ccr: false
  ccrPrefix: ccr

Set ccr: true to store originals in blob storage before crushing. Pair with ccr-inject + ccr-retrieve for reversible compression.
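
The store-before-crush idea can be sketched as follows. This is a hedged illustration, not the module's implementation: the in-memory `Map` stands in for blob storage, the helper names `storeOriginal` and `retrieveOriginal` are hypothetical, and the key format (`<prefix>:<sha256 of the original>`) is an assumption about how `ccrPrefix` might be used.

```typescript
import { createHash } from "node:crypto";

// Hypothetical stand-in for blob storage; a real deployment would use a durable store.
const blobStore = new Map<string, string>();

// Assumption: keys look like "<prefix>:<sha256 hex of original>".
// The real key format is not documented on this page.
function storeOriginal(original: string, prefix = "ccr"): string {
  const digest = createHash("sha256").update(original).digest("hex");
  const key = `${prefix}:${digest}`;
  blobStore.set(key, original); // save the untouched payload before crushing
  return key;
}

// Looks up the original payload so a later step can restore it.
function retrieveOriginal(key: string): string | undefined {
  return blobStore.get(key);
}
```

Content-addressed keys make the store idempotent in this sketch: saving the same payload twice yields the same key.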

Metrics emitted

structured-crusher.blocks (number)
structured-crusher.tokens.before (number)
structured-crusher.tokens.after (number)
structured-crusher.tokens.saved (number)
structured-crusher.ccr (boolean, when ccr enabled)
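
As a rough sketch of how these values relate, the snippet below aggregates per-block token counts into the metric names listed above. The `BlockStat` type and `buildMetrics` function are hypothetical, and treating `tokens.saved` as `tokens.before` minus `tokens.after` is an assumption based on the metric names.

```typescript
// Hypothetical per-block record: token counts before and after crushing.
interface BlockStat {
  before: number;
  after: number;
}

// Assumption: saved = before - after, summed over all crushed blocks.
function buildMetrics(stats: BlockStat[], ccrEnabled: boolean): Record<string, number | boolean> {
  const before = stats.reduce((sum, s) => sum + s.before, 0);
  const after = stats.reduce((sum, s) => sum + s.after, 0);
  const metrics: Record<string, number | boolean> = {
    "structured-crusher.blocks": stats.length,
    "structured-crusher.tokens.before": before,
    "structured-crusher.tokens.after": after,
    "structured-crusher.tokens.saved": before - after,
  };
  // The ccr metric is listed as present only when ccr is enabled.
  if (ccrEnabled) metrics["structured-crusher.ccr"] = true;
  return metrics;
}
```

For the example earlier on this page (12,400 tokens reduced to about 800), this sketch would report roughly 11,600 tokens saved for one block.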

Recommended pipeline placement

mcp-optimizer → exact-cache → semantic-cache → structured-crusher → code-crusher → ipc → patterns

Crushers run after the caches, because cache keys hash the original request, and before ipc, so that ipc only summarizes what is left after crushing.

CCR stack (optional)

- module: structured-crusher
  config: { ccr: true }
- code-crusher
- ccr-inject
- ccr-retrieve

See ccr-inject and ccr-retrieve.