Context Compression for MCP

The single biggest threat to agent reliability is context pollution. And the biggest source of context pollution is tool outputs.

What Is Context Pollution?

Every time an agent calls a tool, the result goes directly into the conversation history. The agent reads it, reasons about it, and builds its next response.

This works beautifully when tool outputs are small — a search result with 10 items, a weather API response, a simple database lookup. But real-world tools return real-world data:

A database query returns 500,000 rows
A log inspection returns 50,000 lines
A documentation crawler returns an entire documentation site
A monitoring tool returns 10 MB of JSON metrics

Once this data enters context, it stays there. The agent's reasoning window fills up, and the quality of subsequent responses drops sharply. You also pay for every token again on every subsequent turn, because the full conversation history is resent to the model each time.
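To make the repeated cost concrete, here is a rough back-of-the-envelope sketch. It assumes the simplest possible billing model: every turn resends the whole history, the history stays a fixed size apart from the tool output, and nothing is cached. The function name and the numbers are illustrative, not measurements.

```python
def cumulative_input_tokens(base_tokens: int, tool_output_tokens: int, turns: int) -> int:
    """Total input tokens sent over `turns` turns when a tool output stays in history.

    Simplifying assumptions: the history is resent in full on every turn,
    it does not grow otherwise, and there is no prompt caching.
    """
    return turns * (base_tokens + tool_output_tokens)


# Illustrative figures only: a 2,000-token conversation over 10 turns.
raw_cost = cumulative_input_tokens(2_000, 50_000, 10)      # raw 50,000-token tool output
summary_cost = cumulative_input_tokens(2_000, 500, 10)     # 500-token summary instead
```

Under these assumptions the raw output costs 520,000 input tokens over ten turns, while the summarized version costs 25,000.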

Two Strategies: Compression and Materialization

1. Context Compression

Instead of dumping raw tool output into context, compress it into a concise summary. The agent gets the signal without the noise.

For example, a database query returning 500,000 rows becomes:

{
  "summary": "Query returned 500,000 rows. Top 10 customers by revenue: ...",
  "schema": ["id", "name", "revenue", "date"],
  "sample": [{"id": 1, "name": "Acme Corp", "revenue": 12000000}]
}

The agent has enough to answer most questions about the shape and highlights of the data. If it needs to drill into specific rows, it can request them explicitly.
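A minimal sketch of how such a summary could be built from a list of row dictionaries. The function name `compress_rows` and the `sample_size` parameter are hypothetical, and a real summary would likely include aggregates (such as the top customers above) that this sketch leaves out.

```python
from typing import Any


def compress_rows(rows: list[dict[str, Any]], sample_size: int = 3) -> dict[str, Any]:
    """Reduce a result set to a short summary, its column names, and a few sample rows."""
    # Take the schema from the first row; assumes all rows share the same keys.
    schema = list(rows[0].keys()) if rows else []
    return {
        "summary": f"Query returned {len(rows):,} rows.",
        "schema": schema,
        "sample": rows[:sample_size],
    }
```

The payload size now depends on `sample_size`, not on how many rows the query returned.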

2. Automatic Materialization

For truly large results, compression alone is not enough. The raw data still needs to be accessible. Automatic materialization writes the full result to a file — Parquet, CSV, or JSONL — and gives the agent only the metadata.

The agent sees:

{
  "file": "query_20260621.parquet",
  "rows": 500000,
  "size_mb": 12.4
}

Context stays clean. The data is available on disk for any downstream processing. The agent pays pennies for metadata instead of dollars for raw context.
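As a rough illustration, materialization can be as simple as writing the rows to disk and returning a small metadata record. This sketch uses JSONL because it needs only the standard library; writing Parquet would require a third-party library. The function name and arguments are assumptions for the example.

```python
import json
import os
from typing import Any


def materialize_jsonl(rows: list[dict[str, Any]], directory: str, name: str) -> dict[str, Any]:
    """Write rows to a JSONL file and return only the metadata the agent should see."""
    path = os.path.join(directory, name)
    with open(path, "w", encoding="utf-8") as f:
        for row in rows:
            # One JSON object per line.
            f.write(json.dumps(row) + "\n")
    size_mb = round(os.path.getsize(path) / (1024 * 1024), 3)
    return {"file": path, "rows": len(rows), "size_mb": size_mb}
```

Only the returned dictionary would go into the conversation; the file itself stays on disk for downstream tools.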

The Context Firewall

AnyMCP implements both strategies through what we call the Context Firewall. Every tool response passes through it:

Is this response small? Pass it through as-is.
Is this response large but compressible? Generate a summary.
Is this response massive? Materialize to file, return metadata.

The goal is simple: never let a tool output pollute an agent's context window. The agent stays focused, reasoning stays sharp, and token costs stay under control.
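The three-way decision above can be sketched as a size check. This is not AnyMCP's actual implementation: the thresholds and the function name are hypothetical, and "large but compressible" is approximated here purely by serialized size.

```python
import json
from typing import Any

# Hypothetical thresholds, chosen only for illustration.
PASS_THROUGH_MAX_BYTES = 4_000
SUMMARIZE_MAX_BYTES = 1_000_000


def route_response(payload: Any) -> str:
    """Pick a handling strategy for a tool response based on its serialized size."""
    size = len(json.dumps(payload).encode("utf-8"))
    if size <= PASS_THROUGH_MAX_BYTES:
        return "pass_through"   # small: hand to the agent as-is
    if size <= SUMMARIZE_MAX_BYTES:
        return "summarize"      # large: generate a summary
    return "materialize"        # massive: write to file, return metadata
```

A production version would probably count tokens instead of bytes and consider the response's structure, but the routing shape stays the same.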

The Bottom Line