The Context Window Problem

The Problem

Enterprise repositories contain millions of tokens across thousands of files. Frontier models offer context windows of 1-2 million tokens at best. Larger windows alone won't save you: attention over very long inputs is uneven, so quality degrades, each request costs more, and the relevant code gets buried in irrelevant information.
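To make the gap concrete, here is a back-of-the-envelope estimate. It is only a sketch: the 4-characters-per-token ratio is a rough heuristic (real tokenizers vary by language and content), and the repository size, window size, and reserved-token figure are hypothetical numbers chosen for illustration.

```python
import math

CHARS_PER_TOKEN = 4  # assumption: rough heuristic, not a real tokenizer


def estimate_tokens(num_chars: int, chars_per_token: int = CHARS_PER_TOKEN) -> int:
    """Approximate the token count for a given number of characters."""
    return math.ceil(num_chars / chars_per_token)


def fraction_that_fits(repo_tokens: int, window_tokens: int, reserved_tokens: int = 0) -> float:
    """Share of the repository that fits once instructions and output are reserved."""
    usable = max(window_tokens - reserved_tokens, 0)
    if repo_tokens <= 0:
        return 1.0
    return min(usable / repo_tokens, 1.0)


# Hypothetical repository: 5,000 files averaging 8,000 characters each.
repo_tokens = estimate_tokens(5_000 * 8_000)
share = fraction_that_fits(repo_tokens, window_tokens=1_000_000, reserved_tokens=100_000)
print(f"{repo_tokens:,} tokens in repo; {share:.0%} fits in the window")
# prints: 10,000,000 tokens in repo; 9% fits in the window
```

Even under these generous assumptions, over 90% of the repository has to be left out of any single request, so the question becomes which part to include.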

Seven Context Types Agents Need

Effective coding agents require more than just code files:

Task descriptions - concrete objectives
Available tools - system resources the agent can use
Developer persona - environment and preferences
Code files - the actual files being modified
Semantic structure - architectural patterns and business rules
Historical context - commits and documentation
Collaborative context - team standards and style guides
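One way to keep all seven types in view is to model them as a single record and check which ones are still empty before the agent starts. The sketch below is illustrative only: the class name `AgentContext`, its field names, and the `missing` helper are hypothetical and simply mirror the list above.

```python
from dataclasses import dataclass, field, fields
from typing import Dict, List


@dataclass
class AgentContext:
    """Hypothetical container with one field per context type listed above."""
    task_description: str = ""
    available_tools: List[str] = field(default_factory=list)
    developer_persona: str = ""
    code_files: Dict[str, str] = field(default_factory=dict)  # path -> contents
    semantic_structure: str = ""
    historical_context: List[str] = field(default_factory=list)
    collaborative_context: str = ""

    def missing(self) -> List[str]:
        """Names of the context types that have not been filled in yet."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


ctx = AgentContext(
    task_description="Fix the flaky retry test",
    code_files={"retry.py": "def retry(): ..."},
)
print(ctx.missing())
```

In this example only the task and the code files are present, so the remaining five types are reported as missing, which is the situation an agent is in when it is handed nothing but source files.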

Why Vector Search Falls Short

Naive vector-search retrieval (basic RAG) fails for code because it:

Flattens hierarchical structure into undifferentiated chunks
Struggles with multi-hop reasoning across interconnected systems
Floods models with irrelevant files, degrading reasoning
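The first point can be illustrated with a small comparison. The sketch below contrasts fixed-size chunking, which cuts text at arbitrary character offsets, with chunking along syntax boundaries using Python's standard `ast` module. The sample source, the function names, and the 60-character chunk size are all made up for the example; this is not a description of any particular retrieval system.

```python
import ast
from typing import List, Optional, Tuple

SOURCE = '''\
class Cart:
    def add(self, item):
        self.items.append(item)

    def total(self):
        return sum(i.price for i in self.items)


def checkout(cart):
    return cart.total()
'''


def fixed_size_chunks(text: str, size: int) -> List[str]:
    """Naive chunking: cut every `size` characters, ignoring syntax."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def structural_chunks(text: str) -> List[Tuple[str, Optional[str]]]:
    """One chunk per function, labeled with its enclosing class (if any)."""
    chunks: List[Tuple[str, Optional[str]]] = []

    def visit(node: ast.AST, scope: List[str]) -> None:
        for child in ast.iter_child_nodes(node):
            if isinstance(child, ast.ClassDef):
                visit(child, scope + [child.name])
            elif isinstance(child, (ast.FunctionDef, ast.AsyncFunctionDef)):
                name = ".".join(scope + [child.name])
                chunks.append((name, ast.get_source_segment(text, child)))

    visit(ast.parse(text), [])
    return chunks


naive = fixed_size_chunks(SOURCE, 60)
structured = structural_chunks(SOURCE)
print(len(naive), "fixed-size chunks with no labels")
print([name for name, _ in structured])
```

The fixed-size chunks carry no record of which class or function they came from and can split a definition mid-body, while the structural chunks keep each function whole and retain its position in the hierarchy (`Cart.add`, `Cart.total`, `checkout`).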

Factory's Context Stack

Five layers working together:

Repository Overviews - auto-generated architectural summaries
Semantic Search - code-tuned embeddings returning ranked candidates
File System Commands - targeted access with line-number specs
Enterprise Integrations - Sentry, Notion, and similar platforms
Hierarchical Memory - persistent user and org preferences
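The source does not say how the layers are combined, so the following is only a sketch of one plausible approach: treat each layer's output as a candidate item with a relevance score and pack the highest-scoring items into a fixed token budget. The `ContextItem` type, the scores, the greedy strategy, and the 4-characters-per-token estimate are all assumptions made for illustration, not Factory's implementation.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ContextItem:
    layer: str    # e.g. "overview", "search", "file", "integration", "memory"
    text: str
    score: float  # hypothetical relevance score, higher is better


def estimate_tokens(text: str) -> int:
    # Assumption: roughly 4 characters per token.
    return max(1, len(text) // 4)


def assemble(items: List[ContextItem], budget: int) -> List[ContextItem]:
    """Greedy packing: take items by descending score, skipping any that do not fit."""
    chosen: List[ContextItem] = []
    used = 0
    for item in sorted(items, key=lambda i: i.score, reverse=True):
        cost = estimate_tokens(item.text)
        if used + cost <= budget:
            chosen.append(item)
            used += cost
    return chosen


items = [
    ContextItem("overview", "x" * 400, 0.9),   # ~100 tokens
    ContextItem("search", "y" * 2000, 0.8),    # ~500 tokens
    ContextItem("file", "z" * 800, 0.7),       # ~200 tokens
    ContextItem("memory", "w" * 200, 0.6),     # ~50 tokens
]
print([item.layer for item in assemble(items, budget=400)])
# prints: ['overview', 'file', 'memory']
```

With a 400-token budget the large search result is skipped in favor of three smaller items, which shows why targeted reads (such as line ranges instead of whole files) leave room for more kinds of context.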

This layered approach connects to Factor 3 of 12-factor-agents: "Own Your Context Window." Small, focused agents outperform monolithic ones because they stay within context limits. The same point appears in building-effective-agents: composable patterns like prompt chaining exist specifically to work within these constraints.

The real insight echoes how-i-use-llms: context windows are working memory, and keeping them lean beats stuffing them full. Quality over quantity.
