cognitive-cache

oss

Algorithmic context-window selection for LLM coding tools. Treats context as a constrained optimization problem, not retrieval.

status: shipped
started: 2024
role: creator
stack: Python, scikit-learn, networkx, Hypothesis

github ↗

Every LLM tool right now (Cursor, Claude Code, Copilot, all of them) decides what to put in the context window using heuristics: grep for some symbols, embed and cosine-similarity search, or just cram as many files as will fit. Nobody has an actual algorithm for this.

This project is an attempt to build one. It runs entirely locally, with no LLM calls, API keys, or cloud dependencies, and supports Python, JavaScript, TypeScript, Go, Rust, Java, Ruby, C, and C++.
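To make "constrained optimization" concrete, here is a minimal sketch of the general problem shape: choose the set of files with the highest total relevance whose combined token cost fits a budget. This is a textbook 0/1 knapsack, not cognitive-cache's actual algorithm. The `Candidate` type, the `select_context` function, the file paths, and the relevance scores are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    path: str         # hypothetical file path
    tokens: int       # token cost of including the file
    relevance: float  # hypothetical score; how it is computed is out of scope here


def select_context(candidates, budget):
    """Return (total relevance, chosen paths) for the best subset within the budget.

    Standard 0/1 knapsack solved by dynamic programming over token counts.
    """
    # best[b] = (total relevance, chosen paths) achievable with at most b tokens
    best = [(0.0, ())] * (budget + 1)
    for c in candidates:
        # Walk budgets downward so each candidate is included at most once.
        for b in range(budget, c.tokens - 1, -1):
            score, chosen = best[b - c.tokens]
            if score + c.relevance > best[b][0]:
                best[b] = (score + c.relevance, chosen + (c.path,))
    return best[budget]


# Illustrative inputs only; none of these numbers come from the project.
candidates = [
    Candidate("auth/session.py", 600, 0.9),
    Candidate("auth/tokens.py", 500, 0.8),
    Candidate("utils/big_helpers.py", 1000, 1.0),
    Candidate("README.md", 300, 0.1),
]
score, chosen = select_context(candidates, budget=1100)
print(sorted(chosen))
```

The single highest-scoring file is left out here because two smaller files together are worth more within the same budget. A pure top-k ranking by relevance cannot see that trade-off. The table is proportional to the budget, so a real tool working with large windows would likely need a coarser token unit or an approximation.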

Think of it this way:

Classic OS	LLM Equivalent	Current State
RAM	Context window	Manually managed
Virtual memory / page swaps	Context eviction + retrieval	Crude summarization
Process scheduler	Agent orchestration	Hand-coded loops
File system cache	Knowledge retrieval	Cosine similarity
Memory allocator	Token budget allocation	Nobody does this

Early programmers managed memory addresses by hand. Then virtual memory was invented, a single idea, and it unlocked much of what we know as modern computing.

The LLM ecosystem is at the “manual memory management” stage right now. Context is the single most important resource, the place where reasoning happens, yet every tool manages it with heuristics instead of optimization.

Cognitive-cache is building the “virtual memory” for LLM reasoning.

Benchmark — 23 real bug-fix PRs

approach	avg file recall
cognitive-cache	34.7%
embedding RAG	25.9%
llm-triage	25.7%
grep	22.8%
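The table does not define "file recall", so the sketch below assumes the usual reading: for each PR, the fraction of files changed by the fix that also appear in the selected context, averaged over all PRs. The function names and example file lists are hypothetical and are not taken from the benchmark.

```python
def file_recall(selected, changed):
    """Fraction of the files changed by a fix that appear in the selected context."""
    changed = set(changed)
    if not changed:
        raise ValueError("a PR must change at least one file")
    return len(changed & set(selected)) / len(changed)


def average_file_recall(cases):
    """Mean per-PR recall over (selected, changed) pairs."""
    scores = [file_recall(selected, changed) for selected, changed in cases]
    return sum(scores) / len(scores)


# Made-up cases for illustration; these are not benchmark data.
cases = [
    (["a.py", "b.py"], ["a.py", "c.py"]),  # one of two changed files selected
    (["x.py"], ["x.py"]),                  # the only changed file selected
]
avg = average_file_recall(cases)
print(f"{avg:.1%}")
```

Under this definition a PR that touches many files counts the same as a single-file PR, because the average is taken per PR and not per file.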

Available as a Python library, CLI, MCP server (Claude Code / Cursor), and a GitHub Action. License: Apache-2.0.