cognitive-cache

oss

Algorithmic context-window selection for LLM coding tools. Treats context as a constrained optimization problem, not retrieval.

status: shipped
started: 2024
role: creator
stack: Python, scikit-learn, networkx, Hypothesis

Every LLM tool right now (Cursor, Claude Code, Copilot, all of them) decides what to put in the context window using heuristics: grep for some symbols, embed and cosine-similarity search, or just cram as many files as will fit. Nobody has an actual algorithm for this.

This project is an attempt to build one. It runs entirely locally, with no LLM calls, API keys, or cloud dependencies, and supports Python, JavaScript, TypeScript, Go, Rust, Java, Ruby, C, and C++.

Think of it this way:

Classic OS | LLM Equivalent | Current State
RAM | Context window | Manually managed
Virtual memory / page swaps | Context eviction + retrieval | Crude summarization
Process scheduler | Agent orchestration | Hand-coded loops
File system cache | Knowledge retrieval | Cosine similarity
Memory allocator | Token budget allocation | Nobody does this

Early computers had programmers manually managing memory addresses. Then virtual memory was invented, a single abstraction, and it unlocked much of what we know as modern computing.

The LLM ecosystem is at the “manual memory management” stage right now. Context is the single most important resource, since it is where reasoning happens, yet every tool manages it with heuristics instead of optimization.

Cognitive-cache is building the “virtual memory” for LLM reasoning.
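
To make "optimization, not retrieval" concrete, here is a minimal sketch of context selection framed as a 0/1 knapsack: maximize total relevance subject to a token budget. This is an illustration only, not cognitive-cache's actual algorithm. The `Candidate` type, the `select_context` function, the file paths, and the scores are all hypothetical, and the relevance scores are assumed to come from somewhere else.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    path: str      # file path (hypothetical example data)
    tokens: int    # token cost of including the file
    score: float   # estimated relevance to the task (assumed to be given)


def select_context(candidates: list[Candidate], budget: int) -> list[Candidate]:
    """Pick the subset with maximum total score whose token cost fits the budget.

    Exact 0/1 knapsack via dynamic programming, O(len(candidates) * budget).
    """
    n = len(candidates)
    # best[i][b] = best total score using the first i candidates within b tokens
    best = [[0.0] * (budget + 1) for _ in range(n + 1)]
    for i, c in enumerate(candidates, start=1):
        for b in range(budget + 1):
            best[i][b] = best[i - 1][b]  # option 1: skip candidate i
            if c.tokens <= b:
                take = best[i - 1][b - c.tokens] + c.score
                if take > best[i][b]:
                    best[i][b] = take  # option 2: include candidate i
    # Walk the table backwards to recover which candidates were chosen.
    chosen, b = [], budget
    for i in range(n, 0, -1):
        if best[i][b] != best[i - 1][b]:
            chosen.append(candidates[i - 1])
            b -= candidates[i - 1].tokens
    return list(reversed(chosen))


candidates = [
    Candidate("auth/session.py", tokens=600, score=0.9),
    Candidate("auth/tokens.py", tokens=500, score=0.8),
    Candidate("vendor/big_sdk.py", tokens=1000, score=1.0),
    Candidate("README.md", tokens=300, score=0.1),
]
selected = select_context(candidates, budget=1100)
print([c.path for c in selected])  # ['auth/session.py', 'auth/tokens.py']
```

A "highest score first" heuristic would grab `vendor/big_sdk.py` and then have no room left for anything else, ending with a total score of 1.0. The budget-aware selection takes the two smaller files instead, for a total of about 1.7 within the same 1100 tokens.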

Benchmark — 23 real bug-fix PRs

approach | avg file recall
cognitive-cache | 34.7%
embedding RAG | 25.9%
llm-triage | 25.7%
grep | 22.8%
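
For reference, one plausible way to compute a metric like "avg file recall" is sketched below. The exact definition used in the benchmark is not stated here, so this is an assumption: per PR, recall is the fraction of files touched by the fix that the approach put in context, and the reported number is the mean over PRs. The function names and the example data are hypothetical.

```python
def file_recall(selected: set[str], changed: set[str]) -> float:
    """Fraction of the files changed by the fix that were selected as context."""
    if not changed:
        raise ValueError("a bug-fix PR must change at least one file")
    return len(selected & changed) / len(changed)


def average_file_recall(cases: list[tuple[set[str], set[str]]]) -> float:
    """Mean per-PR recall over (selected, changed) pairs."""
    if not cases:
        raise ValueError("need at least one case")
    return sum(file_recall(sel, chg) for sel, chg in cases) / len(cases)


# Hypothetical example: two PRs, not taken from the real benchmark.
cases = [
    ({"a.py", "b.py", "c.py"}, {"a.py", "d.py"}),  # 1 of 2 changed files found
    ({"x.go"}, {"x.go"}),                          # 1 of 1 changed files found
]
print(average_file_recall(cases))  # 0.75
```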

Available as a Python library, CLI, MCP server (Claude Code / Cursor), and a GitHub Action. License: Apache-2.0.