Project Memory for Claude Code
Claude Code forgets.
pmem remembers.
Local semantic search over your project's docs, decisions, and history. Ollama embeddings, ChromaDB storage, MCP integration. No external APIs. No data leaves your machine.
The Problem
Every session starts from scratch.
You spent three sessions building a feature. Claude Code made decisions, documented trade-offs, recorded lessons learned across a dozen files. Next session? Gone. Claude reads your CLAUDE.md, maybe a few files you point it to, and guesses at the rest.
The workaround is grep. But grep matches text, not meaning. Search for "why did we pick ChromaDB" and grep returns nothing — because the answer lives in a paragraph that says "file-based persistence was simpler for our use case." The words don't overlap. The meaning does.
So you end up re-explaining context, re-discovering decisions, and watching Claude burn tokens reading files that don't have what it needs while missing the ones that do.
The Numbers
Grep vs. semantic search on a real project.
Same query, same codebase. One approach reads everything and hopes for keyword overlap. The other understands what you're asking.
Grep / File Search
- Tokens consumed: ~24,000
- Time: ~90s
- Results returned: 11
- Relevant files missed: 7

pmem Semantic Search
- Tokens consumed: ~5,500
- Time: ~20s
- Results returned: 18
- Relevant files missed: 0
Tested on a 40-file project with governance docs, task logs, and architectural decision records.
How It Works
Local embeddings. Vector search. MCP tools.
Header-aware chunking
Markdown files are split at heading boundaries, preserving the full heading hierarchy as context in each chunk. No arbitrary 500-token windows.
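A minimal sketch of what header-aware chunking looks like. This is illustrative, not pmem's actual implementation; the function name and chunk format are assumptions. The key idea: each chunk carries its full heading path, so "Install" under "Setup" embeds with that context attached.

```python
import re

def chunk_markdown(text: str) -> list[dict]:
    """Split markdown at heading boundaries, carrying the heading hierarchy."""
    path: list[str] = []   # current heading path, e.g. ["Setup", "Install"]
    chunks: list[dict] = []
    body: list[str] = []

    def flush():
        if body:
            chunks.append({"headings": list(path), "text": "\n".join(body).strip()})
            body.clear()

    for line in text.splitlines():
        m = re.match(r"^(#{1,6})\s+(.*)", line)
        if m:
            flush()
            level = len(m.group(1))
            del path[level - 1:]          # pop headings at this level or deeper
            path.append(m.group(2).strip())
        else:
            body.append(line)
    flush()
    return chunks
```

Because chunks follow the document's own structure, a section is never split mid-thought the way a fixed 500-token window would split it.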
Ollama embeddings
Text is embedded locally using nomic-embed-text via Ollama. Nothing leaves your machine. No API keys for indexing or search.
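Calling Ollama for an embedding is a single HTTP POST to localhost. A sketch using only the standard library (the helper names are illustrative; the endpoint and request shape are Ollama's `/api/embeddings` API):

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/embeddings"  # Ollama's default local endpoint

def build_request(text: str, model: str = "nomic-embed-text") -> dict:
    # Request body for Ollama's embeddings endpoint
    return {"model": model, "prompt": text}

def embed(text: str) -> list[float]:
    """Embed text via the local Ollama server; nothing leaves the machine."""
    data = json.dumps(build_request(text)).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["embedding"]
```

Everything resolves to `localhost:11434`, which is why no API key is involved: the "service" is a process on your own machine.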
ChromaDB storage
Vectors stored in a local ChromaDB collection. File-based persistence — no database server to manage. Lives in your project's .memory/ directory.
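Under the hood, a semantic query is nearest-neighbor search over those stored vectors. A toy illustration in pure Python of the ranking step — ChromaDB does this for real, plus persistence and indexing:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    # Cosine similarity: dot product over the product of vector norms
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec: list[float], docs: list[tuple[str, list[float]]], k: int = 2) -> list[str]:
    """docs is a list of (doc_id, vector); return the k most similar ids."""
    ranked = sorted(docs, key=lambda d: cosine(query_vec, d[1]), reverse=True)
    return [doc_id for doc_id, _ in ranked[:k]]
```

This is why "why did we pick ChromaDB" can match "file-based persistence was simpler": the comparison happens between embedding vectors, not between keywords.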
MCP integration
Exposes memory_query, memory_search, and memory_status as MCP tools. Claude Code calls them like any other tool.
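Registering the server follows the standard MCP configuration shape. The snippet below is a hypothetical example of that shape — the actual command and arguments for pmem are an assumption here; see the full documentation for the real values:

```json
{
  "mcpServers": {
    "pmem": {
      "command": "pmem",
      "args": ["serve"]
    }
  }
}
```

Once registered, the three tools appear in Claude Code's tool list alongside its built-ins, so queries against project memory happen without you pasting context by hand.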
Setup
Two minutes. Four commands.
Install the package, pull the embedding model, initialize your project, and index. That's it.
# Install pmem
pip install pmem-tool
# Pull the embedding model (~274 MB, one time)
ollama pull nomic-embed-text
# Initialize and index your project
cd your-project/
pmem init && pmem index
# That's it. Claude Code can now query your project memory.
Requires Python 3.11+ and Ollama running locally. See the full documentation for MCP server configuration.
Stop re-explaining context every session.
Free, open source, and local-first. Give Claude Code the memory it should have had from the start.