Project Memory for Claude Code

Claude Code forgets.
pmem remembers.

Local semantic search over your project's docs, decisions, and history. Ollama embeddings, ChromaDB storage, MCP integration. No external APIs. No data leaves your machine.

The Problem

Every session starts from scratch.

You spent three sessions building a feature. Claude Code made decisions, documented trade-offs, recorded lessons learned across a dozen files. Next session? Gone. Claude reads your CLAUDE.md, maybe a few files you point it to, and guesses at the rest.

The workaround is grep. But grep matches text, not meaning. Search for "why did we pick ChromaDB" and grep returns nothing — because the answer lives in a paragraph that says "file-based persistence was simpler for our use case." The words don't overlap. The meaning does.

So you end up re-explaining context, re-discovering decisions, and watching Claude burn tokens reading files that don't have what it needs while missing the ones that do.

The Numbers

Grep vs. semantic search on a real project.

Same query, same codebase. One approach reads everything and hopes for keyword overlap. The other understands what you're asking.

                        Grep / File Search    pmem Semantic Search
Tokens consumed         ~24,000               ~5,500
Time                    ~90s                  ~20s
Results returned        11                    18
Relevant files missed   7                     0

Tested on a 40-file project with governance docs, task logs, and architectural decision records.

How It Works

Local embeddings. Vector search. MCP tools.

Header-aware chunking

Markdown files are split at heading boundaries, preserving the full heading hierarchy as context in each chunk. No arbitrary 500-token windows.
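The idea can be sketched in a few lines. This is an illustrative version, not pmem's actual implementation: it walks the file line by line, tracks the current heading path, and emits one chunk per section with that path attached as context.

```python
import re

def chunk_markdown(text):
    """Split markdown at heading boundaries, attaching the full
    heading path (H1 > H2 > ...) to each chunk.
    Illustrative sketch -- pmem's real chunker may differ."""
    chunks = []
    path = {}   # heading level -> current heading text
    buf = []

    def flush():
        if buf and "".join(buf).strip():
            context = " > ".join(path[k] for k in sorted(path))
            chunks.append({"context": context, "text": "".join(buf).strip()})
        buf.clear()

    for line in text.splitlines(keepends=True):
        m = re.match(r"^(#{1,6})\s+(.*)", line)
        if m:
            flush()
            level = len(m.group(1))
            # a new heading invalidates any deeper headings on the path
            path = {k: v for k, v in path.items() if k < level}
            path[level] = m.group(2).strip()
        else:
            buf.append(line)
    flush()
    return chunks
```

Because the heading path rides along with each chunk, a match on "file-based persistence was simpler" also carries "Storage > Why ChromaDB" into the result.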

Ollama embeddings

Text is embedded locally using nomic-embed-text via Ollama. Nothing leaves your machine. No API keys for indexing or search.
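Embedding a chunk is one HTTP request to the local Ollama server. A minimal sketch using only the standard library — the endpoint and payload shape follow Ollama's documented /api/embeddings API; pmem's internals may differ:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/embeddings"  # Ollama's default local endpoint

def build_request(text, model="nomic-embed-text"):
    """Build the JSON payload Ollama's embeddings endpoint expects."""
    return json.dumps({"model": model, "prompt": text}).encode()

def embed(text):
    """Return the embedding vector for `text` from the local Ollama server.
    Requires `ollama serve` running and the model already pulled."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=build_request(text),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["embedding"]
```

The request never leaves localhost, which is the whole point: indexing a private repo requires no API key and no network egress.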

ChromaDB storage

Vectors stored in a local ChromaDB collection. File-based persistence — no database server to manage. Lives in your project's .memory/ directory.

MCP integration

Exposes memory_query, memory_search, and memory_status as MCP tools. Claude Code calls them like any other tool.
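On the Claude Code side, the server registers like any other MCP entry. A hypothetical sketch of a project-level .mcp.json — the command and arguments here are assumptions, not pmem's documented invocation; see the full documentation for the real configuration:

```json
{
  "mcpServers": {
    "pmem": {
      "command": "pmem",
      "args": ["serve"]
    }
  }
}
```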

Setup

Two minutes. Four commands.

Install the package, pull the embedding model, initialize your project, and index. That's it.

# Install pmem
pip install pmem-tool

# Pull the embedding model (~274 MB, one time)
ollama pull nomic-embed-text

# Initialize and index your project
cd your-project/
pmem init && pmem index

# That's it. Claude Code can now query your project memory.

Requires Python 3.11+ and Ollama running locally. See the full documentation for MCP server configuration.

Stop re-explaining context every session.

Free, open source, and local-first. Give Claude Code the memory it should have had from the start.