The PRD and technical plan are located in `memory_mcp_server_prd.md`.
This is an MCP (Model Context Protocol) server that provides long-term memory capabilities for AI agents, enabling persistent storage, retrieval, and semantic search of contextual memories across sessions.
All project tasks, milestones, and epics are maintained via GitHub issues in the repository, using the GitHub MCP server for automation and tracking. The local TODO.md is used as a reference and synchronized with GitHub issues as needed.
Each issue must include a hierarchy block at the top of its description, formatted as follows:
For Milestone issues (top-level):

```
Parent:
Project: <project_link>
Children:
- <child_issue_link>
Related to:
- <related_issue_link>
```

For Epic issues (phase-level):

```
Parent: <parent_issue_link>
Project: <project_link>
Children:
- <child_issue_link>
Depends On:
- <prerequisite_issue_link>
```

Checklist tasks for the epic should be listed below this block.
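For example, an Epic issue for a second phase might begin with a block like this (all issue numbers and links are illustrative):

```
Parent: #12 (Milestone: MVP)
Project: https://github.com/orgs/<org>/projects/1
Children:
- #15
- #16
Depends On:
- #14 (Phase 1 Epic)
```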
- Create all new tasks, milestones, and epics as GitHub issues.
- Add the hierarchy block to each issue as described above.
- Link parent/child issues and project as appropriate.
- Use the GitHub MCP server to automate issue creation, updates, and tracking.
- Do not use TODO.md for new tasks; migrate any remaining items to issues.
- Keep checklists and progress in the issue body for visibility.
- For Epic issues, always include a link to the previous phase's issue in the 'Depends On' field to indicate dependency order.
- At the start of each milestone, update the top section to identify the milestone, expected outcome, and link to the milestone issue.
- Group all tasks by their Epic issue, including a link to each Epic issue.
- Keep task status (checked/unchecked) in sync with GitHub issues.
- Updates should flow both ways: from GitHub issues into TODO.md, and from TODO.md into GitHub issues.
- When a new milestone begins, archive or move the previous milestone's tasks as needed.
- Use this file as a local reference and synchronization point for milestone progress.
This process ensures all work is visible, organized, and tracked in the GitHub project and issues.
To use the MCP Memory Server as a memory backend for your agent, configure your agent or development environment to connect to the server's MCP endpoint.
Create or update your `mcp.json` file (location: VS Code user settings or agent config directory):
```json
{
  "servers": {
    "MCP Memory Server": {
      "url": "http://localhost:8139/mcp/",
      "type": "http"
    }
  },
  "inputs": []
}
```
For agent configs that expect only the server entry, the minimal equivalent is:

```json
{
  "MCP Memory Server": {
    "url": "http://localhost:8139/mcp/"
  }
}
```
- Ensure the MCP server is running locally (`uv run mcp-memory-server`).
- Update the endpoint if running on a different host or port.
Notes:
- The MCP server must be running and accessible from your agent's environment.
- Agents may require a restart after changing environment variables or config files.
- If running in Docker, a container, or a remote environment, update the endpoint to match your network setup.
- For more advanced agent integrations, refer to your agent's documentation for MCP memory configuration options.
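A quick way to confirm the endpoint is reachable (a sketch using `httpx`, which is already a project dependency; any HTTP response, even an error status, means the server is listening):

```python
import httpx

# Probe the MCP endpoint from the config above. Any status code
# (including 405/406 for a bare GET) proves the server is up;
# a ConnectError means it is not running or the endpoint is wrong.
try:
    response = httpx.get("http://localhost:8139/mcp/", timeout=5.0)
    print(f"MCP endpoint reachable (HTTP {response.status_code})")
except httpx.ConnectError:
    print("MCP endpoint unreachable -- is the server running?")
```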
- Single Storage Backend: Focus only on ChromaDB implementation to reduce complexity
- Single Embedding Provider: Focus only on Ollama integration for local deployment
- Core Tools Only: Implement store_memories, retrieve_memories, search_memories
- Minimal Configuration: Use sensible defaults with basic .env configuration
- No Authentication: Skip auth complexity for MVP
- Basic Error Handling: Essential error handling without sophisticated retry logic
- Python 3.11+ required
- UV package manager for dependency management
- Ollama must be running locally on port 11434
- ChromaDB will store data in ./data/chroma_db directory
- FastMCP framework for MCP protocol implementation
- CLI can start and connect to Ollama
- Can store memories with context and metadata
- Can retrieve all memories for a given context
- Can perform semantic search within a context
- Memories persist between server restarts
- Basic error messages for common issues (Ollama down, etc.)
- fastmcp>=0.2.0
- pydantic>=2.5.0
- pydantic-settings>=2.1.0
- httpx>=0.25.0 (for Ollama API calls)
- numpy>=1.24.0 (for similarity calculations; see the sketch after this list)
- chromadb>=0.4.0
- python-dotenv>=1.0.0
- uuid7>=0.1.0 (for unique memory IDs)
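Since numpy is pulled in for similarity calculations, the core scoring step amounts to cosine similarity between embedding vectors; a minimal sketch (illustrative, not necessarily the project's exact implementation):

```python
import numpy as np

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors, in [-1, 1]."""
    a_arr, b_arr = np.asarray(a), np.asarray(b)
    denom = float(np.linalg.norm(a_arr) * np.linalg.norm(b_arr))
    return float(np.dot(a_arr, b_arr)) / denom if denom else 0.0
```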
- Clone this repository
- Install UV package manager (installation guide)
- Copy `.env.example` to `.env` and configure environment variables
- Install dependencies: `uv sync --dev`
- Activate the virtual environment: `source .venv/bin/activate` (Windows: `.venv\Scripts\activate`)
- Install Ollama and start the service: `ollama serve`
- Pull the embedding model: `ollama pull mxbai-embed-large`
- Ensure Python 3.11+ is available
- Run the server: `uv run mcp-memory-server`
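Before starting the server, it can help to sanity-check that Ollama is serving the embedding model; a sketch using Ollama's `/api/embeddings` endpoint with the defaults from the configuration below:

```python
import httpx

# Ask Ollama for one embedding; a successful response confirms both that
# the service is up and that the model has been pulled.
resp = httpx.post(
    "http://localhost:11434/api/embeddings",
    json={"model": "mxbai-embed-large", "prompt": "hello world"},
    timeout=30.0,
)
resp.raise_for_status()
print(f"Ollama OK, embedding dimension: {len(resp.json()['embedding'])}")
```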
```
src/
└── mcp_memory_server/
    ├── __init__.py
    ├── main.py                            # FastMCP server entry point
    ├── config/
    │   ├── __init__.py
    │   └── settings.py                    # Pydantic settings with .env support
    ├── models/
    │   ├── __init__.py
    │   └── memory.py                      # Memory data model with uuid7 IDs
    ├── embeddings/
    │   ├── __init__.py
    │   ├── embedding_provider_interface.py
    │   └── ollama.py                      # Ollama API client implementation
    ├── storage/
    │   ├── __init__.py
    │   ├── storage_interface.py
    │   └── chroma.py                      # ChromaDB implementation
    ├── services/
    │   ├── __init__.py
    │   └── memory_service.py              # Business logic layer
    └── tools/
        ├── __init__.py
        └── memory_tools.py                # MCP tool implementations
```
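As a sketch of the data model in `models/memory.py` (field names are assumptions inferred from the tool signatures below, not the project's definitive schema):

```python
from datetime import datetime, timezone

from pydantic import BaseModel, Field
from uuid_extensions import uuid7  # import name used by the uuid7 package

class Memory(BaseModel):
    """A single memory record; fields are illustrative."""
    id: str = Field(default_factory=lambda: str(uuid7()))
    content: str
    context: str
    metadata: dict = Field(default_factory=dict)
    created_at: datetime = Field(
        default_factory=lambda: datetime.now(timezone.utc)
    )
```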
- `store_memories(memories: List[Dict], context: str)` - Store a batch of memories
- `retrieve_memories(context: str)` - Get all memories for a context
- `search_memories(query: str, context: str, limit: int, threshold: float)` - Semantic search within a context
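A minimal sketch of how two of these tools might be registered with FastMCP (the in-memory dict is a stand-in for the real service layer and ChromaDB):

```python
from fastmcp import FastMCP

mcp = FastMCP("MCP Memory Server")
_store: dict[str, list[dict]] = {}  # stand-in for the storage backend

@mcp.tool()
def store_memories(memories: list[dict], context: str) -> str:
    """Store a batch of memories under a context."""
    _store.setdefault(context, []).extend(memories)
    return f"Stored {len(memories)} memories in context '{context}'"

@mcp.tool()
def retrieve_memories(context: str) -> list[dict]:
    """Return all memories for a context."""
    return _store.get(context, [])

if __name__ == "__main__":
    mcp.run()  # stdio by default; the project's entry point serves HTTP
```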
```env
STORAGE_BACKEND=chroma
EMBEDDING_PROVIDER=ollama
CHROMA_PATH=./data/chroma_db
CHROMA_COLLECTION_NAME=memories
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_MODEL=mxbai-embed-large
MAX_MEMORIES_PER_REQUEST=100
DEFAULT_SEARCH_LIMIT=10
SIMILARITY_THRESHOLD=0.7
```
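These variables map onto the pydantic-settings class in `config/settings.py`; a sketch, assuming field names simply mirror the variables (pydantic-settings matches environment variables to fields case-insensitively):

```python
from pydantic_settings import BaseSettings, SettingsConfigDict

class Settings(BaseSettings):
    """Loaded from the environment and .env; defaults match the values above."""
    model_config = SettingsConfigDict(env_file=".env")

    storage_backend: str = "chroma"
    embedding_provider: str = "ollama"
    chroma_path: str = "./data/chroma_db"
    chroma_collection_name: str = "memories"
    ollama_base_url: str = "http://localhost:11434"
    ollama_model: str = "mxbai-embed-large"
    max_memories_per_request: int = 100
    default_search_limit: int = 10
    similarity_threshold: float = 0.7
```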
- Unit tests for each component in isolation
- Integration tests for end-to-end workflows
- Manual testing with sample data and edge cases
- Performance testing with 100+ memories
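For example, a unit test for the Memory model might look like this (written against the illustrative fields sketched earlier):

```python
from mcp_memory_server.models.memory import Memory

def test_memories_get_unique_ids():
    m1 = Memory(content="note A", context="project-x")
    m2 = Memory(content="note B", context="project-x")
    assert m1.id != m2.id

def test_metadata_defaults_to_empty_dict():
    m = Memory(content="note", context="project-x")
    assert m.metadata == {}
```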
```bash
# Note: If uv is not in PATH, use the full path:
# Windows: $env:USERPROFILE\.local\bin\uv.exe

# Install all dependencies including dev dependencies
uv sync --dev

# Run all tests
uv run pytest

# Run specific test file
uv run pytest tests/unit/test_models/test_memory.py

# Run tests with coverage
uv run pytest --cov=src/mcp_memory_server

# Run tests with verbose output
uv run pytest -v

# Add development dependencies
uv add --dev package_name

# Add production dependencies
uv add package_name
```
- Foundation (Steps 1-6): Core architecture and data flow
- MCP Integration (Steps 7-10): FastMCP tools and server setup
- Testing & Validation (Steps 11-14): Comprehensive testing
- Documentation (Steps 15-16): User docs and final packaging