Markdown Converter
Agent skill for markdown-converter
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
NEVER use emojis in ANY documentation, plans, guides, or written output for this project UNLESS explicitly given permission.
Focus on clear, professional documentation without decorative elements.
RAG-CLI v2.0 is a local Retrieval-Augmented Generation system designed as a Claude Code plugin. It processes documents locally, generates embeddings, stores vectors in ChromaDB, and uses claude-haiku-4-5-20251001 for response generation.
This version features a complete restructure with clean separation between core library and plugin code, marketplace-ready lifecycle management, and improved maintainability.
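For orientation, the sketch below walks the same pipeline end to end using only the underlying third-party libraries (sentence-transformers, ChromaDB, and the Anthropic SDK) rather than RAG-CLI's own APIs; the collection name, chunk text, and prompt format are illustrative assumptions.

```python
# Minimal, illustrative RAG loop using third-party libraries only (not the
# RAG-CLI APIs). Requires ANTHROPIC_API_KEY in the environment.
import anthropic
import chromadb
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")            # local embeddings
client = chromadb.PersistentClient(path="./data/vectors")  # local vector store
collection = client.get_or_create_collection("docs")       # collection name assumed

# Index a document chunk
chunk = "RAG-CLI stores vectors in ChromaDB and answers with Claude."
collection.upsert(ids=["doc1-0"], embeddings=[model.encode(chunk).tolist()], documents=[chunk])

# Retrieve context for a query and generate a response
query = "Where are vectors stored?"
hits = collection.query(query_embeddings=[model.encode(query).tolist()], n_results=3)
context = "\n".join(hits["documents"][0])

response = anthropic.Anthropic().messages.create(
    model="claude-haiku-4-5-20251001",
    max_tokens=512,
    messages=[{"role": "user", "content": f"Context:\n{context}\n\nQuestion: {query}"}],
)
print(response.content[0].text)
```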
```
RAG-CLI/
  src/
    rag_cli/                      # CORE LIBRARY (plugin-agnostic)
      __init__.py                 # Version: 2.0.0
      core/                       # Core RAG functionality
        constants.py              # Centralized configuration
        document_processor.py     # Document chunking
        embeddings.py             # Embedding generation
        vector_store.py           # ChromaDB operations
        retrieval_pipeline.py     # Hybrid search + reranking
        claude_integration.py     # Claude API integration
        [30+ other core modules]
      agents/                     # Multi-agent framework
        base_agent.py             # Agent base class
        query_decomposer.py       # Query decomposition
        result_synthesizer.py     # Result synthesis
        maf/                      # Multi-Agent Framework
      integrations/               # External integrations
        arxiv_connector.py        # ArXiv integration
        tavily_connector.py       # Tavily search
        maf_connector.py          # MAF integration
      cli/                        # Command-line tools
        index.py                  # rag-index command
        retrieve.py               # rag-retrieve command
      utils/                      # Shared utilities
    rag_cli_plugin/               # PLUGIN CODE (Claude Code specific)
      __init__.py                 # Version: 2.0.0
      lifecycle/                  # Lifecycle management
        installer.py              # Marketplace installation
        updater.py                # Update handling
      commands/                   # Slash commands
        update_rag.py             # /update-rag command
        rag_project_indexer.py    # /rag-project command
        [other commands]
      hooks/                      # Event hooks
        user-prompt-submit.py     # Main RAG orchestration
        document-indexing.py      # Auto-indexing
        session-start.py          # Session initialization
        [other hooks]
      mcp/                        # MCP server
        unified_server.py         # Single unified MCP server
      services/                   # Plugin services
        service_manager.py        # Service registry
        dashboard.py              # Web dashboard
        tcp_server.py             # Monitoring server
        [monitoring modules]
      skills/                     # Agent skills
  config/                         # Configuration
    defaults/                     # Default configurations
      mcp.json                    # MCP server config
      rag_settings.json           # RAG settings
      services.json               # Service settings
      [other defaults]
    templates/                    # User-editable templates
      .env.template               # Environment template
      citation_config.json.template
    schemas/                      # JSON schemas
      settings.schema.json        # Settings validation
  scripts/                        # Scripts
    install/                      # Installation scripts
    update/                       # Update scripts
    utils/                        # Utility scripts
      update_imports_v2.py        # Import updater
      update_plugin_imports.py    # Plugin import updater
  .claude-plugin/                 # Plugin metadata
    plugin.json                   # Plugin configuration (v2.0.0)
    hooks.json                    # Hook configurations
    lifecycle.json                # Lifecycle hooks (NEW)
    commands/                     # Command documentation
  data/                           # Runtime data
    vectors/                      # ChromaDB indexes
    cache/                        # Query cache
    documents/                    # Source documents
    logs/                         # Application logs
  tests/                          # Test suite
  docs/                           # Documentation
  pyproject.toml                  # Package configuration (v2.0.0)
  requirements.txt                # Python dependencies
  README.md                       # Project README
  LICENSE                         # MIT License
  CHANGELOG.md                    # Version history
```
RAG-CLI v2.0 uses a dual-package src-layout structure:
Core Library (rag_cli): Platform-agnostic RAG engine
```python
from rag_cli.core.X import Y
from rag_cli.agents.X import Y
from rag_cli.integrations.X import Y
```

Plugin Code (rag_cli_plugin): Claude Code integration

```python
from rag_cli_plugin.services.X import Y
from rag_cli_plugin.lifecycle.X import Y
```

This separation allows the core RAG engine to be used independently while keeping Claude Code-specific code isolated.
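As a minimal illustration of that isolation, a plain script can use the core library with no plugin code imported. The import paths below are the documented ones; calling get_config() and EmbeddingGenerator() with no arguments is an assumption about their signatures.

```python
# Core library used standalone (no Claude Code runtime, no rag_cli_plugin import).
# Import paths are the documented ones; the zero-argument calls are assumptions.
from rag_cli.core.config import get_config
from rag_cli.core.constants import DEFAULT_TOP_K
from rag_cli.core.embeddings import EmbeddingGenerator

config = get_config()              # assumed zero-argument factory
embedder = EmbeddingGenerator()    # assumed default constructor
print("default top-k:", DEFAULT_TOP_K)
```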
```bash
# Create virtual environment
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Install package in editable mode (development)
pip install -e .
```
```bash
# Install as Claude Code plugin
python install_plugin.py

# This will:
# 1. Install RAG-CLI as Python package (pip install -e .)
# 2. Create plugin directory in ~/.claude/plugins/rag-cli/
# 3. Copy configuration files and commands
# 4. Set up data directory symlinks
# 5. Configure MCP server
```
```bash
# Index documents (after installation)
rag-index ./data/documents --recursive --pattern "*.md"

# Retrieve and generate responses
rag-retrieve --query "How to configure API?" --top-k 5

# Interactive retrieval mode
rag-retrieve --interactive

# Run monitoring server
rag-monitor  # Or: python -m monitoring

# Test installation
python scripts/verify_installation.py
```
```bash
# Run all tests
pytest

# Run specific test module
pytest tests/test_vector_store.py

# Run with coverage
pytest --cov=src --cov-report=html

# Run integration tests only
pytest tests/test_integration.py -v
```
Centralized configuration values for easier maintenance and tuning:
- TCP_CHECK_CACHE_SECONDS, RESPONSE_CACHE_MAX_SIZE, EMBEDDING_CACHE_SIZE
- CHARS_PER_TOKEN, TOKEN_ESTIMATION_RATIO
- DEFAULT_TOP_K, MAX_TOP_K, MAX_QUERY_LENGTH
- DEFAULT_VECTOR_WEIGHT (0.7), DEFAULT_KEYWORD_WEIGHT (0.3)
- CHUNK_SIZE_TOKENS (500), CHUNK_OVERLAP_TOKENS (100), MAX_FILE_SIZE_MB
- HNSW_THRESHOLD_VECTORS (2000), IVF_THRESHOLD_VECTORS (1M)
- DEFAULT_BATCH_SIZE (32), MAX_WORKERS (4)
- MAX_EVENT_HISTORY, METRICS_HISTORY_SIZE
- TAVILY_FREE_TIER_LIMIT, CLAUDE_RATE_LIMIT_REQUESTS
- DEFAULT_HTTP_TIMEOUT, EMBEDDING_TIMEOUT, SEARCH_TIMEOUT

All magic numbers throughout the codebase should reference these constants for consistency and maintainability.
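A usage sketch, assuming these names are importable from rag_cli.core.constants (the module shown in the project layout):

```python
# Referencing centralized constants instead of magic numbers.
# Module path comes from the project layout; constant names are the ones
# listed above. Importability from this exact module is an assumption.
from rag_cli.core.constants import (
    CHUNK_SIZE_TOKENS,
    CHUNK_OVERLAP_TOKENS,
    DEFAULT_TOP_K,
    DEFAULT_VECTOR_WEIGHT,
    DEFAULT_KEYWORD_WEIGHT,
)

def chunk_step() -> int:
    # Advance by chunk size minus overlap rather than a hard-coded number.
    return CHUNK_SIZE_TOKENS - CHUNK_OVERLAP_TOKENS

def hybrid_score(vector_score: float, keyword_score: float) -> float:
    # Weighted blend of vector and keyword similarity (0.7 / 0.3 by default).
    return DEFAULT_VECTOR_WEIGHT * vector_score + DEFAULT_KEYWORD_WEIGHT * keyword_score

print(chunk_step(), hybrid_score(0.8, 0.5), DEFAULT_TOP_K)
```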
- CHUNK_SIZE_TOKENS (500 tokens)
- CHUNK_OVERLAP_TOKENS (100 tokens, 20%)
- MAX_FILE_SIZE_MB (10 MB)
- DEFAULT_BATCH_SIZE (32)
- EMBEDDING_CACHE_SIZE (1000)
- HNSW_THRESHOLD_VECTORS (2000 vectors)
- IVF_THRESHOLD_VECTORS (1M+ vectors)
- DEFAULT_VECTOR_WEIGHT (0.7) + DEFAULT_KEYWORD_WEIGHT (0.3)
- DEFAULT_TOP_K (5), max: MAX_TOP_K (100)
- SEARCH_TIMEOUT
- RESPONSE_CACHE_MAX_SIZE (100)
- CLAUDE_RATE_LIMIT_REQUESTS (100/min)
- MAX_EVENT_HISTORY (100)
- METRICS_HISTORY_SIZE (1000)
- TCP_CHECK_CACHE_SECONDS (30)

This section documents the enhanced persistence and update strategies implemented in v2.0 to ensure reliable document management across sessions.
Vector data persists on disk under data/vectors/chroma_db/.

Upsert support (vector_store.py:270-361): use upsert() instead of add() when re-indexing documents to prevent duplicates:
```python
# PREFERRED: Update existing or insert new
vector_store.upsert(
    embeddings=embeddings,
    texts=texts,
    ids=optional_ids,  # If None, auto-generates
    metadata=metadata,
    sources=sources
)

# OLD WAY: Always adds, creates duplicates on re-index
vector_store.add(embeddings, texts, metadata, sources)
```
When to use upsert(): whenever the documents being indexed may already exist in the store, for example when re-indexing a modified file or re-running an indexing job over the same directory.
Source-based operations (vector_store.py:492-612):

Get all vectors from a source:
```python
# Returns List[VectorMetadata]
vectors = vector_store.get_by_source("path/to/document.md")
print(f"Found {len(vectors)} chunks from document")
```
Delete all vectors from a source:
```python
# Useful before re-indexing a modified file
deleted_count = vector_store.delete_by_source("path/to/document.md")
```
Replace all vectors from a source:
```python
# Combines delete + add in one operation
new_ids = vector_store.update_by_source(
    source="path/to/document.md",
    embeddings=new_embeddings,
    texts=new_texts,
    metadata=new_metadata
)
```
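Putting the source-based operations together, a re-index of one modified document might look like the sketch below. The vector-store calls are the ones documented above; the chunking logic and metadata format are simplified assumptions, and embeddings come straight from sentence-transformers for illustration.

```python
# Re-index one modified document using the documented source-based API.
# Chunking here is a naive paragraph split for illustration only; the real
# pipeline uses document_processor.py. Metadata shape is assumed.
from pathlib import Path
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

def reindex_document(vector_store, path: str) -> None:
    text = Path(path).read_text(encoding="utf-8")
    chunks = [c for c in text.split("\n\n") if c.strip()]
    embeddings = model.encode(chunks).tolist()
    metadata = [{"source": path, "chunk": i} for i in range(len(chunks))]

    if vector_store.get_by_source(path):
        # Replace existing chunks for this source in one operation
        vector_store.update_by_source(
            source=path, embeddings=embeddings, texts=chunks, metadata=metadata
        )
    else:
        vector_store.upsert(
            embeddings=embeddings, texts=chunks, metadata=metadata,
            sources=[path] * len(chunks),
        )
```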
The indexing pipeline (cli/index.py:124-181) now integrates content hash-based duplicate detection (a sketch of the hash check follows the mode descriptions below):
Incremental Indexing (skip unchanged documents):
```bash
rag-index ./docs --recursive --incremental
```
This mode skips documents whose content hash has not changed since the last indexing run.
Update Mode (replace changed documents):
```bash
rag-index ./docs --recursive --update
```
This mode re-indexes documents whose content has changed, replacing their existing vectors instead of adding duplicates.
Combine both modes:
```bash
rag-index ./docs --recursive --incremental --update
```
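To make the hash-based skipping concrete, here is a minimal sketch of the decision logic. It assumes a simple JSON registry mapping file paths to SHA-256 digests, in the spirit of data/vectors/content_hashes.json; the actual registry format may differ.

```python
# Hash-based duplicate detection sketch. The {path: sha256} registry layout
# is an assumption; the real format used by cli/index.py may differ.
import hashlib
import json
from pathlib import Path

REGISTRY = Path("data/vectors/content_hashes.json")

def load_registry() -> dict:
    return json.loads(REGISTRY.read_text()) if REGISTRY.exists() else {}

def content_hash(path: Path) -> str:
    return hashlib.sha256(path.read_bytes()).hexdigest()

def needs_indexing(path: Path, registry: dict) -> bool:
    # Unchanged documents are skipped (--incremental); changed ones are re-indexed (--update).
    return registry.get(str(path)) != content_hash(path)

registry = load_registry()
for doc in Path("./docs").rglob("*.md"):
    if needs_indexing(doc, registry):
        print("index:", doc)
        registry[str(doc)] = content_hash(doc)
    else:
        print("skip (unchanged):", doc)

REGISTRY.parent.mkdir(parents=True, exist_ok=True)
REGISTRY.write_text(json.dumps(registry, indent=2))
```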
Metadata validation (vector_store.py:186-225): all metadata is validated before storage; an illustrative sketch follows.
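The exact rules live in the referenced lines of vector_store.py; the sketch below only shows a plausible shape for such validation. The field names and rules here are illustrative assumptions, not the actual implementation.

```python
# Illustrative metadata validation, not the actual rules from vector_store.py.
# ChromaDB metadata values must be scalars (str, int, float, bool), so this
# check normalizes anything else and requires a source field.
from typing import Any

ALLOWED_TYPES = (str, int, float, bool)

def validate_metadata(metadata: dict[str, Any]) -> dict[str, Any]:
    if "source" not in metadata:
        raise ValueError("metadata must include a 'source' field")
    cleaned = {}
    for key, value in metadata.items():
        if value is None:
            continue  # ChromaDB rejects None values
        cleaned[key] = value if isinstance(value, ALLOWED_TYPES) else str(value)
    return cleaned

print(validate_metadata({"source": "doc.md", "chunk": 3, "tags": ["a", "b"]}))
```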
Session Start (hooks/session-start.py:93-199)

Session End (hooks/session-end.py:130-169)
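What these hooks actually do lives in the files referenced above; the sketch below is only an illustration of a session-start-style check. The collection name, persistence path, and verification steps are assumptions.

```python
# Illustrative session-start check, not the real hooks/session-start.py.
# It verifies that the persisted ChromaDB collection is reachable and reports
# how many vectors survived the previous session.
import sys
import chromadb

VECTOR_DIR = "data/vectors/chroma_db"  # persistence path used in this document

def on_session_start() -> None:
    try:
        client = chromadb.PersistentClient(path=VECTOR_DIR)
        collection = client.get_or_create_collection("documents")  # name assumed
        print(f"RAG-CLI: vector store ready, {collection.count()} vectors persisted")
    except Exception as exc:
        print(f"RAG-CLI: vector store unavailable: {exc}", file=sys.stderr)

if __name__ == "__main__":
    on_session_start()
```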
Always use upsert() for re-indexing:
```python
# Good: Update existing entries
vector_store.upsert(embeddings, texts, sources=sources)

# Bad: Creates duplicates on re-index
vector_store.add(embeddings, texts, sources=sources)
```
Use source-based operations for document management:
```python
# Check if document is indexed
existing = vector_store.get_by_source("doc.md")

# Delete before re-indexing (if not using upsert)
if existing:
    vector_store.delete_by_source("doc.md")

# Or use update_by_source for atomic replace
vector_store.update_by_source("doc.md", new_embeddings, new_texts)
```
Use incremental indexing for large document sets:
```bash
# First time: full index
rag-index ./docs --recursive

# Subsequent updates: only changed documents
rag-index ./docs --recursive --incremental --update
```
Trust automatic persistence: ChromaDB's persistent client writes vectors to disk automatically, so no explicit save step is needed after indexing.
Monitor duplicate registry: content hashes are tracked in data/vectors/content_hashes.json.

Run the test suite to verify persistence and updates:
```bash
python test/test_chromadb_persistence.py
```
Tests verify that vectors persist across client restarts and that re-indexing does not create duplicate entries.
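For a quick manual check outside the test suite, a sketch like the following exercises the same persistence guarantee; the collection name and path are assumptions.

```python
# Manual persistence check: write a vector with one client, reopen a fresh
# client against the same path, and confirm the vector is still there.
import chromadb

PATH = "./data/vectors/chroma_db"  # path assumed from this document

client = chromadb.PersistentClient(path=PATH)
col = client.get_or_create_collection("persistence_check")
col.upsert(ids=["probe"], embeddings=[[0.1, 0.2, 0.3]], documents=["probe chunk"])

# Simulate a new session by constructing a fresh client
client2 = chromadb.PersistentClient(path=PATH)
col2 = client2.get_or_create_collection("persistence_check")
assert col2.get(ids=["probe"])["documents"] == ["probe chunk"]
print("vector survived client restart")
```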
Issue: Duplicates in index after re-indexing

Fix: use the --update flag or upsert() instead of add().

Issue: "Collection does not exist" error
Issue: Vectors not persisting across sessions
Issue: Incremental indexing still processes all documents
All imports MUST use the new dual-package structure:
```python
# CORRECT - Core library imports
from rag_cli.core.config import get_config
from rag_cli.core.embeddings import EmbeddingGenerator
from rag_cli.agents.base_agent import BaseAgent
from rag_cli.integrations.tavily_connector import TavilyConnector

# CORRECT - Plugin imports
from rag_cli_plugin.services.service_manager import ServiceManager
from rag_cli_plugin.lifecycle.installer import install_dependencies
from rag_cli_plugin.mcp.unified_server import MCPServer

# INCORRECT - Old v1.x imports (DO NOT USE)
from core.config import get_config
from monitoring.logger import get_logger
from plugin.mcp.unified_server import MCPServer
from src.core.config import get_config
```
The package is installed using pip with both rag_cli and rag_cli_plugin as top-level packages.
Create commits at these milestones:
Use conventional commits:
- feature: new functionality
- fix: bug fixes
- refactor: code improvements
- test: test additions
- docs: documentation

Target metrics:
```python
# Enable debug logging
import logging
logging.basicConfig(level=logging.DEBUG)

# Test embeddings
from sentence_transformers import SentenceTransformer
model = SentenceTransformer('all-MiniLM-L6-v2')
embedding = model.encode("test query")
print(f"Embedding shape: {embedding.shape}")  # Should be (384,)

# Test ChromaDB
import chromadb
client = chromadb.PersistentClient(path="./test_chroma")
collection = client.get_or_create_collection(name="test")
collection.add(
    embeddings=[embedding.tolist()],
    documents=["test query"],
    ids=["test1"]
)
results = collection.query(query_embeddings=[embedding.tolist()], n_results=1)
print(f"Search result: {results}")  # Should return the test document
```
The PostToolUse hook (src/rag_cli_plugin/hooks/response-post.py) is currently disabled due to a JSON parsing bug in the Claude Code plugin framework.
Impact:
Workaround:
The hook entry is disabled in .claude-plugin/hooks.json (line 40: "enabled": false).

Resolution:
See KNOWN_ISSUES.md for detailed information and testing instructions.

For Developers:
- RAG-implementation.md
- KNOWN_ISSUES.md