io.github.shinpr/mcp-local-rag
Easy-to-setup local RAG server with minimal configuration
Verdict not yet evaluated for this tool. The semantic screen takes adversarial cases first; coverage rolls out as the corpus expands (15/150 labels to graduation). The deterministic conformance probe is built but has not yet run on the public corpus, so a recorded verdict here is REVIEW or UNVERIFIED, never a clearing ALLOW. Until a verdict is recorded, an agent should treat this tool as not-yet-cleared and fall back to its own checks.
BASE_DIR: Base directory for document storage (defaults to current working directory). Ignored when BASE_DIRS is set.
BASE_DIRS: JSON array of base directories (e.g. '["/a","/b"]'). Takes precedence over BASE_DIR.
DB_PATH: Path to LanceDB database directory (defaults to ./lancedb/).
CACHE_DIR: Directory where Transformers.js models are cached (defaults to ./models/).
MODEL_NAME: Embedding model name (defaults to Xenova/all-MiniLM-L6-v2).
MAX_FILE_SIZE: Maximum file size in bytes (defaults to 104857600, i.e. 100 MB).
RAG_MAX_DISTANCE: Maximum distance threshold for filtering search results. Results with distance greater than this value will be excluded. Lower values mean stricter filtering (e.g., 0.5 for high relevance only).
RAG_GROUPING: Grouping mode for quality filtering. 'similar' returns only the most similar group (stops at first distance jump). 'related' includes related groups (stops at second distance jump). Unset means no grouping filter.
RAG_MAX_FILES: Maximum number of files to keep in search results. Results are filtered to include only chunks from the top N best-scoring files. For example, 1 returns only the single best-matching file's chunks. Unset means no file filtering.
CHUNK_MIN_LENGTH: Minimum chunk length in characters (1-10000, defaults to 50). Chunks shorter than this threshold are filtered out during ingestion.
RAG_DEVICE: Execution device for the embedder (defaults to cpu). Passed straight to ONNX Runtime; see the Transformers.js device source for the supported backend names. If the requested device fails to initialize, the server throws an error.
RAG_DTYPE: Embedding quantization dtype for the embedder (defaults to fp32). Opt-in and pass-through; accepts any dtype the chosen model provides (fp32, fp16, q8, int8, ...). If the model has no variant for the requested dtype, the server throws an error. Changing this changes the embedding space, so re-ingest existing data.
RAG_HYBRID_WEIGHT: Keyword boost factor for hybrid search (0.0-1.0, defaults to 0.6). 0 means semantic similarity only; higher values increase the keyword-match contribution to the final score.
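
The settings above can be illustrated with a small Python sketch. It is not the server's code (the server itself runs on Transformers.js and LanceDB); it only mirrors the documented defaults, the BASE_DIRS-over-BASE_DIR precedence, and the RAG_MAX_DISTANCE and RAG_MAX_FILES filters. The names RagConfig, load_config and filter_results, and the result shape (dicts with "file" and "distance" keys), are hypothetical. Two behaviours are assumptions rather than documented facts: out-of-range values raise an error, and a file's score is taken to be the lowest distance among its chunks.

```python
import json
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class RagConfig:
    base_dirs: list
    db_path: str
    cache_dir: str
    model_name: str
    max_file_size: int
    chunk_min_length: int
    device: str
    dtype: str
    hybrid_weight: float
    max_distance: Optional[float]
    grouping: Optional[str]
    max_files: Optional[int]


def load_config(env: Mapping[str, str]) -> RagConfig:
    """Resolve settings from an environment mapping using the documented defaults."""
    # BASE_DIRS (a JSON array) takes precedence over BASE_DIR.
    if env.get("BASE_DIRS"):
        base_dirs = json.loads(env["BASE_DIRS"])
        if not isinstance(base_dirs, list) or not all(isinstance(d, str) for d in base_dirs):
            raise ValueError("BASE_DIRS must be a JSON array of strings")
    else:
        base_dirs = [env.get("BASE_DIR") or os.getcwd()]

    # Assumption: invalid values are rejected; the real server may behave differently.
    chunk_min = int(env.get("CHUNK_MIN_LENGTH", "50"))
    if not 1 <= chunk_min <= 10000:
        raise ValueError("CHUNK_MIN_LENGTH must be in 1-10000")
    weight = float(env.get("RAG_HYBRID_WEIGHT", "0.6"))
    if not 0.0 <= weight <= 1.0:
        raise ValueError("RAG_HYBRID_WEIGHT must be in 0.0-1.0")
    grouping = env.get("RAG_GROUPING") or None
    if grouping not in (None, "similar", "related"):
        raise ValueError("RAG_GROUPING must be 'similar', 'related', or unset")

    return RagConfig(
        base_dirs=base_dirs,
        db_path=env.get("DB_PATH", "./lancedb/"),
        cache_dir=env.get("CACHE_DIR", "./models/"),
        model_name=env.get("MODEL_NAME", "Xenova/all-MiniLM-L6-v2"),
        max_file_size=int(env.get("MAX_FILE_SIZE", "104857600")),
        chunk_min_length=chunk_min,
        device=env.get("RAG_DEVICE", "cpu"),
        dtype=env.get("RAG_DTYPE", "fp32"),
        hybrid_weight=weight,
        max_distance=float(env["RAG_MAX_DISTANCE"]) if env.get("RAG_MAX_DISTANCE") else None,
        grouping=grouping,
        max_files=int(env["RAG_MAX_FILES"]) if env.get("RAG_MAX_FILES") else None,
    )


def filter_results(results: list, cfg: RagConfig) -> list:
    """Apply the distance threshold, then keep chunks from the top N files."""
    if cfg.max_distance is not None:
        # Results with distance greater than the threshold are excluded.
        results = [r for r in results if r["distance"] <= cfg.max_distance]
    if cfg.max_files is not None:
        # Assumption: a file is ranked by its best (lowest-distance) chunk.
        best = {}
        for r in results:
            best[r["file"]] = min(r["distance"], best.get(r["file"], float("inf")))
        keep = set(sorted(best, key=best.get)[: cfg.max_files])
        results = [r for r in results if r["file"] in keep]
    return results
```

Calling load_config(os.environ) would read the live environment. RAG_GROUPING is only validated here, because the listing does not say how a "distance jump" is detected.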