Mind — Semantic code search

Last updated April 7, 2026

Hybrid BM25 + vector RAG over a codebase, built as a KB Labs plugin.

One line: Mind is a semantic code search engine — you ask a natural-language question about your codebase, and it returns relevant code with references.

Source: plugins/kb-labs-mind

The problem

grep finds strings. When you need to find a concept — "where does authentication actually happen?", "how does hybrid search combine BM25 and vector scores?" — string search falls apart. You either read the whole repo or ask a colleague who already has the mental map.

Mind gives you that mental map on demand.

What it does

Two commands do most of the work:

Bash

# Build an index from your codebase
pnpm kb mind rag-index --scope default

# Ask a question
pnpm kb mind rag-query --text "How does the state broker handle tenant isolation?" --agent

Under the hood, Mind builds a hybrid index that pairs lexical (BM25) scoring with semantic vector embeddings, stored in Qdrant. At query time, an LLM rewrites the question into multiple sub-queries (except in instant mode, which skips rewriting), and the per-query results are fused via Reciprocal Rank Fusion. Then comes the part that matters: every claim in the answer is verified against the actual retrieved text before it reaches you.
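To make the fusion step concrete, here is a minimal sketch of Reciprocal Rank Fusion. It is not Mind's actual code: the function name, the file names, and the smoothing constant k = 60 (a commonly used default) are assumptions for illustration. Each document scores 1 / (k + rank) in every list it appears in, and the scores are summed.

```typescript
// Reciprocal Rank Fusion: merge several ranked lists into one.
// Each ranking is an array of document ids, best match first.
// Assumption: k = 60 is a conventional default, not a value taken from Mind.
function reciprocalRankFusion(
  rankings: string[][],
  k = 60,
): Array<[string, number]> {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  // Highest fused score first.
  return [...scores.entries()].sort((a, b) => b[1] - a[1]);
}

// Hypothetical result lists for a single query.
const bm25Hits = ["auth.ts", "session.ts", "router.ts"];
const vectorHits = ["session.ts", "tokens.ts", "auth.ts"];

const fused = reciprocalRankFusion([bm25Hits, vectorHits]);
console.log(fused.map(([id]) => id));
// ["session.ts", "auth.ts", "tokens.ts", "router.ts"]
```

Documents that both retrievers rank highly float to the top, and because only ranks are used, BM25 and vector scores never need to be normalized onto a common scale.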

Architecture sketch

mind-cli (commands)
  └─ mind-orchestrator  (query rewriting, mode selection: instant / auto / thinking)
      └─ mind-engine    (hybrid search: BM25 + vectors + RRF)
          ├─ mind-vector-store  (Qdrant)
          ├─ mind-embeddings    (OpenAI or deterministic fallback)
          └─ mind-indexer       (incremental, watches file changes)

KB Labs primitives used

Reading the manifest tells you the whole story of what Mind needs from the platform:

TypeScript

platform: {
  requires: ['llm', 'embeddings', 'vectorStore', 'cache', 'storage'],
  optional: ['analytics', 'logger'],
},

So Mind leans on:

llm — for query rewriting and answer synthesis.
embeddings — for vector search (with a deterministic fallback so it still works offline).
vectorStore — namespace-scoped (mind:) access to the platform's Qdrant adapter.
cache — to avoid recomputing embeddings and LLM calls.
CLI commands — mind:init, mind:verify, mind:rag-index, mind:rag-query, plus mind:sync-* for incremental updates.

Notice what's not there: Mind doesn't open database connections, doesn't ship with an LLM SDK, and doesn't talk to the vector store directly (it reaches Qdrant only through the platform's adapter). Those are platform services; the plugin just declares what it needs.
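As a rough illustration of "declare what you need", the sketch below shows plugin-side code that depends only on injected services. Everything here is hypothetical: the EmbeddingService and EmbeddingCache shapes, hashEmbed, and cachedEmbed are invented for the example and are not the KB Labs API. The calls are synchronous for brevity, and treating the embedding service as absent is just one way to picture the offline fallback.

```typescript
// Hypothetical service shapes; the real platform interfaces may differ.
interface EmbeddingService {
  embed(text: string): number[];
}
interface EmbeddingCache {
  get(key: string): number[] | undefined;
  set(key: string, value: number[]): void;
}

// Deterministic fallback (an assumption about how one could work):
// hash each token into a fixed-size vector, then L2-normalize.
// Same input always gives the same output, with no network access.
function hashEmbed(text: string, dims = 8): number[] {
  const vec = new Array<number>(dims).fill(0);
  for (const token of text.toLowerCase().split(/\W+/).filter(Boolean)) {
    let h = 0;
    for (let i = 0; i < token.length; i++) {
      h = (h * 31 + token.charCodeAt(i)) >>> 0; // keep h in uint32 range
    }
    vec[h % dims] += 1;
  }
  const norm = Math.sqrt(vec.reduce((sum, x) => sum + x * x, 0)) || 1;
  return vec.map((x) => x / norm);
}

// Plugin-side code: sees only the interfaces it was handed.
function cachedEmbed(
  text: string,
  services: { cache: EmbeddingCache; embeddings?: EmbeddingService },
): number[] {
  const hit = services.cache.get(text);
  if (hit) return hit; // skip recomputation on a cache hit
  const vector = services.embeddings
    ? services.embeddings.embed(text)
    : hashEmbed(text);
  services.cache.set(text, vector);
  return vector;
}

// Usage with an in-memory cache standing in for the platform's.
const memory = new Map<string, number[]>();
const memoryCache: EmbeddingCache = {
  get: (key) => memory.get(key),
  set: (key, value) => {
    memory.set(key, value);
  },
};
const first = cachedEmbed("tenant isolation", { cache: memoryCache });
const second = cachedEmbed("tenant isolation", { cache: memoryCache });
console.log(first === second); // true: the second call is served from the cache
```

Swapping the in-memory cache or the embedding provider changes nothing in cachedEmbed, which is the point of the pattern.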

What to steal

If you're building a plugin that does "search over X with embeddings", the interesting pattern is the mode selection in the orchestrator: instant (no LLM rewriting), auto (one rewrite pass), thinking (multi-step reasoning). It's a clean way to let users trade latency for quality without forcing them to understand your internals.
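If you want to borrow the idea, one way to sketch it is a small table that maps each mode to how much rewriting it buys. The names below (QueryPlan, PLANS, expandQueries) are hypothetical, and modeling thinking as "two rewrite passes" is an assumption made for the example; the source only says it does multi-step reasoning. The rewriter is a synchronous stub standing in for an LLM call.

```typescript
type Mode = "instant" | "auto" | "thinking";

interface QueryPlan {
  rewritePasses: number; // how many rounds of query rewriting to run
}

// Assumption: "thinking" is modeled as extra rewrite passes, purely for illustration.
const PLANS: Record<Mode, QueryPlan> = {
  instant: { rewritePasses: 0 }, // no LLM rewriting
  auto: { rewritePasses: 1 }, // one rewrite pass
  thinking: { rewritePasses: 2 },
};

// Hypothetical rewriter: turns one query into several sub-queries.
type Rewriter = (query: string) => string[];

function expandQueries(question: string, mode: Mode, rewrite: Rewriter): string[] {
  const all = new Set<string>([question]);
  let frontier = [question];
  for (let pass = 0; pass < PLANS[mode].rewritePasses; pass++) {
    // Rewrite only the queries produced by the previous pass, skipping duplicates.
    frontier = frontier.flatMap((q) => rewrite(q)).filter((q) => !all.has(q));
    frontier.forEach((q) => all.add(q));
  }
  return [...all];
}

// Stub rewriter for the example; a real one would call the platform's llm service.
const stubRewrite: Rewriter = (q) => [`${q} implementation`, `${q} tests`];

console.log(expandQueries("tenant isolation", "instant", stubRewrite).length); // 1
console.log(expandQueries("tenant isolation", "auto", stubRewrite).length); // 3
console.log(expandQueries("tenant isolation", "thinking", stubRewrite).length); // 7
```

The caller picks a single word, and the latency and quality trade-off stays inside the table.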

Going deeper

Reference manifest
ADRs on search architecture — hybrid search, anti-hallucination, adaptive weights
Guide: Your First Plugin — if Mind inspired you to build something