```QMD - 快速 Markdown 搜索```
QMD - Quick Markdown Search

原始链接: https://github.com/tobi/qmd

## QMD:您的本地设备搜索引擎 QMD 是一款强大的本地运行搜索引擎,旨在快速查找笔记、文档和记录中的信息。它索引 Markdown 文件、会议记录和知识库,提供关键词和自然语言搜索功能,非常适合代理工作流程。 QMD 结合了三种搜索方法:**BM25**(快速关键词搜索)、**向量语义搜索**和 **LLM 重排序**——所有这些都由 `node-llama-cpp` 和 GGUF 模型提供支持(自动下载)。 **主要特点:** * **集合:** 将内容组织成可搜索的组(例如,“笔记”、“会议”)。 * **上下文:** 为集合添加描述性元数据,以提高搜索相关性。 * **搜索模式:** 在快速关键词搜索 (`vsearch`)、语义搜索 (`query`) 或混合方法之间进行选择,以获得最佳结果。 * **输出格式:** 将结果导出为 JSON、Markdown 或文件,以便与代理无缝集成。 * **MCP 服务器:** 通过模型上下文协议与 Claude 等工具集成。 **安装:** `bun install -g https://github.com/tobi/qmd` QMD 利用复杂的搜索流程,包括查询扩展、倒数排名融合和位置感知混合,以提供高度相关的结果。它将数据存储在本地 SQLite 数据库中,并利用三个 GGUF 模型进行嵌入、重排序和查询扩展。

Hacker News 新闻 | 过去 | 评论 | 提问 | 展示 | 招聘 | 提交 登录 QMD - 快速 Markdown 搜索 (github.com/tobi) 6 分,由 saikatsg 1小时前发布 | 隐藏 | 过去 | 收藏 | 1 条评论 xal 11分钟前 [–] 你好橙色页面,我是作者。QMD是我在与搜索和检索团队会议中学习到的最佳实践的实现。我试图使其不矫枉过正并保持本地化。目前正在努力微调更好的查询扩展和重排序模型(finetune 分支)。欢迎参与帮助。自写代码和 AI PR 均可。回复 指南 | 常见问题 | 列表 | API | 安全 | 法律 | 申请 YC | 联系 搜索:
相关文章

原文

An on-device search engine for everything you need to remember. Index your markdown notes, meeting transcripts, documentation, and knowledge bases. Search with keywords or natural language. Ideal for your agentic flows.

QMD combines BM25 full-text search, vector semantic search, and LLM re-ranking—all running locally via node-llama-cpp with GGUF models.

# Install globally
bun install -g https://github.com/tobi/qmd

# Create collections for your notes, docs, and meeting transcripts
qmd collection add ~/notes --name notes
qmd collection add ~/Documents/meetings --name meetings
qmd collection add ~/work/docs --name docs

# Add context to help with search results
qmd context add qmd://notes "Personal notes and ideas"
qmd context add qmd://meetings "Meeting transcripts and notes"
qmd context add qmd://docs "Work documentation"

# Generate embeddings for semantic search
qmd embed

# Search across everything
qmd search "project timeline"           # Fast keyword search
qmd vsearch "how to deploy"             # Semantic search
qmd query "quarterly planning process"  # Hybrid + reranking (best quality)

# Get a specific document
qmd get "meetings/2024-01-15.md"

# Get a document by docid (shown in search results)
qmd get "#abc123"

# Get multiple documents by glob pattern
qmd multi-get "journals/2025-05*.md"

# Search within a specific collection
qmd search "API" -c notes

# Export all matches for an agent
qmd search "API" --all --files --min-score 0.3

QMD's --json and --files output formats are designed for agentic workflows:

# Get structured results for an LLM
qmd search "authentication" --json -n 10

# List all relevant files above a threshold
qmd query "error handling" --all --files --min-score 0.4

# Retrieve full document content
qmd get "docs/api-reference.md" --full

Although the tool works perfectly fine when you just tell your agent to use it on the command line, it also exposes an MCP (Model Context Protocol) server for tighter integration.

Tools exposed:

  • qmd_search - Fast BM25 keyword search (supports collection filter)
  • qmd_vsearch - Semantic vector search (supports collection filter)
  • qmd_query - Hybrid search with reranking (supports collection filter)
  • qmd_get - Retrieve document by path or docid (with fuzzy matching suggestions)
  • qmd_multi_get - Retrieve multiple documents by glob pattern, list, or docids
  • qmd_status - Index health and collection info

Claude Desktop configuration (~/Library/Application Support/Claude/claude_desktop_config.json):

{
  "mcpServers": {
    "qmd": {
      "command": "qmd",
      "args": ["mcp"]
    }
  }
}

Claude Code configuration (~/.claude/settings.json):

{
  "mcpServers": {
    "qmd": {
      "command": "qmd",
      "args": ["mcp"]
    }
  }
}
┌─────────────────────────────────────────────────────────────────────────────┐
│                         QMD Hybrid Search Pipeline                          │
└─────────────────────────────────────────────────────────────────────────────┘

                              ┌─────────────────┐
                              │   User Query    │
                              └────────┬────────┘
                                       │
                        ┌──────────────┴──────────────┐
                        ▼                             ▼
               ┌────────────────┐            ┌────────────────┐
               │ Query Expansion│            │  Original Query│
               │   (Qwen3-1.7B) │            │   (×2 weight)  │
               └───────┬────────┘            └───────┬────────┘
                       │                             │
                       │ 2 alternative queries       │
                       └──────────────┬──────────────┘
                                      │
              ┌───────────────────────┼───────────────────────┐
              ▼                       ▼                       ▼
     ┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
     │ Original Query  │     │ Expanded Query 1│     │ Expanded Query 2│
     └────────┬────────┘     └────────┬────────┘     └────────┬────────┘
              │                       │                       │
      ┌───────┴───────┐       ┌───────┴───────┐       ┌───────┴───────┐
      ▼               ▼       ▼               ▼       ▼               ▼
  ┌───────┐       ┌───────┐ ┌───────┐     ┌───────┐ ┌───────┐     ┌───────┐
  │ BM25  │       │Vector │ │ BM25  │     │Vector │ │ BM25  │     │Vector │
  │(FTS5) │       │Search │ │(FTS5) │     │Search │ │(FTS5) │     │Search │
  └───┬───┘       └───┬───┘ └───┬───┘     └───┬───┘ └───┬───┘     └───┬───┘
      │               │         │             │         │             │
      └───────┬───────┘         └──────┬──────┘         └──────┬──────┘
              │                        │                       │
              └────────────────────────┼───────────────────────┘
                                       │
                                       ▼
                          ┌───────────────────────┐
                          │   RRF Fusion + Bonus  │
                          │  Original query: ×2   │
                          │  Top-rank bonus: +0.05│
                          │     Top 30 Kept       │
                          └───────────┬───────────┘
                                      │
                                      ▼
                          ┌───────────────────────┐
                          │    LLM Re-ranking     │
                          │  (qwen3-reranker)     │
                          │  Yes/No + logprobs    │
                          └───────────┬───────────┘
                                      │
                                      ▼
                          ┌───────────────────────┐
                          │  Position-Aware Blend │
                          │  Top 1-3:  75% RRF    │
                          │  Top 4-10: 60% RRF    │
                          │  Top 11+:  40% RRF    │
                          └───────────────────────┘

Score Normalization & Fusion

Backend Raw Score Conversion Range
FTS (BM25) SQLite FTS5 BM25 Math.abs(score) 0 to ~25+
Vector Cosine distance 1 / (1 + distance) 0.0 to 1.0
Reranker LLM 0-10 rating score / 10 0.0 to 1.0

The query command uses Reciprocal Rank Fusion (RRF) with position-aware blending:

  1. Query Expansion: Original query (×2 for weighting) + 1 LLM variation
  2. Parallel Retrieval: Each query searches both FTS and vector indexes
  3. RRF Fusion: Combine all result lists using score = Σ(1/(k+rank+1)) where k=60
  4. Top-Rank Bonus: Documents ranking #1 in any list get +0.05, #2-3 get +0.02
  5. Top-K Selection: Take top 30 candidates for reranking
  6. Re-ranking: LLM scores each document (yes/no with logprobs confidence)
  7. Position-Aware Blending:
    • RRF rank 1-3: 75% retrieval, 25% reranker (preserves exact matches)
    • RRF rank 4-10: 60% retrieval, 40% reranker
    • RRF rank 11+: 40% retrieval, 60% reranker (trust reranker more)

Why this approach: Pure RRF can dilute exact matches when expanded queries don't match. The top-rank bonus preserves documents that score #1 for the original query. Position-aware blending prevents the reranker from destroying high-confidence retrieval results.

Score Meaning
0.8 - 1.0 Highly relevant
0.5 - 0.8 Moderately relevant
0.2 - 0.5 Somewhat relevant
0.0 - 0.2 Low relevance
  • Bun >= 1.0.0
  • macOS: Homebrew SQLite (for extension support)

GGUF Models (via node-llama-cpp)

QMD uses three local GGUF models (auto-downloaded on first use):

Model Purpose Size
embeddinggemma-300M-Q8_0 Vector embeddings ~300MB
qwen3-reranker-0.6b-q8_0 Re-ranking ~640MB
Qwen3-1.7B-Q8_0 Query expansion ~2.2GB

Models are downloaded from HuggingFace and cached in ~/.cache/qmd/models/.

# Create a collection from current directory
qmd collection add . --name myproject

# Create a collection with explicit path and custom glob mask
qmd collection add ~/Documents/notes --name notes --mask "**/*.md"

# List all collections
qmd collection list

# Remove a collection
qmd collection remove myproject

# Rename a collection
qmd collection rename myproject my-project

# List files in a collection
qmd ls notes
qmd ls notes/subfolder

Generate Vector Embeddings

# Embed all indexed documents (800 tokens/chunk, 15% overlap)
qmd embed

# Force re-embed everything
qmd embed -f

Context adds descriptive metadata to collections and paths, helping search understand your content.

# Add context to a collection (using qmd:// virtual paths)
qmd context add qmd://notes "Personal notes and ideas"
qmd context add qmd://docs/api "API documentation"

# Add context from within a collection directory
cd ~/notes && qmd context add "Personal notes and ideas"
cd ~/notes/work && qmd context add "Work-related notes"

# Add global context (applies to all collections)
qmd context add / "Knowledge base for my projects"

# List all contexts
qmd context list

# Remove context
qmd context rm qmd://notes/old
┌──────────────────────────────────────────────────────────────────┐
│                        Search Modes                              │
├──────────┬───────────────────────────────────────────────────────┤
│ search   │ BM25 full-text search only                           │
│ vsearch  │ Vector semantic search only                          │
│ query    │ Hybrid: FTS + Vector + Query Expansion + Re-ranking  │
└──────────┴───────────────────────────────────────────────────────┘
# Full-text search (fast, keyword-based)
qmd search "authentication flow"

# Vector search (semantic similarity)
qmd vsearch "how to login"

# Hybrid search with re-ranking (best quality)
qmd query "user authentication"
# Search options
-n <num>           # Number of results (default: 5, or 20 for --files/--json)
-c, --collection   # Restrict search to a specific collection
--all              # Return all matches (use with --min-score to filter)
--min-score <num>  # Minimum score threshold (default: 0)
--full             # Show full document content
--line-numbers     # Add line numbers to output
--index <name>     # Use named index

# Output formats (for search and multi-get)
--files            # Output: docid,score,filepath,context
--json             # JSON output with snippets
--csv              # CSV output
--md               # Markdown output
--xml              # XML output

# Get options
qmd get <file>[:line]  # Get document, optionally starting at line
-l <num>               # Maximum lines to return
--from <num>           # Start from line number

# Multi-get options
-l <num>           # Maximum lines per file
--max-bytes <num>  # Skip files larger than N bytes (default: 10KB)

Default output is colorized CLI format (respects NO_COLOR env):

docs/guide.md:42 #a1b2c3
Title: Software Craftsmanship
Context: Work documentation
Score: 93%

This section covers the **craftsmanship** of building
quality software with attention to detail.
See also: engineering principles


notes/meeting.md:15 #d4e5f6
Title: Q4 Planning
Context: Personal notes and ideas
Score: 67%

Discussion about code quality and craftsmanship
in the development process.
  • Path: Collection-relative path (e.g., docs/guide.md)
  • Docid: Short hash identifier (e.g., #a1b2c3) - use with qmd get #a1b2c3
  • Title: Extracted from document (first heading or filename)
  • Context: Path context if configured via qmd context add
  • Score: Color-coded (green >70%, yellow >40%, dim otherwise)
  • Snippet: Context around match with query terms highlighted
# Get 10 results with minimum score 0.3
qmd query -n 10 --min-score 0.3 "API design patterns"

# Output as markdown for LLM context
qmd search --md --full "error handling"

# JSON output for scripting
qmd query --json "quarterly reports"

# Use separate index for different knowledge base
qmd --index work search "quarterly reports"
# Show index status and collections with contexts
qmd status

# Re-index all collections
qmd update

# Re-index with git pull first (for remote repos)
qmd update --pull

# Get document by filepath (with fuzzy matching suggestions)
qmd get notes/meeting.md

# Get document by docid (from search results)
qmd get "#abc123"

# Get document starting at line 50, max 100 lines
qmd get notes/meeting.md:50 -l 100

# Get multiple documents by glob pattern
qmd multi-get "journals/2025-05*.md"

# Get multiple documents by comma-separated list (supports docids)
qmd multi-get "doc1.md, doc2.md, #abc123"

# Limit multi-get to files under 20KB
qmd multi-get "docs/*.md" --max-bytes 20480

# Output multi-get as JSON for agent processing
qmd multi-get "docs/*.md" --json

# Clean up cache and orphaned data
qmd cleanup

Index stored in: ~/.cache/qmd/index.sqlite

collections     -- Indexed directories with name and glob patterns
path_contexts   -- Context descriptions by virtual path (qmd://...)
documents       -- Markdown content with metadata and docid (6-char hash)
documents_fts   -- FTS5 full-text index
content_vectors -- Embedding chunks (hash, seq, pos, 800 tokens each)
vectors_vec     -- sqlite-vec vector index (hash_seq key)
llm_cache       -- Cached LLM responses (query expansion, rerank scores)
Variable Default Description
XDG_CACHE_HOME ~/.cache Cache directory location
Collection ──► Glob Pattern ──► Markdown Files ──► Parse Title ──► Hash Content
    │                                                   │              │
    │                                                   │              ▼
    │                                                   │         Generate docid
    │                                                   │         (6-char hash)
    │                                                   │              │
    └──────────────────────────────────────────────────►└──► Store in SQLite
                                                                       │
                                                                       ▼
                                                                  FTS5 Index

Documents are chunked into 800-token pieces with 15% overlap:

Document ──► Chunk (800 tokens) ──► Format each chunk ──► node-llama-cpp ──► Store Vectors
                │                    "title | text"        embedBatch()
                │
                └─► Chunks stored with:
                    - hash: document hash
                    - seq: chunk sequence (0, 1, 2...)
                    - pos: character position in original
Query ──► LLM Expansion ──► [Original, Variant 1, Variant 2]
                │
      ┌─────────┴─────────┐
      ▼                   ▼
   For each query:     FTS (BM25)
      │                   │
      ▼                   ▼
   Vector Search      Ranked List
      │
      ▼
   Ranked List
      │
      └─────────┬─────────┘
                ▼
         RRF Fusion (k=60)
         Original query ×2 weight
         Top-rank bonus: +0.05/#1, +0.02/#2-3
                │
                ▼
         Top 30 candidates
                │
                ▼
         LLM Re-ranking
         (yes/no + logprob confidence)
                │
                ▼
         Position-Aware Blend
         Rank 1-3:  75% RRF / 25% reranker
         Rank 4-10: 60% RRF / 40% reranker
         Rank 11+:  40% RRF / 60% reranker
                │
                ▼
         Final Results

Models are configured in src/llm.ts as HuggingFace URIs:

const DEFAULT_EMBED_MODEL = "hf:ggml-org/embeddinggemma-300M-GGUF/embeddinggemma-300M-Q8_0.gguf";
const DEFAULT_RERANK_MODEL = "hf:ggml-org/Qwen3-Reranker-0.6B-Q8_0-GGUF/qwen3-reranker-0.6b-q8_0.gguf";
const DEFAULT_GENERATE_MODEL = "hf:ggml-org/Qwen3-1.7B-GGUF/Qwen3-1.7B-Q8_0.gguf";

EmbeddingGemma Prompt Format

// For queries
"task: search result | query: {query}"

// For documents
"title: {title} | text: {content}"

Uses node-llama-cpp's createRankingContext() and rankAndSort() API for cross-encoder reranking. Returns documents sorted by relevance score (0.0 - 1.0).

Used for generating query variations via LlamaChatSession.

MIT

联系我们 contact @ memedata.com