Show HN: AI memory with biological decay (52% recall)

Original link: https://github.com/sachitrafa/YourMemory

## YourMemory: Persistent Memory for AI Agents

YourMemory addresses a key limitation of current AI assistants: the lack of memory between sessions. It provides a persistent memory layer that mimics human memory: important information is retained, while irrelevant details fade over time. Installation takes two commands and requires no infrastructure.

On the LoCoMo-10 benchmark, YourMemory shows **2× higher recall than Zep Cloud (59% vs 28%)**, using a hybrid retrieval system that combines vector search, graph expansion, and a decay curve inspired by the Ebbinghaus forgetting curve. Memories are categorised (strategy, fact, preference, and so on) to control how quickly they fade.

YourMemory integrates seamlessly with popular AI clients such as Claude, Cline, Cursor, and OpenCode through a standard MCP server interface. It supports multiple agents with isolated private memories and shared context, protected by API keys.

It uses DuckDB for vector storage and NetworkX for building graph connections, with optional PostgreSQL/pgvector or Neo4j backends for scaling. It handles memory storage, updates, and pruning automatically, letting AI agents learn and retain information across conversations.

## AI Memory with Biological Decay: Summary

A Hacker News post by SachitRafa describes a new approach to AI memory management aimed at the problem of "noisy" context windows in RAG (retrieval-augmented generation) systems. Rather than treating memory as a static archive, the implementation uses the Ebbinghaus forgetting curve, a model of biological memory decay, to prioritise relevant information.

Memories are assigned a "strength" score that is reinforced by recall (spaced repetition) and pruned when left unused. This is combined with a graph layer on top of the vector store to cover cases where semantic search fails to find logically related data. Benchmarks on the LoCoMo dataset show a Recall@5 of 52%, double that of a stateless vector store, with 84% less token waste.

The discussion highlights a debate over the value of long-term AI memory: some argue it can hinder attention and productivity, while others advocate typed memories with different decay rates depending on the kind of information (personality versus short-term intent). The project is built on DuckDB and suggests that, for long-running AI projects, knowing what to forget matters as much as knowing what to remember. The code is available on GitHub: [https://github.com/sachitrafa/cognitive-ai-memory](https://github.com/sachitrafa/cognitive-ai-memory).

Original article

Every session, your AI assistant starts from zero. It asks the same questions, forgets your preferences, re-learns your stack. There is no memory between conversations.

YourMemory fixes that. It gives AI agents a persistent memory layer that works the way human memory does — important things stick, forgotten things fade, outdated facts get replaced automatically. Two commands to install, zero infrastructure required.


Tested on LoCoMo-10 — 1,534 QA pairs across 10 multi-session conversations.

| System | Recall@5 | 95% CI |
|---|---|---|
| YourMemory (BM25 + vector + graph + decay) | 59% | 56–61% |
| Zep Cloud | 28% | 26–30% |

2× better recall than Zep Cloud on the same benchmark.

Full methodology and per-sample breakdown in BENCHMARKS.md. Writeup: I built memory decay for AI agents using the Ebbinghaus forgetting curve.


YourMemory Demo


Supports Python 3.11, 3.12, 3.13, and 3.14. No Docker, no database setup, no external services.

Step 2 — Run setup (once)

Downloads the spaCy language model and initialises the local database at ~/.yourmemory/memories.duckdb.
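
For reference, this is roughly what the setup step amounts to; a minimal sketch assuming the spaCy model is en_core_web_sm and the default database path shown above (the real setup command may do more, such as fetching the embedding model):

# Rough, illustrative equivalent of the one-time setup step.
# The en_core_web_sm model name is an assumption; the source does not name it.
from pathlib import Path

import duckdb
import spacy.cli

spacy.cli.download("en_core_web_sm")              # local NLP model for dedup / triple extraction

db_path = Path.home() / ".yourmemory" / "memories.duckdb"
db_path.parent.mkdir(parents=True, exist_ok=True)
duckdb.connect(str(db_path)).close()              # creates the local DuckDB database file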

Step 3 — Get your config path

Prints your full executable path and a ready-to-paste config block. Copy it.

Step 4 — Wire into your AI client

Claude Code

Add to ~/.claude/settings.json:

{
  "mcpServers": {
    "yourmemory": {
      "command": "yourmemory"
    }
  }
}

Reload (Cmd+Shift+P → Developer: Reload Window).

Claude Desktop

Add to ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows):

{
  "mcpServers": {
    "yourmemory": {
      "command": "yourmemory"
    }
  }
}

Restart Claude Desktop.

Cline (VS Code)

VS Code doesn't inherit your shell PATH. Run yourmemory-path first to get the full executable path.

In Cline → MCP Servers → Edit MCP Settings:

{
  "mcpServers": {
    "yourmemory": {
      "command": "/full/path/to/yourmemory",
      "args": [],
      "env": { "YOURMEMORY_USER": "your_name" }
    }
  }
}

Restart Cline after saving.

Cursor

Add to ~/.cursor/mcp.json:

{
  "mcpServers": {
    "yourmemory": {
      "command": "/full/path/to/yourmemory",
      "args": [],
      "env": { "YOURMEMORY_USER": "your_name" }
    }
  }
}

OpenCode

Add to ~/.config/opencode/config.json:

{
  "mcp": {
    "yourmemory": {
      "type": "local",
      "command": ["yourmemory"],
      "environment": { "YOURMEMORY_USER": "your_name" }
    }
  }
}

Then copy the memory workflow instructions:

cp sample_CLAUDE.md ~/.config/opencode/instructions.md

Restart OpenCode.

Any MCP-compatible client: YourMemory is a standard stdio MCP server. Works with Windsurf, Continue, Zed, and any client that supports MCP. Use the full path from yourmemory-path if the client doesn't inherit shell PATH.

Step 5 — Add memory instructions to your project

cp sample_CLAUDE.md CLAUDE.md

Edit CLAUDE.md — replace YOUR_NAME and YOUR_USER_ID. Claude now follows the recall → store → update workflow automatically on every task.


Three tools. Called by Claude automatically once CLAUDE.md is in place.

| Tool | When | What it does |
|---|---|---|
| recall_memory(query) | Start of every task | Surfaces relevant memories ranked by similarity × strength |
| store_memory(content, importance) | After learning something new | Embeds and stores with biological decay |
| update_memory(id, new_content) | When a memory is outdated | Re-embeds and replaces |

# Example session
store_memory("Sachit prefers tabs over spaces in Python", importance=0.9, category="fact")

# Next session — without being told again:
recall_memory("Python formatting")
# → {"content": "Sachit prefers tabs over spaces in Python", "strength": 0.87}

Categories control how fast memories fade

| Category | Survives without recall | Use case |
|---|---|---|
| strategy | ~38 days | Successful patterns |
| fact | ~24 days | Preferences, identity |
| assumption | ~19 days | Inferred context |
| failure | ~11 days | Errors, environment-specific issues |

Ebbinghaus Forgetting Curve

Memory strength decays exponentially — but importance and recall frequency slow that decay:

effective_λ = base_λ × (1 - importance × 0.8)
strength    = importance × e^(−effective_λ × days) × (1 + recall_count × 0.2)
score       = cosine_similarity × strength

Memories recalled frequently resist decay. Memories below strength 0.05 are pruned automatically every 24 hours.
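
As a minimal sketch (not the project's own code), the decay and scoring rules above translate directly into Python. The per-category base_λ values below are assumptions, chosen so that a memory of importance 0.5 with no recalls crosses the 0.05 prune threshold at roughly the survival times listed in the category table:

import math

# Illustrative base decay rates per category (assumed, not taken from the source).
BASE_LAMBDA = {"strategy": 0.10, "fact": 0.16, "assumption": 0.20, "failure": 0.35}

def strength(importance: float, category: str, days: float, recall_count: int = 0) -> float:
    """strength = importance * e^(-effective_λ * days) * (1 + recall_count * 0.2)"""
    effective_lambda = BASE_LAMBDA[category] * (1 - importance * 0.8)
    return importance * math.exp(-effective_lambda * days) * (1 + recall_count * 0.2)

def score(cosine_similarity: float, importance: float, category: str,
          days: float, recall_count: int = 0) -> float:
    """Retrieval score = cosine_similarity * strength."""
    return cosine_similarity * strength(importance, category, days, recall_count)

# Example: a 0.5-importance "fact" with no recalls falls below 0.05 after
# ln(0.5 / 0.05) / (0.16 * 0.6) ≈ 24 days, matching the category table above.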

Hybrid Retrieval: Vector + Graph

Retrieval runs in two rounds to surface related context that vocabulary-based search misses:

Round 1 — Vector search: cosine similarity against all memories, returns top-k above threshold.

Round 2 — Graph expansion: BFS traversal from Round 1 seeds surfaces memories that share context but not vocabulary — connected via semantic edges (cosine similarity ≥ 0.4).

recall("Python backend")
  Round 1 → [1] Python/MongoDB    (sim=0.61)
             [2] DuckDB/spaCy     (sim=0.19)
  Round 2 → [5] Docker/Kubernetes (sim=0.29 — below cut-off, surfaced via graph)

Chain-aware pruning: A decayed memory is kept alive if any graph neighbour is above the prune threshold. Related memories age together.
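
For intuition, here is a minimal sketch of the two-round retrieval using NumPy and NetworkX. It is illustrative only: the Round 1 threshold and hop count are assumptions; only the ≥ 0.4 edge rule and the sim × strength ranking come from the text above.

import networkx as nx
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def recall(query_vec, memories: dict, graph: nx.Graph, k: int = 5,
           threshold: float = 0.3, hops: int = 1) -> list:
    # Round 1: cosine similarity against all memories, keep top-k above an assumed threshold
    sims = {mid: cosine(query_vec, m["embedding"]) for mid, m in memories.items()}
    seeds = sorted((mid for mid, s in sims.items() if s >= threshold),
                   key=sims.get, reverse=True)[:k]

    # Round 2: BFS from the seeds surfaces neighbours that share context but not
    # vocabulary (edges were built from cosine similarity >= 0.4 at store time)
    expanded = set(seeds)
    for seed in seeds:
        if seed in graph:
            expanded.update(nx.single_source_shortest_path_length(graph, seed, cutoff=hops))

    # Final ranking: similarity × decayed strength, as in the scoring formula
    return sorted(expanded, key=lambda mid: sims[mid] * memories[mid]["strength"],
                  reverse=True)[:k]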


Multiple agents can share the same YourMemory instance — each with isolated private memories and controlled access to shared context.

from src.services.api_keys import register_agent

result = register_agent(
    agent_id="coding-agent",
    user_id="sachit",
    can_read=["shared", "private"],
    can_write=["shared", "private"],
)
# → result["api_key"]  — ym_xxxx, shown once only

Pass api_key to any MCP call to authenticate as an agent:

store_memory(content="Staging uses self-signed cert — skip SSL verify",
             importance=0.7, category="failure",
             api_key="ym_xxxx", visibility="private")

recall_memory(query="staging SSL", api_key="ym_xxxx")
# → returns shared memories + this agent's private memories
# → other agents see shared only

| Component | Role |
|---|---|
| DuckDB | Default vector DB — zero setup, native cosine similarity |
| NetworkX | Default graph backend — persists at ~/.yourmemory/graph.pkl |
| sentence-transformers | Local embeddings (all-mpnet-base-v2, 768 dims) |
| spaCy | Local NLP for deduplication and SVO triple extraction |
| APScheduler | Automatic 24h decay job |
| PostgreSQL + pgvector | Optional — for teams or large datasets |
| Neo4j | Optional graph backend — pip install 'yourmemory[neo4j]' |
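
The embedding model loads locally through the standard sentence-transformers API; a quick sketch (the model name and dimensionality are the ones listed in the table):

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-mpnet-base-v2")   # runs locally, no API calls
vec = model.encode("Sachit prefers tabs over spaces in Python")
print(vec.shape)                                   # (768,)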

PostgreSQL setup (optional)

pip install yourmemory[postgres]

Create a .env file:

DATABASE_URL=postgresql://YOUR_USER@localhost:5432/yourmemory

macOS

brew install postgresql@16 pgvector && brew services start postgresql@16
createdb yourmemory

Ubuntu / Debian

sudo apt install postgresql postgresql-contrib postgresql-16-pgvector
createdb yourmemory

Claude / Cline / Cursor / Any MCP client
    │
    ├── recall_memory(query, api_key?)
    │       └── embed → vector similarity (Round 1)
    │               → graph BFS expansion  (Round 2)
    │               → score = sim × strength → top-k
    │               → recall propagation → boost neighbours
    │
    ├── store_memory(content, importance, category?, visibility?, api_key?)
    │       └── question? → reject
    │               contradiction check → update if conflict
    │               embed() → INSERT → index_memory() → graph node + edges
    │
    └── update_memory(id, new_content, importance)
            └── embed(new_content) → UPDATE → refresh graph node

  Vector DB (Round 1)             Graph DB (Round 2)
  DuckDB (default)                NetworkX (default)
    memories.duckdb                 graph.pkl
    ├── embedding FLOAT[768]        ├── nodes: memory_id, strength
    ├── importance FLOAT            └── edges: sim × verb_weight ≥ 0.4
    ├── recall_count INTEGER
    ├── visibility VARCHAR        Neo4j (opt-in)
    └── agent_id VARCHAR            └── bolt://localhost:7687
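
A rough DuckDB sketch of the vector-store layout implied by the diagram; the column names and types are read off the diagram, while the table name and the memory_id/content columns are assumptions:

import duckdb

con = duckdb.connect("memories.duckdb")
con.execute("""
    CREATE TABLE IF NOT EXISTS memories (
        memory_id    VARCHAR PRIMARY KEY,   -- assumed key column
        content      VARCHAR,               -- assumed text column
        embedding    FLOAT[768],            -- fixed-size array, compared by cosine similarity
        importance   FLOAT,
        recall_count INTEGER,
        visibility   VARCHAR,               -- 'shared' or 'private'
        agent_id     VARCHAR
    )
""")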

Benchmarks use the LoCoMo dataset by Snap Research.

Maharana et al. (2024). LoCoMo: Long Context Multimodal Benchmark for Dialogue. Snap Research.


Copyright 2026 Sachit Misra — Licensed under CC-BY-NC-4.0.

Free for: personal use, education, academic research, open-source projects.
Not permitted: commercial use without a separate written agreement.

Commercial licensing: [email protected]
