Show HN: AI memory with biological decay (52% recall)

Original link: https://github.com/sachitrafa/YourMemory

## YourMemory: Persistent Memory for AI Agents

YourMemory addresses a key limitation of current AI assistants: the lack of memory between sessions. It provides a persistent memory layer that mimics human memory: important information is retained, while irrelevant details fade over time. Installation takes two commands and requires no infrastructure.

On the LoCoMo-10 benchmark, YourMemory shows **2× higher recall than Zep Cloud (59% vs 28%)**, using a hybrid retrieval system that combines vector search, graph expansion, and a decay curve inspired by the Ebbinghaus forgetting curve. Memories are categorised (strategy, fact, preference, and so on) to control how quickly they fade.

YourMemory integrates seamlessly with popular AI clients such as Claude, Cline, Cursor, and OpenCode through a standard MCP server interface. It supports multiple agents with isolated private memories and shared context, protected by API keys.

It uses DuckDB for vector storage and NetworkX for building graph connections, with optional PostgreSQL/pgvector or Neo4j backends for scaling. It handles memory storage, updates, and pruning automatically, letting AI agents learn and retain information across conversations.

## AI Memory with Biological Decay: Summary

A Hacker News post by SachitRafa describes a new approach to AI memory management aimed at the problem of "noisy" context windows in RAG (retrieval-augmented generation) systems. Rather than treating memory as a static archive, the implementation uses the Ebbinghaus forgetting curve, a model of biological memory decay, to prioritise relevant information.

Memories are assigned a "strength" score that is reinforced by recall (spaced repetition) and pruned when left unused. This is combined with a graph layer on top of the vector store to cover cases where semantic search fails to find logically related data. Benchmarks on the LoCoMo dataset show a Recall@5 of 52%, double that of a stateless vector store, with 84% less token waste.

The discussion highlights a debate over the value of long-term AI memory: some argue it can hinder attention and productivity, while others advocate typed memories with different decay rates depending on the kind of information (personality versus short-term intent). The project is built on DuckDB and suggests that, for long-running AI projects, knowing what to forget matters as much as knowing what to remember. The code is available on GitHub: [https://github.com/sachitrafa/cognitive-ai-memory](https://github.com/sachitrafa/cognitive-ai-memory).

Original article

Every session, your AI assistant starts from zero. It asks the same questions, forgets your preferences, re-learns your stack. There is no memory between conversations.

YourMemory fixes that. It gives AI agents a persistent memory layer that works the way human memory does — important things stick, forgotten things fade, outdated facts get replaced automatically. Two commands to install, zero infrastructure required.


Tested on LoCoMo-10 — 1,534 QA pairs across 10 multi-session conversations.

| System | Recall@5 | 95% CI |
|---|---|---|
| YourMemory (BM25 + vector + graph + decay) | 59% | 56–61% |
| Zep Cloud | 28% | 26–30% |

2× better recall than Zep Cloud on the same benchmark.

Full methodology and per-sample breakdown in BENCHMARKS.md. Writeup: I built memory decay for AI agents using the Ebbinghaus forgetting curve.


YourMemory Demo


Supports Python 3.11, 3.12, 3.13, and 3.14. No Docker, no database setup, no external services.

Step 2 — Run setup (once)

Downloads the spaCy language model and initialises the local database at ~/.yourmemory/memories.duckdb.
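
For reference, this is roughly what the setup step amounts to; a minimal sketch assuming the spaCy model is en_core_web_sm and the default database path shown above (the real setup command may do more, such as fetching the embedding model):

# Rough, illustrative equivalent of the one-time setup step.
# The en_core_web_sm model name is an assumption; the source does not name it.
from pathlib import Path

import duckdb
import spacy.cli

spacy.cli.download("en_core_web_sm")              # local NLP model for dedup / triple extraction

db_path = Path.home() / ".yourmemory" / "memories.duckdb"
db_path.parent.mkdir(parents=True, exist_ok=True)
duckdb.connect(str(db_path)).close()              # creates the local DuckDB database file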

Step 3 — Get your config path

Prints your full executable path and a ready-to-paste config block. Copy it.

Step 4 — Wire into your AI client

Claude Code

Add to ~/.claude/settings.json:

{
  "mcpServers": {
    "yourmemory": {
      "command": "yourmemory"
    }
  }
}

Reload (Cmd+Shift+P → Developer: Reload Window).

Claude Desktop

Add to ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows):

{
  "mcpServers": {
    "yourmemory": {
      "command": "yourmemory"
    }
  }
}

Restart Claude Desktop.

Cline (VS Code)

VS Code doesn't inherit your shell PATH. Run yourmemory-path first to get the full executable path.

In Cline → MCP Servers → Edit MCP Settings:

{
  "mcpServers": {
    "yourmemory": {
      "command": "/full/path/to/yourmemory",
      "args": [],
      "env": { "YOURMEMORY_USER": "your_name" }
    }
  }
}

Restart Cline after saving.

Cursor

Add to ~/.cursor/mcp.json:

{
  "mcpServers": {
    "yourmemory": {
      "command": "/full/path/to/yourmemory",
      "args": [],
      "env": { "YOURMEMORY_USER": "your_name" }
    }
  }
}

OpenCode

Add to ~/.config/opencode/config.json:

{
  "mcp": {
    "yourmemory": {
      "type": "local",
      "command": ["yourmemory"],
      "environment": { "YOURMEMORY_USER": "your_name" }
    }
  }
}

Then copy the memory workflow instructions:

cp sample_CLAUDE.md ~/.config/opencode/instructions.md

Restart OpenCode.

Any MCP-compatible client: YourMemory is a standard stdio MCP server. Works with Windsurf, Continue, Zed, and any client that supports MCP. Use the full path from yourmemory-path if the client doesn't inherit shell PATH.

Step 5 — Add memory instructions to your project

cp sample_CLAUDE.md CLAUDE.md

Edit CLAUDE.md — replace YOUR_NAME and YOUR_USER_ID. Claude now follows the recall → store → update workflow automatically on every task.


Three tools. Called by Claude automatically once CLAUDE.md is in place.

| Tool | When | What it does |
|---|---|---|
| recall_memory(query) | Start of every task | Surfaces relevant memories ranked by similarity × strength |
| store_memory(content, importance) | After learning something new | Embeds and stores with biological decay |
| update_memory(id, new_content) | When a memory is outdated | Re-embeds and replaces |

# Example session
store_memory("Sachit prefers tabs over spaces in Python", importance=0.9, category="fact")

# Next session — without being told again:
recall_memory("Python formatting")
# → {"content": "Sachit prefers tabs over spaces in Python", "strength": 0.87}

Categories control how fast memories fade

| Category | Survives without recall | Use case |
|---|---|---|
| strategy | ~38 days | Successful patterns |
| fact | ~24 days | Preferences, identity |
| assumption | ~19 days | Inferred context |
| failure | ~11 days | Errors, environment-specific issues |

Ebbinghaus Forgetting Curve

Memory strength decays exponentially — but importance and recall frequency slow that decay:

effective_λ = base_λ × (1 - importance × 0.8)
strength    = importance × e^(−effective_λ × days) × (1 + recall_count × 0.2)
score       = cosine_similarity × strength

Memories recalled frequently resist decay. Memories below strength 0.05 are pruned automatically every 24 hours.
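
As a minimal sketch (not the project's own code), the decay and scoring rules above translate directly into Python. The per-category base_λ values below are assumptions, chosen so that a memory of importance 0.5 with no recalls crosses the 0.05 prune threshold at roughly the survival times listed in the category table:

import math

# Illustrative base decay rates per category (assumed, not taken from the source).
BASE_LAMBDA = {"strategy": 0.10, "fact": 0.16, "assumption": 0.20, "failure": 0.35}

def strength(importance: float, category: str, days: float, recall_count: int = 0) -> float:
    """strength = importance * e^(-effective_λ * days) * (1 + recall_count * 0.2)"""
    effective_lambda = BASE_LAMBDA[category] * (1 - importance * 0.8)
    return importance * math.exp(-effective_lambda * days) * (1 + recall_count * 0.2)

def score(cosine_similarity: float, importance: float, category: str,
          days: float, recall_count: int = 0) -> float:
    """Retrieval score = cosine_similarity * strength."""
    return cosine_similarity * strength(importance, category, days, recall_count)

# Example: a 0.5-importance "fact" with no recalls falls below 0.05 after
# ln(0.5 / 0.05) / (0.16 * 0.6) ≈ 24 days, matching the category table above.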

Hybrid Retrieval: Vector + Graph

Retrieval runs in two rounds to surface related context that vocabulary-based search misses:

Round 1 — Vector search: cosine similarity against all memories, returns top-k above threshold.

Round 2 — Graph expansion: BFS traversal from Round 1 seeds surfaces memories that share context but not vocabulary — connected via semantic edges (cosine similarity ≥ 0.4).

recall("Python backend")
  Round 1 → [1] Python/MongoDB    (sim=0.61)
             [2] DuckDB/spaCy     (sim=0.19)
  Round 2 → [5] Docker/Kubernetes (sim=0.29 — below cut-off, surfaced via graph)

Chain-aware pruning: A decayed memory is kept alive if any graph neighbour is above the prune threshold. Related memories age together.
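
For intuition, here is a minimal sketch of the two-round retrieval using NumPy and NetworkX. It is illustrative only: the Round 1 threshold and hop count are assumptions; only the ≥ 0.4 edge rule and the sim × strength ranking come from the text above.

import networkx as nx
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def recall(query_vec, memories: dict, graph: nx.Graph, k: int = 5,
           threshold: float = 0.3, hops: int = 1) -> list:
    # Round 1: cosine similarity against all memories, keep top-k above an assumed threshold
    sims = {mid: cosine(query_vec, m["embedding"]) for mid, m in memories.items()}
    seeds = sorted((mid for mid, s in sims.items() if s >= threshold),
                   key=sims.get, reverse=True)[:k]

    # Round 2: BFS from the seeds surfaces neighbours that share context but not
    # vocabulary (edges were built from cosine similarity >= 0.4 at store time)
    expanded = set(seeds)
    for seed in seeds:
        if seed in graph:
            expanded.update(nx.single_source_shortest_path_length(graph, seed, cutoff=hops))

    # Final ranking: similarity × decayed strength, as in the scoring formula
    return sorted(expanded, key=lambda mid: sims[mid] * memories[mid]["strength"],
                  reverse=True)[:k]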


Multiple agents can share the same YourMemory instance — each with isolated private memories and controlled access to shared context.

from src.services.api_keys import register_agent

result = register_agent(
    agent_id="coding-agent",
    user_id="sachit",
    can_read=["shared", "private"],
    can_write=["shared", "private"],
)
# → result["api_key"]  — ym_xxxx, shown once only

Pass api_key to any MCP call to authenticate as an agent:

store_memory(content="Staging uses self-signed cert — skip SSL verify",
             importance=0.7, category="failure",
             api_key="ym_xxxx", visibility="private")

recall_memory(query="staging SSL", api_key="ym_xxxx")
# → returns shared memories + this agent's private memories
# → other agents see shared only

| Component | Role |
|---|---|
| DuckDB | Default vector DB — zero setup, native cosine similarity |
| NetworkX | Default graph backend — persists at ~/.yourmemory/graph.pkl |
| sentence-transformers | Local embeddings (all-mpnet-base-v2, 768 dims) |
| spaCy | Local NLP for deduplication and SVO triple extraction |
| APScheduler | Automatic 24h decay job |
| PostgreSQL + pgvector | Optional — for teams or large datasets |
| Neo4j | Optional graph backend — pip install 'yourmemory[neo4j]' |
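
The embedding model loads locally through the standard sentence-transformers API; a quick sketch (the model name and dimensionality are the ones listed in the table):

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-mpnet-base-v2")   # runs locally, no API calls
vec = model.encode("Sachit prefers tabs over spaces in Python")
print(vec.shape)                                   # (768,)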

PostgreSQL setup (optional)

pip install yourmemory[postgres]

Create a .env file:

DATABASE_URL=postgresql://YOUR_USER@localhost:5432/yourmemory

macOS

brew install postgresql@16 pgvector && brew services start postgresql@16
createdb yourmemory

Ubuntu / Debian

sudo apt install postgresql postgresql-contrib postgresql-16-pgvector
createdb yourmemory

Claude / Cline / Cursor / Any MCP client
    │
    ├── recall_memory(query, api_key?)
    │       └── embed → vector similarity (Round 1)
    │               → graph BFS expansion  (Round 2)
    │               → score = sim × strength → top-k
    │               → recall propagation → boost neighbours
    │
    ├── store_memory(content, importance, category?, visibility?, api_key?)
    │       └── question? → reject
    │               contradiction check → update if conflict
    │               embed() → INSERT → index_memory() → graph node + edges
    │
    └── update_memory(id, new_content, importance)
            └── embed(new_content) → UPDATE → refresh graph node

  Vector DB (Round 1)             Graph DB (Round 2)
  DuckDB (default)                NetworkX (default)
    memories.duckdb                 graph.pkl
    ├── embedding FLOAT[768]        ├── nodes: memory_id, strength
    ├── importance FLOAT            └── edges: sim × verb_weight ≥ 0.4
    ├── recall_count INTEGER
    ├── visibility VARCHAR        Neo4j (opt-in)
    └── agent_id VARCHAR            └── bolt://localhost:7687
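
A rough DuckDB sketch of the vector-store layout implied by the diagram; the column names and types are read off the diagram, while the table name and the memory_id/content columns are assumptions:

import duckdb

con = duckdb.connect("memories.duckdb")
con.execute("""
    CREATE TABLE IF NOT EXISTS memories (
        memory_id    VARCHAR PRIMARY KEY,   -- assumed key column
        content      VARCHAR,               -- assumed text column
        embedding    FLOAT[768],            -- fixed-size array, compared by cosine similarity
        importance   FLOAT,
        recall_count INTEGER,
        visibility   VARCHAR,               -- 'shared' or 'private'
        agent_id     VARCHAR
    )
""")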

Benchmarks use the LoCoMo dataset by Snap Research.

Maharana et al. (2024). LoCoMo: Long Context Multimodal Benchmark for Dialogue. Snap Research.


Copyright 2026 Sachit Misra — Licensed under CC-BY-NC-4.0.

Free for: personal use, education, academic research, open-source projects.
Not permitted: commercial use without a separate written agreement.

Commercial licensing: [email protected]
