Nanochat

原始链接: https://simonwillison.net/2025/Oct/13/nanochat/

Andrej Karpathy has released "nanochat", a highly approachable project for building and running a ChatGPT-style LLM. The complete implementation, covering training, inference, and a web UI, is a minimal codebase of roughly 8,000 lines, mostly Python (PyTorch) with a little Rust. Impressively, a conversational model can be trained for just $100 on rented NVIDIA hardware (about 4 hours on an 8×H100 node); trained for 12 hours, it performs comparably to GPT-2. The resulting 561M-parameter model is small enough to run on modest hardware, even a Raspberry Pi or, as demonstrated, an iPhone. The project trains on a curated mix of datasets including FineWeb-Edu, SmolTalk, MMLU, and GSM8K. A pretrained model is available on Hugging Face (sdobson/nanochat), and a CPU-compatible script has been developed for macOS users, broadening access for experimentation. Initial tests show promising, coherent responses.

## NanoLLM: a tiny AI for learning and experimentation

A recent Hacker News thread discussed NanoLLM, a small language model created by AI researcher Andrej Karpathy. Built with $100 of cloud compute time, NanoLLM demonstrates how accessible AI has become: it can even run on modest hardware, albeit slowly. While the model's output cannot compete with tools like ChatGPT or Claude, its value lies in its educational potential. Karpathy has released the code, allowing others to experiment with building and training their own LLMs.

The discussion centered on whether NanoLLM could power custom SaaS applications. Experts cautioned against training custom models for specific domains, recommending retrieval-augmented generation (RAG) or agentic search as more practical solutions. Even then, API costs from vendors like OpenAI are usually lower than running a model independently. Nevertheless, NanoLLM remains a valuable learning tool and a testament to the field's rapid progress.

Original post

nanochat (via) Really interesting new project from Andrej Karpathy, described at length in this discussion post.

It provides a full ChatGPT-style LLM, including training, inference and a web UI, that can be trained for as little as $100:

This repo is a full-stack implementation of an LLM like ChatGPT in a single, clean, minimal, hackable, dependency-lite codebase.

It's around 8,000 lines of code, mostly Python (using PyTorch) plus a little bit of Rust for training the tokenizer.

Andrej suggests renting an 8×H100 NVIDIA node for around $24/hour to train the model. 4 hours (~$100) is enough to get a model that can hold a conversation - almost coherent example here. Run it for 12 hours and you get something that slightly outperforms GPT-2. I'm looking forward to hearing results from longer training runs!
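The quoted prices are simple arithmetic on the node's hourly rate, which can be sanity-checked in a couple of lines (the $24/hour figure is the rate from the post; everything else follows from it):

```python
# Back-of-envelope cost check for the two training runs quoted above.
rate_per_hour = 24.0           # rented 8xH100 node, $/hour (from the post)

cost_4h = rate_per_hour * 4    # the ~$100 conversational model
cost_12h = rate_per_hour * 12  # the run that slightly outperforms GPT-2

print(f"4 hour run:  ${cost_4h:.0f}")   # $96, i.e. roughly $100
print(f"12 hour run: ${cost_12h:.0f}")  # $288
```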

The resulting model is ~561M parameters, so it should run on almost anything. I've run a 4B model on my iPhone; 561M should easily fit on even an inexpensive Raspberry Pi.
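A rough sketch of why 561M parameters fits on modest hardware: the weights alone come to about a gigabyte at 16-bit precision (the byte-per-parameter figures below are standard for these dtypes; the actual checkpoint format and activation overhead are not accounted for):

```python
# Rough weight-memory footprint for a 561M-parameter model
# at common precisions. Ignores activations and KV cache.
params = 561_000_000

for name, bytes_per_param in [("fp32", 4), ("fp16/bf16", 2), ("int8", 1)]:
    gib = params * bytes_per_param / 2**30
    print(f"{name:10s}: {gib:.2f} GiB")
```

Even at full fp32 precision that is only ~2 GiB, comfortably within reach of a Raspberry Pi with 4 GB of RAM.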

The model defaults to training on ~24GB from karpathy/fineweb-edu-100b-shuffle derived from FineWeb-Edu, and then midtrains on 568K examples from SmolTalk (460K), MMLU auxiliary train (100K), and GSM8K (8K), followed by supervised finetuning on 21.4K examples from ARC-Easy (2.3K), ARC-Challenge (1.1K), GSM8K (8K), and SmolTalk (10K).
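The example counts in the midtraining and finetuning mixes above can be tallied to confirm they match the stated totals (all counts are as given in the post):

```python
# Example counts for the two post-pretraining stages (from the post).
midtrain = {"SmolTalk": 460_000, "MMLU auxiliary train": 100_000, "GSM8K": 8_000}
sft = {"ARC-Easy": 2_300, "ARC-Challenge": 1_100, "GSM8K": 8_000, "SmolTalk": 10_000}

print(sum(midtrain.values()))  # 568000 -> the 568K midtraining examples
print(sum(sft.values()))       # 21400  -> the 21.4K SFT examples
```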

Here's the code for the web server, which is fronted by this pleasantly succinct vanilla HTML+JavaScript frontend.

Update: Sam Dobson pushed a build of the model to sdobson/nanochat on Hugging Face. It's designed to run on CUDA but I pointed Claude Code at a checkout and had it hack around until it figured out how to run it on CPU on macOS, which eventually resulted in this script which I've published as a Gist. You should be able to try out the model using uv like this:

cd /tmp
git clone https://huggingface.co/sdobson/nanochat
uv run https://gist.githubusercontent.com/simonw/912623bf00d6c13cc0211508969a100a/raw/80f79c6a6f1e1b5d4485368ef3ddafa5ce853131/generate_cpu.py \
--model-dir /tmp/nanochat \
--prompt "Tell me about dogs."

I got this (truncated because it ran out of tokens):

I'm delighted to share my passion for dogs with you. As a veterinary doctor, I've had the privilege of helping many pet owners care for their furry friends. There's something special about training, about being a part of their lives, and about seeing their faces light up when they see their favorite treats or toys.

I've had the chance to work with over 1,000 dogs, and I must say, it's a rewarding experience. The bond between owner and pet
