150k lines of vibe coded Elixir: The good, the bad and the ugly

Original link: https://getboothiq.com/blog/150k-lines-vibe-coded-elixir-good-bad-ugly

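## AI-Driven Elixir Development: Summary

BoothIQ, a trade show badge scanner, is built entirely from AI-generated Elixir code (150,000 lines). Productivity was very high, but the experience revealed both strengths and weaknesses.

**The good:** Elixir is a small language with terse syntax, which suits AI well: it reduces the number of decisions and makes the most of the context in the AI's limited "memory". Tools like Tidewave, which give the agent access to the running application's logs and database schemas, further improve accuracy and reduce hallucinations. AI excels at frontend work, implementing design changes quickly and raising code quality.

**The bad:** AI tends toward a defensive, imperative coding style common in languages like Ruby and JavaScript, and needs constant correction to keep the Elixir idiomatic.

**The ugly:** AI struggles with concurrency and testing. It does not understand Elixir's process isolation or transactional tests, so debugging stalls. It also lacks architectural vision, often creating redundant files and inconsistent code.

Despite these drawbacks, the productivity gains remain large. The keys to success are maintaining a consistent codebase architecture and actively steering the AI toward good Elixir practices. The end goal is to automate the entire development lifecycle and minimize human intervention.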

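## Hacker News Discussion: 150k Lines of "Vibe Coded" Elixir

A Hacker News discussion centered on a blog post describing the experience of building a system from a large amount (150,000 lines) of Elixir code generated by AI (specifically Claude). The author found the process productive, but the conversation quickly turned to the merits of measuring productivity in lines of code, a practice widely regarded as flawed. Many commenters questioned the efficiency of such a large codebase, suggested it contains redundancy, and pointed to long-standing criticism of lines of code as a metric. Others shared their own experience with AI-assisted coding, noting unidiomatic code, excessive defensiveness, and difficulty with complex tasks such as asynchronous operations and testing. Several users stressed the importance of explicit prompts, "skills" (predefined instructions for the AI), and context management in mitigating these problems. There was debate over whether current LLMs will improve significantly or have already plateaued in usefulness, and whether rising costs will outweigh the benefits. Despite the challenges, some developers reported significant speedups from AI, particularly with tools like Tidewave and carefully crafted prompts.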

Original article

January 05, 2026 • John

TL;DR:

Good: AI is great at Elixir. It gets better as your codebase grows.
Bad: It defaults to defensive, imperative code. You need to be strict about what good Elixir looks like.
Ugly: It can’t debug concurrent test failures. It doesn’t understand that each test runs in an isolated transaction, or that processes have independent lifecycles. It spirals until you step in.
Bottom Line: Even with the drawbacks, the productivity gains are off the charts. I expect it will only get better.

BoothIQ is a universal badge scanner for trade shows. AI writes 100% of our code. We have 150,000 lines of vibe coded Elixir running in production. Here’s what worked and what didn’t.

Elixir is Small: It Gets It Right the First Time

Elixir is a small language. Few operators. Small standard library. Only so many ways to control flow. It hasn’t been around for decades. It hasn’t piled up paradigms like .NET or Java, where functional and OOP fight for space.

This matters. AI is bad at decisions. If you want your agent to succeed, have it make fewer decisions. With Elixir, Claude doesn’t need to pick between OOP and functional. It doesn’t need to navigate old syntax next to new patterns. There’s one way to skin the cat. Claude finds it.

This matters more if you’re adding AI to an existing codebase. In languages where paradigms came and went—often with whatever developer pushed them—Claude tries to match the existing code. The existing code is inconsistent. So Claude is inconsistent.

Elixir is Terse: Longer Sessions, Fewer Compactions

Small and terse are related but different. Small means few concepts. Terse means fewer tokens to express the same thing. Go is small but not terse—few concepts, but verbose syntax and explicit error handling everywhere. Elixir is both. We got lucky.

Context windows are a real constraint. Elixir uses fewer tokens than most languages. No braces. No semicolons. No verbose boilerplate. I can stay in a working session longer. More iterations. Fewer compactions—those moments when the AI summarizes and forgets earlier context. More context in memory.

When I built the React Native version of our app, I hit compactions constantly. JavaScript is small-ish, but it’s not terse. It burns tokens to do what Elixir does with fewer.

I also see more compactions when working on heavy HTML and Tailwind in LiveView. Adding, updating, or editing large sections of markup at once. HTML and HEEx templates are token-heavy. But even then, it’s less painful than JavaScript-heavy work.

Tidewave: Longer Unassisted Runs

Tidewave supercharges Elixir-specific context. It lets the agent read logs from the running app—debug, info, error, warning—so you don’t copy/paste logs around. It can query the dev database, see Ecto schemas, and view package documentation. Fewer hallucinations. Longer unassisted runs. The agent can check and validate its own assumptions without human intervention.

Immutability: Fewer Decisions, Less Code

If a variable gets mutated by a function call, AI now has three problems instead of one. The actual feature you want implemented. Whether to work around the mutation or update other call sites to stop mutating. And the mutated data itself—what is it, what was it, what will it be, what can it be?

AI ponders all of this and contorts itself into an overly defensive mess. It writes nonsense validation checks and if-statements on mutated data. Defensive code that wouldn’t exist in an immutable language.

In Elixir, the data is what it is. It’s not going to change. Fewer decisions. Less code.
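
A minimal sketch of what "the data is what it is" means in practice. The `ImmutabilityDemo` module and the order map are hypothetical, made up for illustration; only `Map.update!/3` and `div/2` come from the standard library.

```elixir
defmodule ImmutabilityDemo do
  # Hypothetical helper: returns a *new* map with the discount applied.
  # The map passed in by the caller is never modified.
  def apply_discount(order, percent) do
    Map.update!(order, :total, fn total -> total - div(total * percent, 100) end)
  end
end

order = %{id: 1, total: 200}
discounted = ImmutabilityDemo.apply_discount(order, 10)

# The caller's binding still holds the original value;
# only the returned map carries the change.
IO.inspect(order)
IO.inspect(discounted)
```

Because `order` cannot change underneath the caller, there is nothing to re-validate after the call, which is the kind of defensive check an agent tends to add in mutable languages.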

Frontend: Higher Quality, Less Time

I prompt high-level changes—“give the top section more padding”—and Claude does it faster than I could. It’s especially good at modifying or moving large chunks of page structure. Mobile-first views? Easy. Way faster than me, and it’s a better designer than me too.

The quality floor has gone way up. You can’t hide behind “I’m not a designer” anymore.

Git Worktrees: Build Multiple Features in Parallel

I use three git worktrees, so I can work on up to three features at any given time. Typically a main feature, a slightly less important one, and the third reserved for quick fixes, low priority stuff, or quick experiments.

Three is about the limit. Any more and context switching between features becomes the bottleneck.

AI Can’t Organize: Architecture Is Still On You

AI is exceptional at churning out lines of code. It’s significantly less exceptional at deciding where those lines should go. It defaults to creating new files everywhere. It repeats code it’s already written. It introduces inconsistencies.

This is the “mess” people describe in vibe code projects as they grow. You still need a human making structural decisions.

Trained on Imperative: It Writes Defensive Code

AI is trained mostly on imperative code. Ruby, Python, JavaScript, C#. Elixir looks like Ruby. So Claude writes Ruby-style Elixir: if/else chains, defensive nil-checking, early returns that don’t make sense in a functional context.

Elixir wants you to be assertive. Pattern match on what you expect. Let it crash if something’s wrong. The process restarts in a good state. This is foreign to most code Claude trained on.

This gets better as the codebase grows. Claude sees more assertive patterns. It starts to infer the style. But it still defaults to defensive. I still correct it regularly. Be strict about what good Elixir looks like.
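
To make the contrast concrete, here is a rough sketch of the same lookup written both ways. The `BadgeScan` module, its function names, and the shape of the scan map are assumptions invented for this example, not code from BoothIQ.

```elixir
defmodule BadgeScan do
  # Defensive, Ruby-flavored style: nested conditionals and nil checks
  # for shapes that a well-formed caller should never send.
  def attendee_name_defensive(scan) do
    if scan != nil do
      if is_map(scan) and Map.has_key?(scan, :attendee) do
        attendee = scan.attendee

        if attendee != nil and attendee[:name] != nil do
          {:ok, attendee[:name]}
        else
          {:error, :missing_name}
        end
      else
        {:error, :missing_attendee}
      end
    else
      {:error, :no_scan}
    end
  end

  # Assertive style: pattern match on the shape you expect.
  # Any other input raises FunctionClauseError, and the calling
  # process crashes instead of carrying bad data forward.
  def attendee_name(%{attendee: %{name: name}}) when is_binary(name), do: name
end
```

The assertive version is one line and states the contract in its function head. Whether crashing is acceptable depends on the caller running under a supervisor, which is the usual assumption in an OTP application.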

Git Operations: Keep It Out of Context

Every git operation takes context window space. Checking status. Writing commit messages. Describing PRs. That space could go to actual work. Git context goes stale fast—a commit message from 20 minutes ago is worthless after three more changes.

When I’m babysitting a feature, I commit manually. Every point I’m happy with. It’s fast. It’s cheap version control. It doesn’t burn context.

Claude Code has “checkpoints” now. Internal version control that protects vibe coders without explicit commits. That’s better than AI managing git directly.

OTP and Async: It Chases Ghosts

Claude is useless for debugging OTP, Task, or async issues. It doesn’t understand how processes, the actor model, and GenServers work together. When it tries to introspect the running system, it feeds itself bad data. It gets very lost.

It can course correct when you point out where it went wrong. But on its own, it chases ghosts.
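
For readers less familiar with OTP, this small sketch shows the kind of process lifecycle the agent has to reason about. `Counter` and `LifecycleDemo` are hypothetical modules written for this illustration; the `GenServer`, `Supervisor`, and `Process` calls are standard library.

```elixir
defmodule Counter do
  use GenServer

  # Registered under the module name so callers never hold a pid.
  def start_link(_opts), do: GenServer.start_link(__MODULE__, 0, name: __MODULE__)
  def increment, do: GenServer.call(__MODULE__, :increment)
  def value, do: GenServer.call(__MODULE__, :value)

  @impl true
  def init(initial), do: {:ok, initial}

  @impl true
  def handle_call(:increment, _from, count), do: {:reply, count + 1, count + 1}
  def handle_call(:value, _from, count), do: {:reply, count, count}
end

defmodule LifecycleDemo do
  # Starts a supervised Counter, builds up state, kills the process,
  # and reports whether a new process took over and what its state is.
  def run do
    {:ok, _sup} = Supervisor.start_link([Counter], strategy: :one_for_one)
    1 = Counter.increment()
    2 = Counter.increment()

    old_pid = Process.whereis(Counter)
    ref = Process.monitor(old_pid)
    Process.exit(old_pid, :kill)

    # Wait until the old process is confirmed dead.
    receive do
      {:DOWN, ^ref, :process, ^old_pid, _reason} -> :ok
    end

    new_pid = wait_for_restart(old_pid)
    {old_pid != new_pid, Counter.value()}
  end

  # The supervisor restarts the child asynchronously, so poll the registry.
  defp wait_for_restart(old_pid) do
    case Process.whereis(Counter) do
      pid when is_pid(pid) and pid != old_pid ->
        pid

      _ ->
        Process.sleep(10)
        wait_for_restart(old_pid)
    end
  end
end
```

Calling `LifecycleDemo.run()` should return `{true, 0}`: the name now points at a different pid, and the in-memory count is back to its initial value. State observed before a crash says nothing about the process that exists afterwards, which is one plausible way an agent inspecting a running system ends up reasoning from stale data.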

Ecto Sandbox: It Chases Red Herrings

In Elixir tests that use the Ecto SQL sandbox, each test runs in a database transaction that rolls back at the end. Tests run async without hitting each other. No test data persists.

Claude doesn’t understand this. It uses Tidewave’s dev DB connection and thinks it’s looking at the test DB—which is always empty. A test fails. Claude queries the database. Finds nothing. Thinks there’s a data problem.

I’ve watched Claude try to seed the test database so a test will pass. That’s clearly wrong.

Other times, two tests insert or query the same schema. Claude doesn’t understand transaction isolation—tests can’t see each other’s data. It confuses itself and recommends disabling async tests altogether. Manageable once you watch for it. But ugly.

AI writing all the code has been a massive win. The friction exists, but it’s manageable and doesn’t interfere much with day-to-day work. By far the most important thing: have a consistent, coherent codebase architecture. Without it, you’ll quickly end up with spaghetti code.

The goal for this year: automate myself out of a job. That means giving Claude more control over the entire software development lifecycle—from a simple problem statement to a fully tested, working PR that only needs a quick glance before it’s merged and deployed.