Sabotaging projects by overthinking, scope creep, and structural diffing

Original link: https://kevinlynagh.com/newsletter/2026_04_overthinking/

## Finishing over perfection

The author reflects on a recurring pattern: projects stall under overthinking and scope creep, in contrast to the satisfaction of quickly finishing simple, focused tasks. He advocates embracing a "just do it" mindset, defining success criteria before starting, and avoiding the traps of endless research and overly ambitious solutions. Two recent examples illustrate the point: a weekend woodworking project with a friend succeeded, propelled by the simple goal of building a shelf, while research into sophisticated semantic diff tools bogged down. The diff research began as a minor annoyance and grew into hours of investigation, until the author refocused on a minimal solution: a basic tool for his own Emacs workflow. He acknowledges the pull of learning and optimizing, but stresses the value of *shipping* something, even an imperfect something. That includes recognizing the pitfalls of LLM-generated code, which can easily lead to feature bloat. The core takeaway: prioritize finishing projects over endless planning or chasing the "perfect" solution, remembering that sometimes "just a shelf" is enough. He will be attending Babashka Conf and Dutch Clojure Days and welcomes meeting readers there.

A recent Hacker News thread discussed "sabotaging" projects through overthinking, expanding scope, and obsessing over detailed comparisons ("structural diffing"). The piece resonated with commenters; one PhD student in particular described how exhaustive research led to scope creep and flagging motivation, making the final stretch of a project very difficult. While some found the original post loosely organized, its core point (that over-analysis and feature creep stall progress) struck a chord. The discussion highlighted a common dilemma: initial enthusiasm for a project can be eroded by the tendency to over-complicate and lose focus, ultimately preventing completion.

Original article

Hi friends,

I’ll be attending Babashka Conf on May 8 and Dutch Clojure Days on May 9. If you’re attending either (or just visiting Amsterdam), drop me a line!

When I have an idea for a project, it tends to go in one of these two directions:

  1. I just do it. Maybe I make a few minor revisions, but often it turns out exactly how I’d imagined and I’m happy.

  2. I think, “I should look for prior art”. There’s a lot of prior art, dealing with a much broader scope than I’d originally imagined. I start to wonder if I should incorporate that scope. Or perhaps try to build my thing on top of the existing sorta-nearby-solutions. Or maybe I should just use the popular thing. Although I could do a better job than that thing, if I put a bunch of time into it. But actually, I don’t want to maintain a big popular project, nor do I want to put that much time into this project. Uh oh, now I’ve spent a bunch of time, having neither addressed the original issue nor experienced the joy of creating something.

I prefer the first outcome, and I think the pivotal factor is how well I’ve internalized my own success criteria.

For example, last weekend I hosted my friend Marcin and we decided it’d be fun to do some woodworking, so we threw together this shelf and 3d-printed hangers for my kitchen:

[Photo: a black shelf with a painted orange/pink edge and Ikea food bins hanging off the bottom]

Absolute banger of a project:

  • brainstormed the design over coffee
  • did a few 3d-print iterations for the Ikea bin hangers (OnShape CAD, if you want to print your own)
  • used material leftover from my workbench
  • rounded the corner by eye with a palm sander
  • sealed the raw plywood edge with some leftover paint from a friend
  • done in a weekend

The main success criterion was to jam on woodworking with a friend, and that helped me not overthink the object-level success criteria: Just make a shelf for my exact kitchen!

In contrast, this past Friday I noticed difftastic did a poor job, so I decided to shop around for structural/semantic diff tools and related workflows (a topic I’ve never studied, that I’m increasingly interested in as I’m reviewing more and more LLM-generated code).

I spent 4 hours over the weekend researching existing tools (see my notes below), going through dark periods of both “semantic tree diffing is a PhD-level complex problem” and “why do all of these have MCP servers? I don’t want an MCP server”, before I came to my senses and remembered my original success criteria: I just want a nicer diffing workflow for myself in Emacs, I should just build it myself — should take about 4 hours.

I’m cautiously optimistic that, having had this realization and committing myself to a minimal scope, I’ll be able to knock out a prototype before running out of motivation.

However, other long-running interests of mine:

seem to be deep in the well of outcome #2.

That is, I’ve spent hundreds of hours on background research and little prototypes, but haven’t yet synthesized anything that addresses the original motivating issue.

It’s not quite that I regret that time — I do love learning by reading — but I have a nagging sense of unease that my inner critic (fear of failure?) is silencing my generative tendencies, keeping me from the much more enjoyable (and productive!) learning by doing.

I think in these cases the success criteria have been much fuzzier: Am I trying to replace my own usage of Rust/Clojure? Only for some subset of problems? Or is it that I actually just need a playground to learn about language design/implementation, and it’s fine if I don’t end up using it?

Ditto for CAD: Am I trying to replace my commercial CAD tool in favor of my own? Only for some subset of simple or particularly parametric parts? Do I care if it’s useful for others? Does my tool need to be legibly different from existing open-source tools?

It’s worth considering these questions, sure. But at the end of the day, I’d much rather have done a lot than have only considered a lot.

So I’m trying to embrace my inner clueless 20-year-old and just do things — even if some turn out to be “obviously bad” in hindsight, I’ll still be coming out ahead on net =D

Of course, there’s only so much time to “just do things”, and there’s a balance to be had. I’m not sure how many times I’ll re-learn YAGNI (“you ain’t gonna need it”) in my career, but I was reminded of it again after writing a bunch of code with an LLM agent, then eventually coming to my senses and throwing it all out.

I wanted a Finda-style filesystem-wide fuzzy path search for Emacs. Since I’ve built (by hand, typing the code myself!) this exact functionality before (walk filesystem to collect paths, index them by trigram, do fast fuzzy queries via bitmap intersections), I figured it’d only take a few hours to supervise an LLM to write all the code.
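The index scheme described above (walk the filesystem, index paths by trigram, answer queries with bitmap intersections) can be sketched in a few dozen lines. This is a toy reconstruction for illustration, not Finda's actual code; to keep it tiny it caps the corpus at 64 paths so a single `u64` can serve as each trigram's bitmap.

```rust
use std::collections::HashMap;

// Toy trigram index: each path gets an id, and each trigram maps to a
// bitmask of the ids of paths containing it. Limited to 64 paths here
// so a u64 works as the bitmap; a real index would use roaring bitmaps
// or similar.
struct TrigramIndex {
    paths: Vec<String>,
    trigrams: HashMap<[u8; 3], u64>,
}

impl TrigramIndex {
    fn build(paths: &[&str]) -> Self {
        let mut trigrams: HashMap<[u8; 3], u64> = HashMap::new();
        for (id, p) in paths.iter().enumerate() {
            for w in p.as_bytes().windows(3) {
                *trigrams.entry([w[0], w[1], w[2]]).or_insert(0) |= 1 << id;
            }
        }
        let paths = paths.iter().map(|s| s.to_string()).collect();
        TrigramIndex { paths, trigrams }
    }

    // Candidate paths containing every trigram of the query. (Queries
    // shorter than 3 bytes have no trigrams and match everything.)
    fn query(&self, q: &str) -> Vec<&str> {
        let mut mask = u64::MAX;
        for w in q.as_bytes().windows(3) {
            mask &= self.trigrams.get(&[w[0], w[1], w[2]]).copied().unwrap_or(0);
        }
        self.paths
            .iter()
            .enumerate()
            .filter(|(id, _)| mask & (1 << id) != 0)
            .map(|(_, p)| p.as_str())
            .collect()
    }
}
```

The intersection only yields *candidates* (trigrams can match in any order or position); a real tool would verify and rank the survivors with a proper fuzzy scorer.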

I started with a “plan mode” chat, and the LLM suggested a library, Nucleo, which turned up since I wrote Finda (10 years ago, eek!). I read through it, found it quite well-designed and documented, and decided to use it so I’d get its smart case and Unicode normalization functionality. (E.g., query foo matches Foo and foo, whereas query Foo won’t match foo; similarly for cafe and café.)
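The smart-case rule itself is simple enough to sketch in a few lines. This is my own illustration of the behavior, not Nucleo's API, and it omits the Unicode-normalization half (so `cafe` matching `café` is not handled here):

```rust
// Smart case: a query with no uppercase letters matches
// case-insensitively; any uppercase letter in the query forces an
// exact-case match. Substring containment stands in for real fuzzy
// matching.
fn smart_case_contains(haystack: &str, query: &str) -> bool {
    if query.chars().any(|c| c.is_uppercase()) {
        haystack.contains(query)
    } else {
        haystack.to_lowercase().contains(query)
    }
}
```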

Finding a great library wasn’t the problem; the problem was that Nucleo also supported some extra functionality: anchors (^foo only matches at the beginning of a line).

This got me thinking about what that might mean in a corpus that consists entirely of file paths. Anchoring to the beginning of a line isn’t useful (everything starts with /), so I decided to try and interpret the anchors with respect to the path segments. E.g., ^foo would match /root/foobar/ but not /root/barfoo/.

But to do this efficiently, the index needs to keep track of segment boundaries so that the query can be checked against each segment quickly.

But then we also need to handle a slash occurring in an anchored query (e.g., ^foo/bar) since that wouldn’t get matched when only looking at segments individually (root, foo, bar, and baz of a matching path /root/foo/bar/baz/).
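One naive way to handle both cases is to skip the segment bookkeeping entirely and test the anchored query against every segment boundary of the full path, which makes slash-containing queries like `^foo/bar` fall out for free. A purely illustrative sketch (linear scan per path, not the index-backed version discussed above):

```rust
// An anchored query matches if the text immediately after some '/'
// starts with the query. Because we compare against the rest of the
// full path (not a single segment), queries containing '/' such as
// "foo/bar" work too.
fn anchored_match(path: &str, anchored_query: &str) -> bool {
    path.match_indices('/')
        .any(|(i, _)| path[i + 1..].starts_with(anchored_query))
}
```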

Working through this took several hours: first throwing around design ideas with an LLM, having it write code to wrap Nucleo’s types, then realizing its code was bloated and didn’t spark joy, so finally writing my own (smaller) wrapper.

Then, after a break, I realized:

  1. I can’t think of a situation where I’d ever wished Finda had anchor functionality
  2. In a corpus of paths, I can anchor by just adding / to the start or end of a query (this works for everything except anchoring to the end of a filename).

So I tossed all of the anchoring code.

I’m pretty sure I still came out ahead compared to if I’d tried to write everything myself sans LLM or discussion with others, but I’m not certain.

Perhaps there’s some kind of conservation law here: Any increases in programming speed will be offset by a corresponding increase in unnecessary features, rabbit holes, and diversions.

Speaking of unnecessary diversions, let me tell you everything I’ve learned about structural diffing recently — if you have thoughts/feelings/references in this space, I’d love to hear about ‘em!

When we’re talking about code, a “diff” usually means a summary of the line-by-line changes between two versions of a file. This might be rendered as a “unified” view, where changed lines are prefixed with + or - to indicate whether they’re additions or deletions. For example:
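A hypothetical three-line grocery list makes this concrete:

```diff
 milk
-coffee
+apple
 bread
```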

We’ve removed coffee and added apple.

The same diff might also be rendered in a side-by-side view, which can be easier to read when there are more complex changes:

The problem with these line-by-line diffs is that they’re not aware of higher-level structure like functions, types, etc. — if some braces match up somehow between versions, they might not be shown at all, even if the braces “belong” to different functions.

There’s a wonderful tool, difftastic, which tries to address this by calculating diffs using treesitter-provided concrete syntax trees. It’s a huge improvement over line-based diffs, but unfortunately it doesn’t always do a great job matching entities between versions.

Here’s the diff that motivated this entire foray:

Note that it doesn’t match up struct PendingClick: it’s shown as deleted on the left and added on the right.

I haven’t dug into why difftastic fails to match here, but I do feel like it’s wrong — even if the overall diff would be longer, I’d still rather see PendingClickRequest and PendingClick matched up between both sides.

Here’s a summary of tools / references in the space:

  • The most “baked” and thoughtful semantic diff tool I found is, perhaps unsurprisingly, semanticdiff.com, from a small German company with a free VSCode plugin and web app that shows diffs for GitHub PRs. Unfortunately they don’t have any code libraries I can use as a foundation for the workflow I want.

    Context-sensitive keywords in particular were a constant source of annoyance. The grammar looks correct, but it will fail to parse because of the way the lexer works. You don’t want your tool to abort just because someone named their parameter “async”.

  • diffsitter

    • built on treesitter, has MCP server. README includes list of similar projects.
    • lots of github stars, but doesn’t seem particularly well-documented; I couldn’t find an explanation of how it works, but the difftastic wiki says it “runs longest-common-subsequence on the leaves of the tree”
  • gumtree

    • research / academic origin in 2014
    • requires Java, so no-go for my use case of a quick tool I can use via Emacs
  • mergiraf: treesitter-based merge-driver written in rust

    • very nice architecture overview; tool uses Gumtree algorithm
    • docs and adorable illustrations indicate this project was clearly written by a thoughtful human
    • semanticdiff.com author in HN comments: > GumTree is good at returning a result quickly, but there are quite a few cases where it always returned bad matches for us, no matter how many follow-up papers with improvements we tried to implement. In the end we switched over to a dijkstra based approach that tries to minimize the cost of the mapping
  • weave: also a treesitter-based merge-driver written in Rust

    • feels a bit “HN-optimized” (flashy landing pages, lots of github stars, MCP server, etc.)
    • I looked into their entity extraction crate, sem
    • core diffing code is OK but pretty wordy
    • greedy entity matching algorithm
    • data model can’t detect intra-file moves, even though those might be significant
    • includes a lot of heuristic “impact” analysis, which feels like overreaching-scope to me since it’d require much tighter language integration before I’d trust it
      • ran into buggy output when running sem diff --verbose HEAD~4; it showed lines as having changed that…didn’t change at all.
    • Too much 80%-done, hypothetically useful functionality for me to use as a foundation, but props for sure to the undergrad/student(?) who’s built all this in just three months.
  • diffast: tree edit-distance of ASTs based on an algorithm from a 2008 academic paper.

  • autochrome: Clojure-specific diffs based on dynamic programming

    • excellent visual explanation and example walkthrough
  • Tristan Hume has a great article on Designing a Tree Diff Algorithm Using Dynamic Programming and A*

My primary use case is reviewing LLM output turn-by-turn — I’m very much in-the-loop, and I’m not letting my agent (or dozens of them, lol) run wild generating 10k+ lines of code at a time.

Rather, I give an agent a scoped task, then come back in a few minutes and want to see an overview of what it did and then either revise/tweak it manually in Emacs or throw the whole thing out and try again (or just write it myself).

The workflow I want, then, is to

  • see a high-level overview of the diff: what entities (types/functions/methods) were added/removed/changed?
  • quickly see textual diffs on an entity-by-entity basis (“expanding” parts of the above summary)
  • quickly edit any changes, without having to navigate elsewhere (i.e., do it inline, rather than having to switch from “diff” to “file”)

Basically, I want something like Magit’s workflow for reviewing and staging changes, but on an entity level rather than file/line level.

In light of the “minimal scope, just get your project done” lesson I’ve just re-learned for the nth time, my plan is to:

  • throw together my own treesitter-based entity extraction framework (just Rust for now)
  • do some simple greedy matching for now
  • render the diff to the command line

Once that seems reasonable (i.e., it does a better job than difftastic did on that specific commit), I’ll:

  • wire into a more interactive Magit-like Emacs workflow (maybe I can reuse Magit itself!?!)
  • add support for new languages, as I need them
  • potentially explore more sophisticated score-based global matching rather than simple greedy matching

Mayyybe if I’m happy with it I’ll end up releasing something. But I’m not trying to collect Github stars or HN karma, so I might just happily use it in the privacy of my own home without trying to “commercialize it”.

After all, sometimes I just want a shelf.
