Duplication is far cheaper than the wrong abstraction.
Prefer duplication over the wrong abstraction (2016)

Original link: https://sandimetz.com/blog/2016/1/20/the-wrong-abstraction

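In this classic reflection on software design, Sandi Metz argues that "duplication is far cheaper than the wrong abstraction." The trap of the wrong abstraction is that developers, trying to force a slightly different new requirement into an existing shared method, tend to pile on parameters and conditionals. Over time the code becomes fragile and hard to understand. Under the influence of the sunk cost fallacy (the belief that code must be kept because so much effort went into it), programmers keep patching these abstractions, and the system grows steadily harder to maintain.

Metz proposes a bold remedy: when an abstraction fails, the fastest way forward is back. Re-introduce duplication by inlining the abstracted code into each caller, then strip out the logic that caller does not need. Removing the wrong abstraction makes each caller's distinct needs explicit, which often points the way to a better, more natural abstraction.

Ultimately, code should serve current requirements, not past decisions. If you find yourself complicating a shared method with more and more conditionals, abandon the abstraction. Stepping back to clear out technical debt is not retreat; it is the most effective way to make progress.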

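Whether code duplication beats "the wrong abstraction" has long been contentious among developers. Supporters argue that premature or incorrect abstraction produces complex, rigid systems that are far harder to maintain than simple duplicated code. They hold that duplication preserves flexibility and that refactoring is safer once a clear pattern has emerged.

Many commenters disagree, however, pointing out that this "cheap" approach creates serious long-term technical debt. Critics argue that failing to abstract at scale drains a team's energy and leads to maintenance disasters, especially when relying on large language models (LLMs), which struggle with inconsistent, duplicated logic. Others note that modern functional programming or rigorous data modeling can often avoid the need for harmful abstractions altogether.

Ultimately, the consensus is that the "right" approach depends heavily on context. Over-engineering is widely seen as a trap, but under-engineering is just as dangerous. The most successful strategies appear to be those that favor simplicity and reversibility: merge code only once a pattern is truly established, rather than forcing an abstraction onto a problem that is not yet fully understood.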

Original article

I originally wrote the following for my Chainline Newsletter, but I continue to get tweets about this idea, so I'm re-publishing the article here on my blog. This version has been lightly edited.


I've been thinking about the consequences of the "wrong abstraction." My RailsConf 2014 "all the little things" talk included a section where I asserted:

duplication is far cheaper than the wrong abstraction

And in the summary, I went on to advise:

prefer duplication over the wrong abstraction

This small section of a much bigger talk provoked a surprisingly strong reaction. A few folks suggested that I had lost my mind, but many more expressed agreement.

The strength of the reaction made me realize just how widespread and intractable the "wrong abstraction" problem is. I started asking questions and came to see the following pattern:

  1. Programmer A sees duplication.

  2. Programmer A extracts duplication and gives it a name.

    This creates a new abstraction. It could be a new method, or perhaps even a new class.

  3. Programmer A replaces the duplication with the new abstraction.

    Ah, the code is perfect. Programmer A trots happily away.

  4. Time passes.

  5. A new requirement appears for which the current abstraction is almost perfect.

  6. Programmer B gets tasked to implement this requirement.

    Programmer B feels honor-bound to retain the existing abstraction, but since it isn't exactly the same for every case, they alter the code to take a parameter, and then add logic to conditionally do the right thing based on the value of that parameter.

    What was once a universal abstraction now behaves differently for different cases.

  7. Another new requirement arrives.
    Programmer X.
    Another additional parameter.
    Another new conditional.
    Loop until code becomes incomprehensible.

  8. You appear in the story about here, and your life takes a dramatic turn for the worse.
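
The pattern above can be sketched in code. This is a hypothetical illustration, not an example from the talk: the function `format_name`, its flags, and the three callers are all invented here, and Python is used only for convenience.

```python
# Hypothetical illustration; none of these names come from the original article.

# Steps 1-3: Programmer A extracts the duplicated formatting into one function.
def format_name_original(first, last):
    return f"{first} {last}"


# Steps 5-7: every "almost perfect" requirement adds a parameter and a
# conditional, so the shared function now behaves differently per caller.
def format_name(first, last, last_name_first=False, uppercase=False,
                initial_only=False):
    if initial_only:
        first = first[0] + "."
    if last_name_first:
        result = f"{last}, {first}"
    else:
        result = f"{first} {last}"
    if uppercase:
        result = result.upper()
    return result


# Each caller exercises a different path through the shared code.
def badge_label(first, last):
    return format_name(first, last, uppercase=True)


def directory_entry(first, last):
    return format_name(first, last, last_name_first=True)


def citation_author(first, last):
    return format_name(first, last, last_name_first=True, initial_only=True)
```

Three boolean flags already allow eight combinations, while the callers use only three of them. Anyone changing `format_name` has to reason about all eight.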

Existing code exerts a powerful influence. Its very presence argues that it is both correct and necessary. We know that code represents effort expended, and we are very motivated to preserve the value of this effort. And, unfortunately, the sad truth is that the more complicated and incomprehensible the code, i.e. the deeper the investment in creating it, the more we feel pressure to retain it (the "sunk cost fallacy"). It's as if our unconscious tells us, "Goodness, that's so confusing, it must have taken ages to get right. Surely it's really, really important. It would be a sin to let all that effort go to waste."

When you appear in this story in step 8 above, this pressure may compel you to proceed forward, that is, to implement the new requirement by changing the existing code. Attempting to do so, however, is brutal. The code no longer represents a single, common abstraction, but has instead become a condition-laden procedure which interleaves a number of vaguely associated ideas. It is hard to understand and easy to break.

If you find yourself in this situation, resist being driven by sunk costs. When dealing with the wrong abstraction, the fastest way forward is back. Do the following:

  1. Re-introduce duplication by inlining the abstracted code back into every caller.
  2. Within each caller, use the parameters being passed to determine the subset of the inlined code that this specific caller executes.
  3. Delete the bits that aren't needed for this particular caller.

This removes both the abstraction and the conditionals, and reduces each caller to only the code it needs. When you rewind decisions in this way, it's common to find that although each caller ostensibly invoked a shared abstraction, the code they were running was fairly unique. Once you completely remove the old abstraction you can start anew, re-isolating duplication and re-extracting abstractions.
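
A minimal sketch of that rewind, under assumed names: `total_price`, its flags, and both callers are hypothetical and not from the original article. Amounts are integer cents to keep the arithmetic exact.

```python
# Hypothetical illustration; none of these names come from the original article.

# The shared abstraction after it has accumulated flags.
def total_price(amount_cents, is_member=False, add_tax=False):
    if is_member:
        amount_cents = amount_cents * 90 // 100   # 10% member discount
    if add_tax:
        amount_cents = amount_cents * 120 // 100  # 20% tax
    return amount_cents


# Step 1: inline the whole body into a caller, with the parameters fixed
# to the values that caller used to pass.
def checkout_total_inlined(amount_cents):
    is_member, add_tax = False, True
    if is_member:
        amount_cents = amount_cents * 90 // 100
    if add_tax:
        amount_cents = amount_cents * 120 // 100
    return amount_cents


# Steps 2-3: keep only the branch this caller executes and delete the rest.
def checkout_total(amount_cents):
    return amount_cents * 120 // 100


# The same three steps applied to a second caller leave different code behind.
def member_quote(amount_cents):
    return amount_cents * 90 // 100
```

Once every caller has been reduced this way, `total_price` has no users left and can be deleted. In this sketch the two callers end up sharing no logic at all, which is the kind of discovery the paragraph above describes.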

I've seen problems where folks were trying valiantly to move forward with the wrong abstraction, but having very little success. Adding new features was incredibly hard, and each success further complicated the code, which made adding the next feature even harder. When they altered their point of view from "I must preserve our investment in this code" to "This code made sense for a while, but perhaps we've learned all we can from it," and gave themselves permission to re-think their abstractions in light of current requirements, everything got easier. Once they inlined the code, the path forward became obvious, and adding new features became faster and easier.

The moral of this story? Don't get trapped by the sunk cost fallacy. If you find yourself passing parameters and adding conditional paths through shared code, the abstraction is incorrect. It may have been right to begin with, but that day has passed. Once an abstraction is proved wrong the best strategy is to re-introduce duplication and let it show you what's right. Although it occasionally makes sense to accumulate a few conditionals to gain insight into what's going on, you'll suffer less pain if you abandon the wrong abstraction sooner rather than later.

When the abstraction is wrong, the fastest way forward is back. This is not retreat, it's advance in a better direction. Do it. You'll improve your own life, and the lives of all who follow.

News: 99 Bottles of OOP in JS, PHP, and Ruby!

The 2nd Edition of 99 Bottles of OOP has been released!

The 2nd Edition contains 3 new chapters and is about 50% longer than the 1st. Also, because 99 Bottles of OOP is about object-oriented design in general rather than any specific language, this time around we created separate books that are technically identical, but use different programming languages for the examples.

99 Bottles of OOP is currently available in Ruby, JavaScript, and PHP versions, and beer and milk beverages. It's delivered in epub, kepub, mobi and pdf formats. This results in six different books and (3x2x4) 24 possible downloads; all unique, yet still the same. One purchase gives you rights to download any or all.
