(Comments)

Original link: https://news.ycombinator.com/item?id=43166362

The second edition of *Code Complete* is criticized for several outdated and problematic approaches. Key issues include: overreliance on the construction analogy, near-total neglect of open-source software and source control, and a quasi-waterfall development model with a rigid division of labor. The book advocates detailed up-front design while neglecting iterative user-interface design and automated testing. The author also makes dubious claims about the cost of fixing defects in later phases. In addition, the book wrongly assumes that all software development is object-oriented, and misunderstands core OO concepts. Finally, the author endorses useless IEEE management process standards. While *Code Complete* is not entirely without merit, its flaws and verbosity make it inferior to other, more modern resources.

Related articles
  • (Comments) 2024-08-30
  • Clean Code vs. A Philosophy of Software Design 2025-02-26
  • (Comments) 2024-09-09
  • (Comments) 2024-08-09
  • (Comments) 2024-09-17

  • Original article

    I read some of the more questionable parts of the second edition just now, and while I think there are a lot of things to criticize in it, I suspect that it's just rose-tinted hindsight, or ignorance, that made me not object to them in the first edition; I'm pretty sure most of these problems were already there:

    - All the time wasted on the dumb "construction" analogy.

    - The total lack of attention to open-source software, possibly because he was misled by his own "construction" analogy. There's no such thing as a freely redistributable cabinet that's extra reliable because your house shares it with the local nuclear reactor, or a rotten floor joist you can't fix without negotiating a source license. (He does discuss buying libraries, just as you can buy cabinets instead of building them.) Though this was surely also lacking in the first edition, it was a more forgivable oversight in 01994.

    - Very little attention given to automated testing; we don't get to "developer testing" until chapter 22, and even that's mostly about manual testing, though there are a few offhand remarks in §4.4 and §9.4 about unit testing and test-first programming, with no explanation of what that means. Even when he tries to explain "test-first programming" in §22.2, there's no hint that we're talking about automated testing. And we finally see mentions of "test code" and "JUnit" in §22.4, and then §22.5 and §22.6 have information about actual automated testing, though without any actual test code. The advice on test-case design is still excellent.
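    To make concrete what "test-first programming" actually means (since the book never quite explains it): the tests are written first, as executable code, and the implementation exists only to make them pass. A minimal sketch of my own, not from the book:

```python
def leading_spaces(line):
    """Count leading spaces -- written only after the tests below existed."""
    return len(line) - len(line.lstrip(" "))

def test_leading_spaces():
    # These assertions are the executable specification; in a real project
    # they'd live in an automated runner like pytest or JUnit.
    assert leading_spaces("") == 0
    assert leading_spaces("x") == 0
    assert leading_spaces("   x") == 3
    assert leading_spaces("    ") == 4

test_leading_spaces()
```

    The point is that the tests run automatically, every time, with no human clicking through screens, which is exactly the distinction the chapter blurs.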

    - Also, very little about source control. I know this was a deficiency in the first edition because I remember the revelation of learning about RCS a couple of years after I read it. I think it actually got better in the second edition; there's a little "Version Control" section in §30.2 which refers you to §28.2, "Configuration Management", which talks a little bit about the problem but doesn't mention Subversion (first released October 02000), CVS (01990), RCS (01982), SCCS (01973), or even Visual SourceSafe (01994). Instead, it mostly describes implementing similar processes manually, through bureaucracy, because this is the "Managing Construction" chapter. But there is technically half a page on p. 668 singing the praises of version-control software, calling it "indispensable on team projects", which manages to not mention a single program you could use for it. An understandable oversight in 01994, unforgivable in 02004, but again, incompetence rather than snake oil.

    - Although he pays lip service at the beginning of the book to the independence of project phases and project activities, he often conflates them later, often presuming a quasi-waterfall model (when he isn't outright advocating it), where a requirements-analysis phase is followed by an architecture phase, then a detailed design phase, then a "construction" phase, then a testing phase, and then finally a maintenance phase. This is obviously completely unlike the reality of projects like Microsoft Windows, Emacs, Linux, GCC, and Facebook. When did Facebook mostly move from detailed design to construction? Would it have been a better social-networking website if it had spent a year or two on architecture before beginning "construction"? He does kind of go back and forth on this a lot, though, sometimes advocating more incremental approaches and then contradicting himself a page later.

    - Relatedly, he advocates a division of labor where "the architect consumes the requirements; the designer consumes the architecture; and the coder consumes the design." (Traditionally, though he doesn't say this, the QA tester then consumes the code.) This division of labor has been tried many times, and the companies that have tried it have been mostly outcompeted by companies with less dysfunctional divisions of labor; they mostly survive only in niches where they have legally enforceable monopolies, such as DoD cost-plus prime contractors. None of them have been able to produce products of quality comparable to things like Linux, GCC, and Facebook. I think this is the snake-oiliest part of the book.

    - Code Complete's table of "Average Cost of Fixing Defects Based on When They're Introduced and Detected", table 3-1, is convincing, compelling, thoroughly footnoted with decades of literature, and completely made up. See https://softwareengineering.stackexchange.com/questions/1637... https://web.archive.org/web/20121101231451/http://blog.secur... https://www.lesswrong.com/posts/4ACmfJkXQxkYacdLt/diseased-d... https://gist.github.com/Morendil/258a523726f187334168f11fc83.... This made-up data is McConnell's major justification for advocating waterfall-like models. More recent research that investigates the question empirically instead of relying on made-up hearsay finds, by contrast, "We found no evidence for the delayed issue effect; i.e., the effort to resolve issues in a later phase was not consistently or substantially greater than when issues were resolved soon after their introduction." https://arxiv.org/pdf/1609.04886 https://agilemodeling.com/essays/costofchange.htm https://buttondown.com/hillelwayne/archive/i-ing-hate-scienc....

    - The section about "user interface design" is cringe-inducingly bad. He thinks you can design a good user interface up front without having working software ("The user interface is often specified at requirements time. If it isn't, it should be specified in the software architecture,") rather than incrementally responding to usability feedback from people using a working system. It's a very short section, and that in itself is eyebrow-raising; usability is a central concern of most kinds of software, and one of the most challenging aspects of software. Really, almost everything in most software should be driven ultimately by user experience and grounded out in usability testing. Games, websites, browsers, and even compilers live and die on usability. But McConnell treats it as one minor detail among many.

    - The section about the "architecture prerequisite" sounds like it was written by IBM mainframe programmers in 01978, then decorated with some OO and WWW jargon. Yes, clearly the architecture should "describe the major files and table designs to be used". That makes sense. Yes, "Input/output (...) is another area that deserves attention in the architecture. The architecture should specify a read-ahead, read-behind, or just-in-time reading scheme." I mean, seriously? Note that words like "client", "server", "tier", "cache", "protocol", "network", "message", "queue", and even "process" (as in a running instance of a program) are completely missing here. It's not that he uses different terms for them; he just doesn't talk about them at all, using any words.

    - He tries to discuss "fault tolerance" with a totally nonsensical example "the square root of a number", necessarily making complete hash of the topic as a result. He doesn't mention any of the techniques that actually work for achieving fault-tolerance, such as statelessness, idempotence, the end-to-end principle, transactions, journaling, fail-stopness, checksums, disk mirroring, hardware trimodular redundancy, watchdog timers, ECC, anomaly detection, network timeouts, monitoring, alarms, etc. The only exception is that he sort of mentions granular restarts. I'm restricting myself to techniques that were well-known when he wrote the first edition of the book here, excluding things like Paxos, eventual consistency, and Merkle graphs.
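    For contrast with the square-root example, here is a toy sketch (mine, with hypothetical names, not anything from the book) of two of the techniques listed above that actually do appear in fault-tolerant systems: checksums to detect corruption and fail-stop, plus bounded retries of an idempotent operation:

```python
import hashlib
import time

def checksum(data):
    # Detect corruption: store data alongside a digest, verify on read.
    return hashlib.sha256(data).hexdigest()

def load(blob, expected):
    if checksum(blob) != expected:
        raise IOError("corrupt data")  # fail-stop: refuse to return garbage
    return blob

def retry(op, attempts=3, delay=0.01):
    # Safe only because `op` is idempotent: running it twice is harmless.
    for i in range(attempts):
        try:
            return op()
        except IOError:
            if i == attempts - 1:
                raise
            time.sleep(delay)

blob = b"hello"
digest = checksum(blob)
assert retry(lambda: load(blob, digest)) == b"hello"
```

    Note that both techniques are about the system's observable behavior under failure, not about validating one function's argument, which is why the square-root framing can't reach them.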

    - There are a lot of cases where he repeats something he's heard that he evidently doesn't understand. The muddled attempt to explain fault tolerance above is one example, but we could also mention, for example, his attempt in §4.1 to describe Fortran programmers writing Fortran in C++, which completely misses the actual major difficulty (it's mostly about structuring the data as arrays, not the control flow), or his remark, "Assembler is regarded as the second-generation language," devoid of the historical context to provide any meaning to it.

    - One problem that I think is actually new in the second edition is its presumption that all software is object-oriented (despite paying lip service to the fact that Visual Basic [6] was the most popular language among professional programmers, many people were still programming in Ada and assembly and Cobol and C and Fortran, etc.) and that looks a bit snake-oilier from our perspective now than it did at the time. I think OO is a useful approach to software design, but if I'm writing a generic tutorial on how to design a program, I wouldn't have a step in it called "Level 3: Division into Classes" as McConnell does in §5.2, because that makes my book completely inapplicable to programming in C, Go, Fortran, Rust, Racket, VB6, or Clojure, and inapplicable to much of what people do in Python, PHP, JS, Octave, and R. The snake oil here is not object-orientation but a totalizing ideology that everything must be OO; the Chapter 6 introduction says, "In the twenty-first century, programmers think about programming in terms of classes." The way I remember it, the first edition didn't have this problem.

    - This totalizing OO outlook is somewhat exacerbated by the fact that he doesn't really understand object orientation at all, so he gives a lot of bad advice, like, "A large percentage of routines in object-oriented programs will be accessor routines, which will be very short," §7.4. His whole chapter 6 is about designing classes, but he never mentions the actual core concept of object-orientation, which is polymorphic message sends, presumably because although he knows they exist, he isn't really comfortable with them and doesn't understand how central they are to the OO worldview. Instead he treats classes as a newfangled synonym for CLU's "clusters" or Ada "packages". Much of the chapter is devoted to workarounds for shortcomings of C++. This isn't really "snake oil," just incompetence.
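    To illustrate the missing core concept, here's a polymorphic message send in a few lines (my sketch, not the book's): the caller sends `area()` without knowing or caring which class responds, so there are no accessors and no type checks.

```python
import math

class Circle:
    def __init__(self, r):
        self.r = r
    def area(self):
        return math.pi * self.r ** 2

class Square:
    def __init__(self, side):
        self.side = side
    def area(self):
        return self.side ** 2

# One message, many receivers: no getters, and adding a Triangle class
# later requires touching none of this existing code.
shapes = [Circle(1), Square(2)]
total = sum(shape.area() for shape in shapes)
```

    In the accessor-heavy style the book predicts, the loop would instead fetch each shape's raw fields and branch on its type, which is precisely what polymorphism exists to avoid.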

    - Another problem that I'm pretty sure wasn't present in the first edition is the counterproductive recommendations of worthless IEEE management process standards in virtually every chapter. This is kind of snake-oily; these standards are of abominable literary quality and contain no useful information that could conceivably help anyone to improve the software they write. As a representative emetic example, check out IEEE 1028, recommended in Chapter 20 and again in Chapter 21. http://profs.etsmtl.ca/claporte/english/enseignement/cmu_sqa.... Unlike some other IEEE management process standards I've had the misfortune of reading, it at least doesn't seem to contain any misinformation, but that's because it manages to spend 47 pages saying nothing at all about software.

    There's still much material in the book that's solid, and lots of references to good information, but it's mixed with a lot of serious misinformation, misleading analogies, and embarrassing incompetence. And it's a slog to get through so much verbiage. But it's certainly better than Clean Code. Still, now that The Practice of Programming, The Pragmatic Programmer, and A Philosophy of Software Design are out, I think there's no longer any reason to recommend Code Complete.

    I don't think it's fair to dismiss criticism because of skill. I said I think it's a bad book, and my most disliked, but it's not worthless, and if it's all someone has, they can indeed learn things from it even if they won't learn very much per page. However, there are many other books available, and in my opinion all of them that I've read are superior. Any value you'd get from APOSD, you'd get from any book aimed at or inclusive of a similar audience, and the other book would give even more value that's absent from APOSD. (As another example, I was introduced to The Pragmatic Programmer in college. I believe it can serve the role of APOSD just fine, but I never liked it enough to finish it, so perhaps I'd rank it lower if I did.) I also think you'd get most of the value just by writing more programs.

    Anyway, the favored book I did highlight, The Practice of Programming, shares some things with APOSD: it's also not academic, is also quite short (maybe 70 pages longer), and is also more productively read earlier in one's career or study, but it's still appreciable by those with more experience. You'll learn things about design. But it has so much more than APOSD: you'll learn things about implementation and debugging and considerations for libraries for yourself or others rather than just applications, and so much more in so few pages; just lots of things central to writing programs, which is the fundamental task at the end of the day, more so than just "designing" things.

    I guess another complaint is that APOSD just doesn't have enough code in it. And perhaps an implicit philosophy I have is that you can't actually master good design without writing good code. Learning at the feet of masters is a good way to learn, but they actually have to teach by example. To that end, The Practice of Programming has many programs as examples (like a markov chain text generator, written in multiple languages with performance and effort-of-writing comparisons) and invites the reader to do various exercises (like commenting on comments, or rewriting part of an example to use a different implementation decision and compare the different approaches).

    When that book happens to make a claim I agree with, I don't tend to also just dismiss it as a platitude, because it's better argued and reasoned (or argued and reasoned at all), and supported and contains even more information to consider. Let's expand the bit I quoted about interfaces from APOSD; it's actually from the section on exceptions.

    "Defining away exceptions, or masking them inside a module, only makes sense if the exception information isn't needed outside the module. ... However, it is possible to take this idea too far. In a module for network communication, a student team masked all network exceptions: if a network error occurred, the module caught it, discarded it, and continued as if there were no problem. This meant that applications using the module had no way to find out if messages were lost or a peer server failed; without this information, it was impossible to build robust applications. In this case, it is essential for the module to expose the exceptions, even though they add complexity to the module's interface. With exceptions, as with many other areas in software design, you must determine what is important and what is not important. Things that are not important should be hidden, and the more of them the better. But when something is important, it must be exposed (Chapter 21 will discuss this topic in more detail)."

    I find the student example here pretty weak, but it'd be stronger if the actual code was shown and developed, especially if done in a context where it's understandable how the students might have thought it was a good idea at first, rather than just making an obvious mistake because they're students. Chapter 21 does discuss things in more detail, but not much more, and again there are no code examples much beyond pointing back to a prior chapter's dozen lines of strawman Java. It starts off with:
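    Since the book doesn't show the code, here's roughly what I imagine the student mistake looks like, reduced to a sketch with hypothetical names (this is my reconstruction, not anything from either book):

```python
class NetworkError(Exception):
    pass

def flaky_send(msg, fail=False):
    # Stand-in for a real network call that can fail.
    if fail:
        raise NetworkError("peer unreachable")
    return True

def send_masked(msg, fail=False):
    # The students' version: catch, discard, carry on. The caller can't
    # distinguish a delivered message from a lost one.
    try:
        return flaky_send(msg, fail)
    except NetworkError:
        pass

def send_exposed(msg, fail=False):
    # The honest version: a more complex interface, but callers can react.
    return flaky_send(msg, fail)

assert send_masked("hello", fail=True) is None   # silently "succeeds"
try:
    send_exposed("hello", fail=True)
    reached = False
except NetworkError:
    reached = True                                # caller can retry or alert
assert reached
```

    Written out like this, the masked version's failure mode is obvious, which is exactly why the example would land harder with actual code behind it.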

    "One of the most important elements of good software design is separating what matters from what doesn't matter. Structure software systems around the things that matter. For the things that don't matter as much, try to minimize their impact on the rest of the system. Things that matter should be emphasized and made more obvious; things that don't matter should be hidden as much as possible."

    Does that not read to you as terribly verbose and information sparse? Capable of eliciting a "duuuuuh" even from a beginner programmer? Almost tautological even? The rest of the chapter is similar and doesn't actually give much more information at all. Sure there are a few tidbits of use in there, like the idea of "leverage" and what that means as an approach, and a throw-away line that deserved more elaboration about shallow classes needlessly increasing what seems "important". (Yegge's "Execution in the Kingdom of Nouns" post is a good expansion of that and other things, if it's at all helpful to understand examples of what I find valuable in comparison to this book.)

    Let's compare now some similar bits from The Practice of Programming. This comes as a partial summary after a worked section on designing an interface for parsing CSV files in C and C++ with many design decisions detailed and discussed.

    "Good interfaces follow a set of principles. These are not independent or even consistent, but they help us describe what happens across the boundary between two pieces of software. *Hide implementation details.* The implementation behind the interface should be hidden from the rest of the program so it can be changed without affecting or breaking anything. There are several terms for this kind of organizing principle: information hiding, encapsulation, abstraction, modularization, and the like all refer to related ideas. An interface should hide details of the implementation that are irrelevant to the client (user) of the interface. Details that are invisible can be changed without affecting the client, perhaps to extend the interface, make it more efficient, or even replace its implementation altogether. The basic libraries of most programming languages provide familiar examples, though not always especially well-designed ones. The C standard I/O library is among the best known: a couple of dozen functions that open, close, read, write, and otherwise manipulate files. The implementation of file I/O is hidden behind a data type FILE*, whose properties one might be able to see (because they are often spelled out in <stdio.h>) but should not exploit."

    If you squint, it kind of says much the same thing, right? But it's richer, includes whys, and points to a real-life example, not a student project. It also criticizes the C I/O library right after because of its exposure of publicly visible data.
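    A toy version of the same idea (mine, in Python rather than the book's C): the client sees only a small interface and an opaque handle, so the storage behind it can change without breaking callers, just as with FILE*.

```python
class CsvReader:
    """Opaque handle, like FILE*: clients use the methods, not the fields."""

    def __init__(self, text):
        self._lines = text.splitlines()  # private: could become a file,
        self._pos = 0                    # a socket, or an mmap later

    def next_row(self):
        # Returns the next row, or None at end of data (a return value,
        # not an exception -- end of input is expected).
        if self._pos >= len(self._lines):
            return None
        row = self._lines[self._pos].split(",")
        self._pos += 1
        return row

rows = []
r = CsvReader("a,b\n1,2")
while True:
    row = r.next_row()
    if row is None:
        break
    rows.append(row)
assert rows == [["a", "b"], ["1", "2"]]
```

    The underscored fields are the "details that are invisible": swapping `_lines` for lazy file reads changes nothing for any caller of `next_row`.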

    More on the topic of exceptions, the book takes a rather classic approach that I don't fully endorse ("Use exceptions only for exceptional situations"), but one unique bit is a more thorough treatment of handling errors without having to alter control flow, and why that might be important. In the markov generator program, one worry is that there might not be enough input to start the algorithm. One could exit prematurely (with a special value or an exception) but the book chooses instead to do some padding to ensure the problem goes away. Emphasis mine:

    "Adding a few NONWORDs to the ends of the data simplifies the main processing loops of the program significantly; it is an example of the technique of adding sentinel values to mark boundaries. As a rule, try to handle irregularities and exceptions and special cases in data. Code is harder to get right so the control flow should be as simple and regular as possible."

    You don't have to take this rule as given, you immediately see it in action, and an exercise later invites you to re-implement without a sentinel value to compare.
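    The NONWORD trick compresses nicely into a sketch (a simplified Python rendering of the idea, not the book's C code): pad the input with sentinels so the main loop never needs special cases for "not enough input yet" or "end of input".

```python
NONWORD = "\n"  # a token that cannot appear in real input

def build_chain(words, prefix_len=2):
    # Sentinels at both ends: the loop below has no if-statements for the
    # start or end of the data -- the irregularity lives in the data itself.
    padded = [NONWORD] * prefix_len + list(words) + [NONWORD]
    chain = {}
    for i in range(len(padded) - prefix_len):
        prefix = tuple(padded[i:i + prefix_len])
        chain.setdefault(prefix, []).append(padded[i + prefix_len])
    return chain

chain = build_chain("the quick brown fox".split())
# Even a one-word input works, with no special-case code:
assert build_chain(["hi"])
```

    Removing the sentinels, as the exercise invites, forces the loop to grow explicit boundary checks, which is the comparison the book wants you to feel.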

    APOSD has an entire chapter on errors, but this idea is only barely hinted at there, via the idea of defining errors out of existence (it uses a more controversial example, I think, from TCL) and this bit that clarifies that by "exception" he doesn't necessarily mean a stack-unwinding thing: "However, exceptions can occur even without using a formal exception reporting mechanism, such as when a method returns a special value indicating that it didn't complete its normal behavior. All of these forms of exceptions contribute to complexity."

    It's just such a shallow treatment, and I think that last bit is more focused on the other basic idea that Practice of Programming spells out:

    "Exceptions should not be used for handling expected return values. Reading from a file will eventually produce an end of file; this should be handled with a return value, not by an exception."

    That's followed by a code example showcasing said behavior that doubles as a less-strawman swipe at classical Java. (The Java code loops in.read() until it's -1, and has separate exception handlers for a file not found exception, which the book thinks isn't all that exceptional, and a generic IOException.) But to APOSD, it doesn't seem to matter, they all just contribute to complexity. Maybe they contribute to different degrees? (This would require an objective definition of complexity that lets you count the twists, though.) Maybe leveraging the type system (if you have such a language) to define away errors should be mentioned? Maybe (though this one is truly a rhetorical fever dream wish) acknowledgement of Common Lisp's condition system as yet another powerful alternative should be given?
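    The distinction the book draws translates directly (my sketch, in Python rather than the book's Java): end-of-file is an expected outcome, reported in-band by a return value, while genuinely exceptional conditions are reported out-of-band.

```python
import io

def count_bytes(stream):
    # EOF arrives as an empty read -- an expected return value, handled in
    # the normal control flow, not caught as an exception.
    n = 0
    while True:
        chunk = stream.read(4096)
        if not chunk:
            return n
        n += len(chunk)

assert count_bytes(io.BytesIO(b"hello")) == 5
```

    Whether "file not found" also deserves in-band treatment is exactly the judgment call under debate; the point is that lumping every such case into undifferentiated "complexity" gives the designer no way to weigh them.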
