Code Size is the Enemy?

“Size is the Enemy” is a Jeff Atwood (Coding Horror) article that has made the rounds lately. It’s a response to a rambling rant by Steve Yegge about how large code bases are unwieldy, and how language choice affects the viability of software projects beyond a certain size. Early on, Jeff calls out a quote that I, too, find really amusing:

I happen to hold a hard-won minority opinion about code bases. In particular I believe, quite staunchly I might add, that the worst thing that can happen to a code base is size.

I don’t think I’ve ever encountered a programmer who truly believed that “bigger is better” with regard to codebase size. Granted, the crowd I tend to run with isn’t exactly a bunch of Perl golfers either, but no reasonable person would doubt that complexity scales superlinearly with lines of code. So, at least on the point that codebase size should be minimized, I find myself in agreement with both authors: if you can reduce the size of your code while maintaining readability, it’s an easy win. For example, on Neverwinter Nights 2, we drastically reduced the number of scripts required by adding a parameter-passing mechanism to NWScript. I actually ran the numbers and discovered that 75% of the script calls in conversations used our new, parameterized scripts rather than the one-off style of scripts from NWN1. This is a great example of reducing code size without reducing functionality, and it made things much more manageable.
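To make the one-off versus parameterized distinction concrete, here is a rough sketch in Python rather than NWScript. Every name in it (Player, give_gold, run_conversation_action) is made up for illustration and is not part of NWScript or the NWN2 toolset; it only shows the general shape of the idea.

```python
# Hypothetical illustration only: these names are not NWScript or NWN2 APIs.
from dataclasses import dataclass


@dataclass
class Player:
    gold: int = 0


# One-off style: a separate script for every amount a conversation might need.
def give_gold_10(player: Player) -> None:
    player.gold += 10


def give_gold_50(player: Player) -> None:
    player.gold += 50


# Parameterized style: one script, with the amount supplied by the caller.
def give_gold(player: Player, amount: int) -> None:
    player.gold += amount


# A conversation node names a script plus its arguments, not a unique script.
SCRIPTS = {"give_gold": give_gold}


def run_conversation_action(player: Player, script: str, *args: int) -> None:
    SCRIPTS[script](player, *args)


hero = Player()
run_conversation_action(hero, "give_gold", 10)
run_conversation_action(hero, "give_gold", 50)
```

With the one-off style, every new amount means a new script to write, name, and keep track of. With the parameterized style, the script count stays flat while the conversations vary the arguments.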

I take issue with the second conclusion of Yegge’s article, though. While it’s true that functional languages can often express algorithms more eloquently than imperative languages, I don’t think this magically translates into huge maintainability wins in a monster codebase. Sure, the trivial example of reading lines out of a file makes other languages look great in comparison to hoary old C++. (I’m not sure if I’m allowed to call Java “hoary old Java” yet — after all, it’s only 12 years old or so.) But to hold up this example as proof of language superiority is missing the bigger picture — what makes an application distinctive is not how it reads lines out of a text file, but rather all the other stuff it does that no other piece of software does. The design of that “stuff” probably has more to do with maintainability than language choice ever could.

The biggest factor in the complexity of a code base, in my opinion, is the complexity of its internal interfaces. I hate to trot out the quixotic concept of the “software IC”, but thinking about an interface in terms of building it onto a chip is a decent analogy in this case. Once you get beyond a certain number of “pins,” or command multiplexers, or what have you, things get complicated. When you have 500,000 lines of this kind of “complicated,” all interacting with each other in mysterious ways, you have big trouble.

Now, I’m not advocating going gonzo with componentization, either, in spite of how delicious ravioli code sounds. I’ve seen and heard about way too many over-engineered, CORBA-gone-wild projects to make any kind of blanket statements in support of component architectures and stuff like that. But by keeping interfaces (whether internal to the code, or part of a component system) minimalist in nature, and designing them so that doing the right thing is easy, and doing the wrong thing is hard, you can maintain a good level of understandability in a codebase. I see no reason why this advantage does not scale with project size. Dividing a codebase into easily testable, well-defined components with simple interfaces is key, particularly for lone-wolf developers.
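As a small sketch of what "the right thing is easy, the wrong thing is hard" can look like, consider a quest journal whose entries must be saved as a group. The journal, its names, and the all-or-nothing rule are assumptions invented for this example, not code from any real project.

```python
from contextlib import contextmanager
from typing import Iterator, List

# Hypothetical example: none of these names come from a real engine or library.
_saved_entries: List[str] = []


class _JournalDraft:
    """Pending journal entries; callers only get one through journal_update()."""

    def __init__(self) -> None:
        self.pending: List[str] = []

    def add(self, text: str) -> None:
        self.pending.append(text)


@contextmanager
def journal_update() -> Iterator[_JournalDraft]:
    """The only entry point: entries are saved together or not at all."""
    draft = _JournalDraft()
    yield draft
    # Reached only if the caller's block finished without raising.
    _saved_entries.extend(draft.pending)


def saved_entries() -> List[str]:
    """Return a copy so callers cannot edit the saved list behind our back."""
    return list(_saved_entries)


with journal_update() as journal:
    journal.add("Met the innkeeper")
    journal.add("Accepted the rat quest")
```

The interface has two "pins": journal_update() and saved_entries(). There is no separate commit call to forget, and a block that fails halfway leaves nothing half-written.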

I find it very interesting that the project in Yegge’s article (Wyvern) is not just a game, but a role-playing game. I know first-hand that RPG rules systems can, by their nature, necessitate a kind of code design that leads to massive complexity. RPGs tend to carry a massive amount of state around, and have rules systems that interact with that state in often arbitrary ways. (An example of this would be a character trait that changes the order of combat resolution, or some kind of “luck” trait that allows re-rolls of certain types of skill checks. If you have many rule-changers like this, it’s almost impossible to write a clean system to handle it.) This kind of complexity is design complexity, and has nothing to do with programming languages and the features they support. For this reason, even if he succeeds in his goal of removing 33% of Wyvern’s lines of code, I don’t think that the resulting code base will be any easier to maintain. (His game is a 2D Java RPG, so I don’t think that it’s a stretch to say that the majority of his code is going to be related to game rules and game content.)
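Here is a tiny sketch of the “luck” re-roll case, to show where the trouble starts. The rules, the trait name, and the Character class are invented for illustration; they are not taken from Wyvern, NWN2, or any published rules system.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, Set

Roller = Callable[[], int]  # returns one die result per call


@dataclass
class Character:
    skill_bonus: int
    traits: Set[str] = field(default_factory=set)


def resolve_skill_check(character: Character, difficulty: int, roll: Roller) -> bool:
    # The "lucky" trait reaches into the core resolution loop: one extra attempt.
    attempts = 2 if "lucky" in character.traits else 1
    for _ in range(attempts):
        if roll() + character.skill_bonus >= difficulty:
            return True
    return False


def scripted_dice(values: Iterable[int]) -> Roller:
    """Deterministic dice so the example behaves the same on every run."""
    results = iter(values)
    return lambda: next(results)


plain = Character(skill_bonus=2)
lucky = Character(skill_bonus=2, traits={"lucky"})

plain_result = resolve_skill_check(plain, 15, scripted_dice([3, 17]))
lucky_result = resolve_skill_check(lucky, 15, scripted_dice([3, 17]))
```

One trait costs one extra line in the resolution function. The problem is that every further rule-changer wants its own line in that same function, and they start caring about each other’s order. A terser language shortens each line, but it doesn’t remove the interactions.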

I think that both authors are missing a key point: namely, that if you can simplify your application or problem, you should do so. Wyvern should look more to Magic: The Gathering’s rules than D&D. (Granted, Magic has its own mind-benders, but I think that it’s fundamentally simpler than even the newer versions of D&D. Since Magic was designed by a math professor, that’s what I would expect.) Sometimes it’s not possible to simplify a problem any further, but in this case Wyvern’s complexity appears to be a self-inflicted wound. And, finally, I am still in agreement that code size reduction is a wonderful thing, but I am less enthused about the opinions rendered in the language wars…

Join the Conversation

2 Comments

Jeremy says:

December 26, 2007 at 8:37 pm

Yeah, I completely agree with your sentiment that things should be as simple as possible and no simpler, and that some things are just complex by nature.

Good design is a real trick – more an art than a science. Dividing up functionality into modules and arranging APIs to fulfill needs and make sense requires not only clearheaded thinking but sometimes the ability to see into the future.

Suggested practices for good design try to solve problems of human nature: people would rather code than plan ahead, and people don’t like changing things they’ve already written. Traditional OO design requires planning up front in exchange for less rework later, while agile programming lets you do without the planning if you are willing to bravely rewrite huge swaths of code later.

Size isn’t always the dividing line between maintainable and non-maintainable. I’ve used large off-the-shelf products that are easy to reuse with clear APIs and ones that were hideous nightmares. A little design talent goes a long way I think.

Erik Novales says:

December 28, 2007 at 10:05 pm

I think one common failing in a lot of APIs is the failure of the designer to actually think about how it is going to be used — shocking, I know, but unfortunately way too common. One symptom of this is the “kitchen sink” API (in which the user is left to figure out which pieces go where in which order, at their peril). This is why I guess I associate massive APIs with poor design.

I think a couple of examples of great API design are the APIs for the Miles Sound System and Bink movie player. Miles provides a “quick” API, which is extremely simple and powerful enough for many games, as well as a more complicated API for games with demanding sound requirements. The quick API is a good tradeoff between ease-of-use and power, but it doesn’t limit the power of the library as a whole.

And as far as Bink goes, you can get basic functionality working in something like 10 lines of code, and doing more specialized stuff with movie playback isn’t much more difficult. For a library that is doing a lot of stuff under the hood, it’s amazingly easy to use.
