Is Size Code's Worst Enemy?

Oct 29 2011

The Drupal codebase I work on is now over a million lines of code (excluding whitespace and comments). That sounds impressive. In reality, the combination of so much code and the Drupal way of doing things makes it not impressive but a maintenance nightmare. Nobody on the current team knows what all of this code does or what it is for. Even limiting things to the custom modules, no member of the team knows the code well anymore. This, of course, isn't a criticism of the team or even of the platform, but a reflection of what happens when a codebase balloons over the years.

Reading Steve Yegge's post "Code's Worst Enemy" drove home the concern I have with our code -- and with Drupal in general. (Update 10/29/2011: Steve Yegge is the guy who accidentally posted the Amazon/Services rant on Google+, and who unintentionally "quit his job" in the middle of his presentation at OSCON.)

I suggest reading the entire blog post on its own, but here are several salient details that need explicit mention, and that have a Drupal context:

  • While some languages (Java) may exacerbate the problem, ballooning code can clearly happen in any language. And with a semi-opaque execution sequence (as we have in Drupal), the problem is compounded by the fact that one cannot determine at a glance what code might run on a given request. To know that, you must know not just core and your own modules, but all of the installed modules.
  • Design Patterns might deserve a measure of skepticism. Steve's point is that relying upon them can introduce needless complexity. He uses Dependency Injection as an example. Too often, design patterns are introduced for their own sake or because they look similar to what we want to accomplish. But then the need to (re-)architect in terms of the pattern sometimes overshadows the original goal of accomplishing a task.
  • Copy-and-Paste (CAP) code is bad. Obviously. But because all of Drupal is a public API, I often see developers choosing to CAP code from function body to function body because they think that is more elegant than providing highly-contextual stand-alone functions that might be mistaken by other developers as "generally useful". (No, prefixing functions with underscores is NOT a good alternative. Lately, I've been encouraging developers to underscore all functions that aren't hooks or constructed callbacks because it's too easy to get hook/namespace collisions otherwise.)
  • Unfortunately, Steve doesn't talk about YAGNI ("You Ain't Gonna Need It") as a good design principle, but the converse of YAGNI -- the tendency to try to solve all possible cases before there are any actual cases -- is a dangerous tendency in software development that must be countered in the name of simplicity and maintainability. (This post was written in July, 2011.)
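
To make the first and third points above a little more concrete, here is a toy model of name-based hook dispatch. It is a sketch in Python rather than Drupal's PHP, and it is not Drupal's actual implementation: the module names (`blog`, `blog_stats`), the hook names, and the `register` and `invoke_all` helpers are all hypothetical. The only thing it assumes about Drupal is the convention that a hook implementation is a function named after the module plus the hook.

```python
from typing import Callable, Dict, List

# All functions share one flat, global namespace (a stand-in for PHP's
# global function table). Every name below is hypothetical.
FUNCTIONS: Dict[str, Callable[[], str]] = {}

# Modules that are "installed" on this imaginary site.
ENABLED_MODULES: List[str] = ["blog", "blog_stats"]


def register(func: Callable[[], str]) -> Callable[[], str]:
    """Add a function to the global table under its own name."""
    FUNCTIONS[func.__name__] = func
    return func


def invoke_all(hook: str) -> List[str]:
    """Call every function named '<module>_<hook>' for each enabled module."""
    results = []
    for module in ENABLED_MODULES:
        func = FUNCTIONS.get(f"{module}_{hook}")
        if func is not None:
            results.append(func())
    return results


@register
def blog_cron() -> str:
    # Intended hook implementation: module "blog", hook "cron".
    return "blog: purge old drafts"


@register
def blog_stats_cron() -> str:
    # Intended hook implementation: module "blog_stats", hook "cron".
    return "blog_stats: aggregate hits"


@register
def blog_stats_update() -> str:
    # Meant as a private helper of "blog_stats", but its name also reads
    # as module "blog" implementing a hook called "stats_update".
    return "blog_stats: internal helper ran"


@register
def _blog_stats_rebuild() -> str:
    # The leading underscore means no '<module>_<hook>' lookup can match it.
    return "blog_stats: internal rebuild ran"


if __name__ == "__main__":
    print(invoke_all("cron"))           # both modules respond
    print(invoke_all("stats_update"))   # the helper is called by accident
    print(invoke_all("stats_rebuild"))  # nothing matches the underscored helper
```

In this sketch, nothing in the body of `invoke_all` tells you which functions will run; that depends on which modules are enabled and on what every function in the system happens to be called. The accidental match on `blog_stats_update` is the kind of hook/namespace collision that the underscore prefix is meant to rule out.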