Writing Maintainable Code

Writing maintainable code is hard. It must be understandable, testable and readable. Any one of these can be tricky, and together they seem pretty daunting. Thankfully, Michael Williamson makes it look easy to become a code craftsman.

It’s an inescapable fact that writing maintainable code is hard. Most projects end up with code that nobody dares touch, caused by a lack of understanding and a fear of error. Before we think about how to avoid such a mess, we have to consider what we mean by maintainable code.

Before we can change any code, we have to understand what the code currently does, and whether it’s supposed to do that. While there are many ways to achieve this, I think one of the best solutions is a comprehensive suite of automated tests. A good suite of tests can serve as documentation, telling you how the code is supposed to behave, as well as making sure that the code actually has that behaviour. Even better, they can give you the confidence that your code still works after you’ve made your changes. Unfortunately, some code is impossible to test, so the first question I’ll try to answer is: how do I write testable code?

Even if we have a suite of tests, changing code is difficult if we have no idea how it works. Therefore, the second question is: how do I write clean, understandable code?

How do I write testable code?

If you ask people, “How do you write untestable code?”, they’ll often respond with answers like “Long, complicated methods” or “No clean separation of concerns”. While these are things to avoid (and we’ll come back to them), they make code hard to test, but not impossible to test. Code tends to be untestable for two main reasons:

  • Global, mutable state. In other words, a variable that any code can access, and any code can modify.
  • Constructing, or otherwise acquiring, services, such as database connections, through static methods (a constructor is effectively a static method).

Global variables are fine if it’s immutable data. If it’s mutable, then it becomes impossible to consistently set up an identical environment before each test. For instance, say a method uses an incrementing integer to assign unique IDs. Each time we run the method, the assigned ID will be different, making it impossible to test the method with particular IDs.

The solution to the second point is dependency injection. A class should ask for its dependencies through its constructor rather than acquiring them itself. This allows a test to pass in mocked versions of those dependencies. For instance, let’s say we have an online booking system for a tattoo parlour. If a user under 18 requests an appointment, we refuse to give them a tattoo. If they’re 18 or over, we give them the first available appointment. The code might look something like this:

There’s no way to test this class in a unit test – no matter what we do, it will construct an AppointmentBooker, which will connect to the database. We don’t want to have to set up an entire database just to test the logic in this method. Instead, we should ask for an AppointmentBooker in the constructor.

Now we can pass in a mocked instance of IAppointmentBooker in our test. For instance, just taking the case of the user being underage:

There are some frameworks, such as Google Guice for Java or Ninject for C#, that can automate some parts of dependency injection, saving you from death by a thousand news.

How do I write clean, understandable code?

Although there are dozens of useful rules and principles in writing clean code, I think most can be reduced to one of these three:

  • Optimise for the reader of the code, not the writer. Code is read more than it’s written. If you optimise for the writer, you’ll save a few key presses, but the cost to the reader is confusion, frustration, and subtle (and not-so-subtle) bugs.
  • Don’t repeat yourself (often abbreviated to DRY). It’s easier to maintain code if it only exists in one place, and this also ensures consistency. If code is duplicated, there’s a good chance that you’ll forget to update one of the copies, meaning bugs you fix in one copy will still be there in the other copy.
  • Do the simplest, smallest thing you can do to add some value. If you try and guess how you should try and modify the system for all future requirements, you’re going to get it wrong. Either you’ll end up rewriting the code you never used, or you’ll be stuck with code that sort-of does what you actually want it to do.

Optimising for the reader

The computer doesn’t care how you write your code, so long as it’s unambiguous. In other words, write code for humans, not machines. There are plenty of principles you could use, this is just a smattering:

  • Spend some time carefully thinking about the names for classes and methods. If you find all of your class names end in Helper, then you might need to spend a bit more time thinking. Thinking up descriptive names is hard, but being unable to think up a good name is sometimes a sign that your class or method does too much, and actually needs splitting up.
  • Don’t use abbreviations. For instance, instead of calling a variable dbc, call it databaseConnection. The exception is when the abbreviation is well-known, for instance, using html as an abbreviation is fine.
  • Good code should be unsurprising. If you find some code that is surprising or “clever”, try to see if you can simplify it, or somehow make it clearer.
  • Each method should operate at one level of abstraction. A method that deals with high-level concepts in your domain shouldn’t be doing complicated string manipulation as well. One style of writing code is to make it read like a newspaper article. From reading that method, you get just enough detail to see how it works. If you want more detail on how it works, you can dive into one of the methods being called. Just like a newspaper article, you should be able to stop reading at any point and have an understanding of the overall picture.

Don’t repeat yourself

If you ever find yourself copy-and-pasting code, then that’s a hint that you should pull out the common functionality of the code.

In some languages, it’s difficult to pull out code that has structural duplication, but operates on different data. However, if you have a language that lets you pass around functions easily, you can pull out that duplication. For instance, say you want to convert a list of strings to a list of integers. You could build a new list of integers, iterate through the list of strings, and add each parsed string to the new list:

Every time you want to convert one list of values to another, the code is identical except for the value conversion. You can pull that duplication out by using map, called Select in C#:

Do the simplest, smallest thing to add value

It’s impossible to guess what code you’re going to need in the future, even if you think you have a good grasp of the requirements. By taking small steps, you learn about the problem while solving it.

You might notice that always taking small, simple steps doesn’t always lead to readable code without duplication. It’s crucial to remember to refactor your code once it’s working. Once you have code that does what you want in the simplest way, you must then make sure it’s clean code that you’d be happy maintaining.

Example

Here’s some code that I reckon could do with a bit of refactoring:

There are clearly some problems with this code:

  • The class is called ArchiveController – what’s in this archive?
  • The variables aren’t named well, for instance, what does ahwpm mean?
  • The logic that reads start-date and end-date from the form is duplicated.
  • What does DateTime on a hat sale represent?
  • There are bits of NHibernate querying and LINQ all on one unreadable line.
  • The method is long, and you feel you have to read the whole thing to begin to understand what’s going on.
  • The method operates at many layers of abstraction – it’s parsing strings, accessing the database, and doing some calculations.

So, after some refactoring:

Although this code reads much better, it’s not perfect by any means. For instance, it doesn’t obey the Single Responsibility Principle [PDF]: it deals with everything from parsing HTTP parameters to querying the database. We’ve already pulled this logic into separate methods, so a next step might be to try and pull these methods into separate classes. Regardless of what change you’re making, you should be extremely cautious about changing any code without any tests around it.

Conclusion

I’ve talked about how to write maintainable code, but is it the case that we always want maintainable code? Are there not occasions where we need to write a quick piece of code that we’re going to throw away immediately? This logic has one major flaw: our inability to predict the future. What begins as throw-away code can quickly become a vital piece of infrastructure, needing maintenance and extension for years. Even if the code is to be thrown away, the benefits of writing maintainable code can pay off much quicker than you might think. Often, a quick piece of code isn’t so quick after all, and you can easily find yourself surrounded by a baffling labyrinth of code in under a day.

Finally, it’s important to have a sense of craftmanship. Through continuous practice we improve the quality of what we write, and form the habit of always writing clean, readable code.

Tags: , , , , ,

  • 39069 views

  • Rate
    [Total: 0    Average: 0/5]
  • Juampa

    Nice article!
    Concise article with nicely exemplified concepts
    Thank you for the good post Michael!

  • Jarrett

    Good start
    I like where you’re going. The clarity and readability is very nice. Just a few notes.

    (1) Is there a reason you’d make the average a nested class and not it’s own stand-alone class? There’s nothing about the summary class that makes it controller-specific.

    (2) Once the average is it’s own class, I’d probably build an extension method to convert from a collection of hat sales to the monthly average object.

  • Anonymous

    REVIEW
    rigid, dogmatic, text-book oriented

  • Anonymous

    Var
    Eliminating the unnecessary and excessive use of ‘var’ would be a nice step toward optimizing for the reader rather than the writer.

  • Anonymous

    Thanks for the Article
    “it’s important to have a sense of craftmanship. Through continuous practice we improve the quality of what we write, and form the habit of always writing clean, readable code.”

    I whole heartedly agree with that last sentence.

    Thumbs up for your article.

  • Anonymous

    Use statement numbers in comments
    I find it best for each line to be prefixed by a comment line which includes the statement number of the code statement.

  • Tim Daly

    Literate programming
    “Optimize for the reader”. When writing code we
    tend to focus on the “how” and when reading code
    we can see the “how”. What we can’t see is the
    “why”. I can see what the code does, I just have
    no idea why it needs to do that. This happens a
    lot with large legacy systems where the original
    authors have left the project.

    Knuth recognized this problem and suggested that
    programs should be written to be read like a novel,that is, they are written to be read by a
    human and only incidentally by the machine.
    In a novel, characters and actions are only
    introduced as needed and the actions all have a
    motivation. Similarly, in a literate program code
    is only introduced as needed and the code needs
    a paragraph explaining the motivation (the “why”)

    Literate software is useful for projects that are
    likely to last longer than the team that wrote
    the software. It is especially important if the
    software is vital to the organization.

  • Jeff Moden

    Without the “Why”, it doesn’t matter “How”
    First, I agree with everything Tim Daly posted above especially the part about the “Why” and the part about Knuth’s idea that code should be written for humans, not machines.

    I’ll likely be accused of being “old school” and I’ll likely be reprimanded by people who still think it takes too much time but the only way you can explain the “Why” of code is with comments in the code. No one should have to read code to figure out why the code is doing something. Since the “developer” writing the code knows the requirements when actually writing the code, that would be the time to add some comments to the code. The excuse of “it takes too long” is just an excuse because it only takes seconds to write down the thought driving the code.

    I am glad, however, to see that folks writing T-SQL aren’t the only ones that have to put up with people who refuse to document the “Why” in code. 😉

  • KMartin

    Coding the Story…
    Michael, thank you for the thoughtful and succinct article..
    @Jeff & @Tim: AMEN!
    @’REVIEW’: re-read definitions for ‘dogma’ & ‘rigid’ <hint: orient yourself toward ‘discipline’>

    There is no ‘old school’… is Shakespeare considered ‘old school’? I am in a position that I and my team must provide technical support for business critical applications, that have been in existence/use for over 10 years. We spend 80% of our time trying to reverse-engineer the ‘why’ & ‘how’ of our code base. I believe that there is no ‘legacy’ code… if I am supporting 10 year old code, then it is “today’s” code. Our customers would greatly benefit from easily maintainable & testable code (sorry for the -able’s)… and my company would directly reap those benefits with less support cases, more & longer product retention from our customers.

    Granted, code that is ‘barely’ maintainable keeps me employed, but it does not keep me happy…

    @Coders: Demand that you get ‘the Story’ before you start writing code!
    if( TheStoryIsComplete = true)
    {
    MyJobIsSweeter = true;
    }
    else
    {
    MaintenanceHell = true;
    }