
Software Engineer @alexdcode

Chunks of development work are like database transactions

Published 7 March 2013 5:47 pm

I noticed something recently: I was using exactly the same logic in two seemingly unrelated aspects of my job. Database transactions and self-contained chunks of engineering work are exactly the same! Bear with me…

First, you probably already know what I mean by a database transaction, but just in case: When you want multiple changes to a database to all become visible to other users of the database at the same time, as if they happened atomically, you use a transaction. The database somehow remembers which changes form the transaction, and either writes them all at once at the commit, or pretends to other queries that they haven’t been written yet until the commit. Either way, the database has to remember everything about your transaction until it’s done, and has to do extra work to check the status of transactions every time other kinds of query are run.
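To make the "not visible until the commit" part concrete, here is a minimal sketch using Python's standard sqlite3 module. The table, the file name and the two connections (one "writer", one "reader") are assumptions made up for the illustration, and other databases implement this isolation differently.

```python
import os
import sqlite3
import tempfile


def count_rows(conn):
    # Read the whole result so the reader does not keep a lock open.
    cursor = conn.execute("SELECT COUNT(*) FROM accounts")
    rows = cursor.fetchall()
    cursor.close()
    return rows[0][0]


def demo_visibility():
    """Return the row counts a second connection sees before and after a commit."""
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "demo.db")  # hypothetical file name
        writer = sqlite3.connect(path)
        reader = sqlite3.connect(path)
        try:
            writer.execute("CREATE TABLE accounts (name TEXT, balance INTEGER)")
            writer.commit()

            # sqlite3 opens a transaction implicitly before the first INSERT,
            # so both rows belong to one uncommitted transaction.
            writer.execute("INSERT INTO accounts VALUES ('alice', 90)")
            writer.execute("INSERT INTO accounts VALUES ('bob', 110)")
            seen_before_commit = count_rows(reader)

            # Both rows become visible to the reader together.
            writer.commit()
            seen_after_commit = count_rows(reader)
        finally:
            writer.close()
            reader.close()
    return seen_before_commit, seen_after_commit
```

In this sketch the reader should count zero rows before the commit and two afterwards: it never sees just one of the two inserts.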

So, self-contained chunks of development work: What I mean is a piece of work which makes the system better in a way all end users can see. Implementing a feature, fixing a bug, and improving performance in one situation are each a chunk of work. Importantly, doing backend preparation for a later feature is not. Nor is making an experimental feature that only some users can see. During the course of work on one of these chunks, you need to use techniques that keep the unfinished code away from existing users. You could do this by just not releasing until you’re done, or using a flag so it’s only turned on for testers, or using a feature branch. Either way, the code that’s being tested by the most representative sample of the users (all of them) diverges more and more from the latest code the longer you take to write a feature.
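The "flag so it’s only turned on for testers" option might look something like the sketch below. The flag name, the `User` type and the `checkout` function are all hypothetical, invented for the illustration; real systems usually keep flags in configuration rather than in code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    name: str
    is_tester: bool = False


# Hypothetical set of unfinished features that only testers may see.
FLAGS_ON_FOR_TESTERS = {"new_checkout"}


def is_enabled(flag, user):
    # Unfinished work stays hidden from everyone who is not a tester.
    return flag in FLAGS_ON_FOR_TESTERS and user.is_tester


def checkout(user):
    # Both code paths live in the codebase until the chunk of work is done.
    if is_enabled("new_checkout", user):
        return "new checkout flow"
    return "existing checkout flow"
```

Note that while the flag exists, every change near `checkout` has to consider both branches, which is exactly the kind of "open transaction" cost described here.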

Both database transactions and chunks of development have fixed costs. Each time, you need to get into the right context, and at the end of each you need to be vigilant that everything is in a consistent state and released to the world. When making a lot of small writes to a database, it’s a good idea to group them into transactions just to share those fixed costs, even if you don’t need the atomicity guarantee. Likewise, it would be wasteful to go through an expensive release process with every small bug you fix (although I hope your release process isn’t expensive).
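As a rough sketch of sharing those fixed costs, the function below groups many small writes into one transaction per batch, again with Python's sqlite3 module. The `events` table and the `batch_size` parameter are assumptions for the example, not a recommendation for any particular database.

```python
import sqlite3


def insert_rows(conn, values, batch_size):
    """Insert values, committing once per batch; return the number of commits."""
    commits = 0
    for start in range(0, len(values), batch_size):
        batch = values[start:start + batch_size]
        # All rows in the batch share one implicit transaction...
        conn.executemany(
            "INSERT INTO events (value) VALUES (?)",
            [(value,) for value in batch],
        )
        # ...so the fixed cost of a commit is paid once per batch, not per row.
        conn.commit()
        commits += 1
    return commits
```

A `batch_size` of 1 pays the commit cost on every row, while a huge `batch_size` turns the whole load into one long transaction; the interesting values are somewhere in between.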

However, both database transactions and chunks of development work have costs that increase the larger they are. And they don’t just increase linearly. A long database transaction imposes a growing cost on every other query being processed, and the longer it lasts, the more queries it affects. Similarly, code that diverges from production imposes super-linear costs: for example, the cost of fixing a bug grows the longer ago it was introduced. Other development work has to deal with both possible code paths. Bugs have to be fixed in two places. Feature branches get merge conflicts.

Now that we know about these costs, can we think up good tactics to reduce them? Of course! I’m sure your database is already well tuned to use transactions that are not too big, and not too small. A happy medium. The exact length of a good transaction varies depending on the system involved, but somewhere in the tens of milliseconds sounds about right to me. Too many 10-second transactions will have a serious effect on the health of the database, but you can achieve much higher write throughput if you don’t do every tiny row update in its own transaction.

The same applies to chunks of development work, but from what I’ve seen of the industry, we find it really hard to tune our development tactics in the same way. If you’re not planning to get a feature to your users for three months, these costs are going to bite you. A good scrum team with a two-week cadence is going to do much better, as long as they’re disciplined about choosing goals that are genuine user stories rather than preparation work. I prefer chunks of a day or two whenever possible, but it depends on how much a release costs for you.

Luckily, it’s pretty easy to break a big chunk of development work up into smaller user-visible pieces, and hopefully thinking about the analogy with database transactions will help explain the motivation for it.

 
