Could it be that, if you adopt Test Driven Development, Continuous Integration and Continuous Deployment, then much of the bureaucracy of team-based software development becomes redundant? It is an intriguing idea which has led to a creative experiment.
A short while ago, a customer hit a problem with a website that we'd helped develop. After five minutes of debugging we'd found the problem, an obvious mistake on our part and a simple one-line fix. With some basic testing, we were confident the fix was correct and would have no "knock-on" effect. However, it took a full week's work to get our fix through the various branching, building and testing processes that deemed it ready to deploy to the customer.
As the days passed without the customer getting the problem resolved, it was difficult not to succumb to a sense of mounting frustration. Of course, there are good and valid reasons why all these processes exist, and yet we seemed to have lost a sense of balance. Could we not just push the fix to the website, now?!
In the same week, a request came in for an update to our internal wiki. In less than 30 seconds, we'd made the change and it was live. We began to wonder if there was a good reason why, with the right testing regime in place, making a change to one of our applications shouldn't be as easy as updating a wiki.
The Balance Between Process and Bureaucracy
All good software houses have processes in place that ensure a certain level of quality and reliability in the software they deliver. We all understand the need for these processes and why pushing random changes straight to production is unacceptable. However, even in the best companies, it's inevitable that a certain level of bureaucracy begins to weigh them down.
Imagine you want to deploy a change to your web application. With traditional Test-Driven Development (TDD), it's reasonable to expect to go through the following steps:
- Download your Version Control System (VCS), say, git
- Check out the repository for your project
- Download and install your IDE, say, Visual Studio
- Open up your project in Visual Studio
- Run your tests to make sure the code is working
- Write the failing test
- Run your tests and see them fail
- Write the production code
- Re-run the tests and continue editing production code until the tests go green
- Wait for the build server to create a deployable package
- Copy the package onto production
- Deploy the package
Many projects are even more complicated. For example, you might need to set up some dependencies before you can successfully build your project. You may need to configure your test environment. You may need to create a separate branch for your quick fix to avoid deploying other, half-complete features.
In the case of our web application, we could have updated the main trunk, in source control, tested it and waited for the next release of the website. However, this wasn't due out for several weeks, and we needed to push the change out faster, which meant creating a separate branch. This in turn meant time burnt wrestling the VCS and build system to create and build a branch from the current, live version of the website, and preparing my environment for deploying it. This was all work in addition to actually writing and testing the fix, the bit about which I cared.
Finally, after what felt like a long week, we could submit a "pull" request so that our CI process would build and deploy to the live site. While the whole process was sound, it seemed we'd lost sight of the original meaning of "Continuous Integration": everybody committing small changes to the same development branch, so all changes to the code are shared quickly.
WikiCode: Wikifying Software Development
The concept of a wiki is that it's incredibly easy for anyone to add new content or change existing content. It doesn't expect changes to be perfect (read, fully tested) before making them visible to others, but there are mechanisms in place to make it very easy to undo bad changes, and rollback to the previous state.
What were the essential differences between the 30-second deployment on the wiki, and the weeklong deployment to our website? How could we make the latter more like the former? During a recent Down Tools week at Red Gate, we began building a development environment based on the concept of wikifying team development, in other words, making it as easy for a team of developers to make changes to an application or website as it is to edit a wiki. We called this development environment WikiCode, and it replaces the previous 12-step deployment process with the following four steps:
- Open up the code editor in your browser
- Write the failing test
- Write the production code
- Hit Save
With WikiCode, all the development is in a browser so there's no need to set up your own build system. There's no need to remember, on coming in in the morning, to pull the changes to your version and then rebuild before doing any updates, and no need, having forgotten to do this, to create extra branches and merge them. It removes much of the "mental baggage" that tends to accompany everyday development.
As soon as a developer saves a failing test, WikiCode adds it to the test suite, which runs and produces the failure (or multiple failures if other tests in the suite also fail). The developer then writes production code and hits save, which reruns the test suite. For as long as the tests fail, a developer's code changes remain in a private working copy. However, as soon as the tests pass, the private working copy becomes the public development copy, and the change is visible to others, and deployed to production.
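The save cycle described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the WikiCode implementation (which we wrote in PHP); the names Workspace, CodeCopy, save_test and save_file are hypothetical, and a test is modelled simply as a function that inspects the files and returns True or False.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Files = Dict[str, str]
# Assumption: a test is a function that inspects the files and returns True on success.
Test = Callable[[Files], bool]


@dataclass
class CodeCopy:
    """One copy of the project: its files plus its test suite."""
    files: Files = field(default_factory=dict)
    tests: List[Test] = field(default_factory=list)

    def passes(self) -> bool:
        # An empty suite passes trivially, so untested code goes live at once.
        return all(test(self.files) for test in self.tests)


class Workspace:
    """Hypothetical model: one public copy, plus a private copy per developer."""

    def __init__(self) -> None:
        self.public = CodeCopy()
        self.private: Dict[str, CodeCopy] = {}

    def _working(self, developer: str) -> CodeCopy:
        # A private working copy starts as a snapshot of the public copy.
        if developer not in self.private:
            self.private[developer] = CodeCopy(
                dict(self.public.files), list(self.public.tests)
            )
        return self.private[developer]

    def save_test(self, developer: str, test: Test) -> bool:
        self._working(developer).tests.append(test)
        return self._promote_if_green(developer)

    def save_file(self, developer: str, path: str, content: str) -> bool:
        self._working(developer).files[path] = content
        return self._promote_if_green(developer)

    def _promote_if_green(self, developer: str) -> bool:
        working = self.private[developer]
        if not working.passes():
            return False  # changes stay private while any test fails
        # All tests pass: the private copy becomes the public copy.
        self.public = working
        del self.private[developer]
        return True
```

Saving a failing test returns False and leaves the public copy untouched; saving production code that turns the suite green returns True and promotes the private copy. The sketch ignores deployment itself and does not handle two developers holding private copies at the same time.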
In short, there is no need to install anything, no need to create branches, no need to run tests manually, no need to deploy manually. In fact, WikiCode actively avoids all of these things.
WikiCode adopts the principles of Test Driven Development (TDD), Continuous Integration (CI) and Continuous Deployment (CD). With strict application of these principles, much of what is possible with WikiCode you can also achieve in a traditional development environment. The big difference is that with WikiCode there's simply no alternative.
WikiCode Simplifies Version Control
In WikiCode, source control consists of only the main development trunk for the project; no branches, no merging. If code is in the main branch, it is live, and visible to others.
In our regular development system, we'd had to create a branch in order to make even a small change, because we had a raft of other changes in the main trunk that weren't yet fully tested or ready to ship. However, if we're in a situation where the only way to get code into the main trunk is if it has passed all tests and is live, then suddenly we've lost the need for branches and have a rather different approach to version control.
You never have to think or worry about the version control system; it doesn't ask you any questions and there's just one way of using it.
WikiCode Enforces TDD
With WikiCode, the only acceptable way to deploy code is to write a failing test first. As soon as your code passes the tests, it is shared with others and deployed to production. If you don't write any tests at all, then WikiCode will immediately deploy the code you write, since the tests will never fail. However, at that point, you'll be deploying "works in progress" alongside completed work with no way to distinguish between the two, and you will very quickly shoot yourself in the foot. With WikiCode, you really do have to adopt TDD.
Likewise, WikiCode uses only automated testing. Like most teams, in our general development environment we run a mix of automated and manual testing. We automate all the tests we can so that our testers can spend as much time as possible finding the nasty, obscure bugs, manually. The drawback with manual testing, of course, is that you test one particular release of a product. When you make another release you have to do all that work again.
That's okay if you release every six months, but not if you want to release a one-line change now. We needed a system where we had enough faith in our suite of automated tests that when a change passed its tests, and all other tests continued to pass, meaning we'd not broken something else by mistake, then we could integrate that change immediately.
If you want to make a code change, you write a test to exhibit the problem, and then you write the fix. Of course, this system won't catch all bugs. Because we're not doing manual testing before releases, some nasty, edge-case bugs, the kind for which it's hard to write automated tests, will get through. However, if your system is this easy to deploy then when you find a bug, you can write a test and deploy a fix extremely quickly. In other words, the cost of such bugs should be drastically reduced compared to when using long release cycles.
WikiCode Enforces CI and CD
WikiCode enforces continuous integration in the traditional sense of the term. Rather than working on your own private feature branch, or just not committing for a few days, WikiCode shares changes, unconditionally, as soon as your tests pass. This helps to avoid integration hell, the misery of having to merge your thousand-line change with somebody else's thousand-line change. Instead, the way to work with WikiCode is by making a series of very small incremental changes that each makes the product better in some small way, with each successful change shared and deployed immediately. WikiCode actively encourages you to learn to write and save code quickly. Ideally, you'll break all the tasks down into units small enough that you can write the tests and the code and click save every few minutes, as opposed to every few hours.
WikiCode Use Cases
WikiCode uses TDD and works best with applications that lack complex, intertwined features. You want to be able to break down feature X, write some tests, done, write some tests for feature Y, done, and so on.
The primary use case that we envision right now is the internal business application, delivered as a web application. I have a vision where all web apps internal to the company have an "edit" button at the top corner of the page. When you click it, you drop into the WikiCode instance. You write a test for the requested change, and then make the change. Click "save", and when the test passes, the change is live.
When the boss comes in and says "Can you make the TPS report headers yellow please?" you'll be able to say "Yes, two seconds," and mean it. That sort of flexibility makes people happy.
WikiCode Development Details
The name WikiCode is indicative of its intent only; we did not build it on a wiki. We implemented WikiCode in PHP. At the front-end, it's just a website, with a built-in open source code editor, called ACE. At the backend, we have a very basic, homemade "versioning" system that stores version history for files.
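To give a feel for how little a "versioning" system of this kind needs, here is a minimal sketch in Python. It is not our PHP code; the class FileHistory and its methods are hypothetical, and it assumes the simplest possible model: every save appends a full copy of the file, and a rollback re-saves the previous revision as a new one, much as a wiki revert does.

```python
from typing import Dict, List


class FileHistory:
    """Hypothetical store that keeps every saved revision of each file, oldest first."""

    def __init__(self) -> None:
        self._revisions: Dict[str, List[str]] = {}

    def save(self, path: str, content: str) -> int:
        """Store a new revision and return its 1-based revision number."""
        revisions = self._revisions.setdefault(path, [])
        revisions.append(content)
        return len(revisions)

    def current(self, path: str) -> str:
        """Return the latest revision; raises KeyError for an unknown file."""
        return self._revisions[path][-1]

    def history(self, path: str) -> List[str]:
        """Return a copy of all revisions of the file, oldest first."""
        return list(self._revisions.get(path, []))

    def rollback(self, path: str) -> str:
        """Undo the latest change by re-saving the previous revision.

        Nothing is deleted, so the bad revision stays in the history, and
        rolling back a second time brings the undone change back.
        """
        revisions = self._revisions.get(path, [])
        if len(revisions) < 2:
            raise ValueError(f"no earlier revision of {path!r} to roll back to")
        revisions.append(revisions[-2])
        return revisions[-1]
```

Keeping rollbacks as ordinary new revisions means there is still only one way to use the system: every change, including an undo, is just another save on the single trunk.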
Why not Git?
Possibly, in the future, we could adapt WikiCode to use Git. However, in the time we had, it was easier to write what we needed from scratch (a few pages of code) than to take Git and meld it into what we wanted to build.
We used a combination of Selenium tests for the end-to-end tests, integration tests and unit tests for the individual modules. One interesting aspect of the development work was making the tests run as fast as possible. Every time a developer saves a change, all the tests in the suite run against their private working copy, so making these tests run fast was a key technical requirement. One thing we did was sort the test suite in order of runtime and make it stop on the first error, so if you have a unit test failing it will fail quickly. Most of the time a quick-running unit test will exhibit a failure in less than a second, and the slower integration tests will only be run when it is likely that all the tests will pass and we'll be making a release. Waiting 30 seconds to save is a bit long, but 30 seconds to make a release is amazingly short.
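The "fastest first, stop on first failure" ordering is easy to sketch. The Python below is an illustration rather than our PHP code: the function name run_fastest_first is hypothetical, each test is assumed to be a zero-argument function returning True or False, and the runtimes are assumed to come from each test's previous run. Tests with no recorded runtime are treated as instant and so run first, which is a choice made for this sketch.

```python
import time
from typing import Callable, Dict, Optional


def run_fastest_first(
    tests: Dict[str, Callable[[], bool]],
    last_runtime: Dict[str, float],
) -> Optional[str]:
    """Run tests in ascending order of their previous runtime.

    Returns the name of the first failing test, or None if every test passed.
    last_runtime is updated in place with the time taken by each test that ran.
    """
    ordered = sorted(tests, key=lambda name: last_runtime.get(name, 0.0))
    for name in ordered:
        start = time.perf_counter()
        passed = tests[name]()
        last_runtime[name] = time.perf_counter() - start
        if not passed:
            return name  # stop here: slower tests are never started
    return None
```

With this ordering, a failing unit test ends the run before any slow end-to-end test starts, so the full cost of the suite is only paid on saves that are about to become a release.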
Of course, we used WikiCode to build WikiCode! When we built it, we had it self-hosted, so every time you made a change it would ship to itself. In other words, whenever we made a change to WikiCode, using WikiCode, and the tests passed, the version of WikiCode we were using would change under our feet, as it deployed the new version. This meant that it would have been quite possible to break it in a way where we could no longer edit it ourselves, which would mean we'd have to do a whole bunch of manual work to back out of that. Fortunately, we never broke it that badly and it gave us confidence that this crazy development scheme really might work!
The next area we'd like to explore is harnessing cloud-computing power with WikiCode. If you are building a new internal app, you're likely going to do it as a web app, and you're likely going to host it in the Cloud. I think that combination of hosting on the Cloud, and WikiCode for the version control, editing environment and CI server could work very well, for a number of reasons.
Firstly, we'd like to integrate WikiCode with some of the Cloud providers, so that users of WikiCode could create a new project, choose their technology stack, and get a deployed, unit and integration tested sample project within minutes, rather than days, or even weeks.
Secondly, the parallel nature of cloud computing could increase test speed dramatically. Right now, WikiCode is best suited for projects whose entire test suite could run in less than a minute or two. For projects that need to provide functionality that is more complex, you might consider breaking them down into a series of smaller projects, where each one doesn't need to test comprehensively the functionality provided by the projects on which it depends. However, in larger projects where this is not possible or desirable and running a full test suite could take several hours, you need to parallelize the tests and the answer there is simply to buy as much computing power as you need from Amazon, or another provider.
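As a rough illustration of the idea, the sketch below runs independent tests concurrently using Python's standard concurrent.futures module. It is not something WikiCode does today: the function name run_in_parallel is hypothetical, and a local thread pool merely stands in for a fleet of rented cloud machines. It also assumes the tests do not share state, which is what makes running them side by side safe.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def run_in_parallel(
    tests: Dict[str, Callable[[], bool]],
    workers: int = 4,
) -> List[str]:
    """Run every test concurrently and return the sorted names of those that failed.

    Assumption: each test is an independent zero-argument function returning
    True on success. An exception raised inside a test propagates to the caller.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {name: pool.submit(test) for name, test in tests.items()}
        return sorted(
            name for name, future in futures.items() if not future.result()
        )
```

With enough workers, the wall-clock time of the suite approaches that of its slowest single test rather than the sum of all of them, which is why buying more computing power helps.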
WikiCode works on the principle of making many small, gradual improvements to the system but it accepts that, occasionally, a release will 'degrade' the overall quality of the product. We'd prefer to take ten steps forward and one back than to take a single, bureaucratic step forward. It works just like Wikipedia: don't stand in the way of people making changes, just make it simple to find and correct any mistakes. You detect the problem quickly, write a test, write a fix, and so ensure it can't happen again.
If we had our way, Red Gate would start using WikiCode, or something like it, to develop all of its new, internal web apps. When people request a small change, you're now in a situation where you can do that, almost immediately. And why not? It's a massive morale boost to know when you come into work you will, by the end of the day, have deployed some positive changes to an application.
As software developers, the only truly valuable thing we can do is write code and ship it to users. When there are too many steps between these two activities, we should take a cold, hard look at all the ones that aren't writing code, ask if they are truly necessary and, if not, find a way to remove them.
The authors would like to thank Sam Blackburn and Chris Hurley, the other two developers on the WikiCode project.