So you’ve got your database under source control, and now you want to start running automatic tests and builds with it. Welcome to the world of continuous integration. Rob Chipperfield guides us through the fundamentals of continuous integration servers, with advice on what to think about when choosing one, and insights into how your database and application structures can affect your choice.
If you’re just getting started in the world of Continuous Integration (CI) – or automated builds, perhaps – then choosing the right build server is an important decision. While you can move between different build servers later, it’s likely to add extra work that you don’t want later in a project, so a little research up front will pay off.
Of course, it’s equally important that you understand what you’re getting into if you’re hoping to integrate your project or SQL database into a larger CI process. Knowing why your organisation is using the current toolset will shed light on what you’re currently able to achieve and, while this might not save you the work of having to change products (depending on what you’re hoping to achieve), you’ll definitely be making informed decisions.
What we’re going to cover in this article applies equally well to SQL databases as application code because – ultimately – they should both be “just code” in your source control system. Heck, even your environments and infrastructure should be just code in your source control system, but I’ll come to that later. My point here is that, until you’ve got your database under source control, you won’t reap the full benefits of continuous integration. With that caveat out in the open, let’s get going!
Before we start diving into the material, bear in mind that this is not going to cover a blow-by-blow feature comparison, as there are plenty of more comprehensive resources out there for that. This is an overview of the fundamentals – an introduction to the underlying concepts and questions you’ll need to answer as you decide how best to continuously integrate your application and database code. I’ve pulled together some great resources at the end of the article for further reading, and once you know what you’re looking for you’ll find many more.
If you’re at a stage where you’re hoping to bring your SQL database into your CI pipeline, I’d strongly suggest you talk to your development teams or build master (if you have one), to learn more about how they’ve answered some of the questions we’ll cover below.
Firstly, consider what platforms your builds are likely to run on. This may seem obvious, but try and consider future requirements as well as the immediate ones. Most of the major CI servers will run on Windows, Linux and OSX, but their levels of support can vary. If you’re building primarily .NET-based applications, something which treats Windows as a second-class citizen will cause you more pain if your build processes become more demanding later.
On-premise vs. Hosted
Another major decision is whether to go for an on-premise solution, where you run the infrastructure yourself, or buy a hosted solution, where your builds run in the cloud on infrastructure managed by a 3rd party provider.
Your choice here will be dictated largely by the platforms you’re building for – for example, purely thinking about the application tier - many hosted CI services use Linux-based build agents, which are great for Node.js, Ruby or Java applications, but aren’t suitable for .NET. At the other end of the application spectrum, AppHarbor is dedicated to hosted .NET builds, but not other platforms, so if you have a heterogeneous development environment, you may struggle to find one hosted service that meets all your needs. Naturally, these considerations are different when it comes to thinking about how you’re going to test and release your database.
You should also check how much control you need over the build environment – does your application depend on third party products which must be installed on the machine building it, or can you bundle the relevant dependencies with your source code?
Most CI environments start small – perhaps a couple of builds – and tend to grow over time as you move more of your building, testing and deployment into it. Other teams may also see the benefits and start moving more projects into it (as you may be doing) and, before long, you may find you have hundreds of builds running.
This will obviously require more hardware, but watch out for licensing costs as well! Many CI packages are priced per build “agent” – effectively, the maximum number of builds you can be running concurrently.
If you don’t have enough agents, builds will be queued, delaying how quickly you get the results, and lengthening the feedback cycle between committing code to source control and seeing the outcome of a build and, potentially, automated tests. As with most things this boils down to a trade-off of cost against how rapidly you want your build cycles to complete, but it’s worth bearing in mind that faster build cycles will ultimately lead to better code. If you’re starting down this route, it’s worth having a realistic idea of the potential scope of your CI projects, and how rapidly you’d like to get feedback on your code.
Structuring your builds
Once you’ve chosen a CI server, consider how you’re going to structure your builds: do you want to build your entire application in one monolithic step, or each component independently? The former may be quicker to set up, but the latter can result in faster individual builds, thus speeding up the feedback loop if you have well-architected components.
If you have automated tests, you can either run these as part of the main build, or split them into their own builds which run asynchronously. This depends on whether you need to ensure the results of a build pass all tests before being declared a successful build, or whether allowing team members to get their hands on the output is more valuable. A hybrid approach can be useful here – run a small number of fast tests as part of the build, with a more extensive set running later (perhaps overnight) without holding up development in the meantime.
The source control system you use, or are planning to use, will also affect your build setup. Subversion, Vault or Source Safe tend to encourage most development on a single branch, with an associated build; more advanced systems like Git or Mercurial allow a lot more flexibility, with excellent support for development of features and fixes on separate branches, and a good CI server will be able to build these branches automatically as well without substantial extra setup, giving you confidence that your work will integrate well with the main branch when it’s merged back.
Getting an existing application ready for Continuous Integration
If you’re starting a new project, getting it running on your CI server should be one of your first tasks but, in reality, most of the time you’ll be setting up an existing application on your new CI server.
If your current build process involves any manual interaction, you’ll absolutely need to make that automatable as fast as possible. You should be aiming for a one-click build on your desktop machine before moving it to the build server. Installers are a common pain-point here but, more generally, remember that if you currently have to click through a wizard to build your app, you won’t be sitting in front of your build agent’s console when it builds on the CI server. Automate everything you reasonably can.
The next challenge is to tease out the build dependencies: does your build assume certain paths on disk, or applications installed? Clearly, the fewer hard-coded paths and assumptions are baked into your application, the more reliable your build will be – not only on the build server, but on colleagues’ machines which may not be set up the same as yours.
Most CI servers provide a way of passing variables into your build; you can use these to specify environment information without hard-coding it into your application’s build script, making it more maintainable.
One word of caution – CI servers all provide a rich set of functionality to help manage your build process, and a lot of this can be very helpful in getting it up and running quickly and effectively. However, the more you use, the more tightly your build will be tied to that product, and the harder it will be to move to a different solution in the future. You also risk the build on the build server diverging from developers’ local builds, which can make debugging unnecessarily challenging.
Interview with the Database Delivery team
Here at Red Gate we have a team working on bringing continuous integration and continuous delivery to SQL Server databases. As part of that process, they’ve started getting their hands dirty with several of the most well-known CI Servers, so I asked a few of them to tell me about their experiences. Ruchika Raj, David Conlin & Thomas Hodgson pooled their notes and sent me this:
“First of all, bear in mind that we were primarily looking at how well we could integrate SQL Server into the CI and deployment process, so this isn’t going to be a rigorous breakdown! We also don’t want to play favourites. We’ve each had different experiences with some of the tools, and we all have our own preferences. However, a lot of those preferences boil down to personal taste and how we like to work.
If you’re choosing a CI Server for yourself – particularly if you intend to bring your SQL database into the pipeline - you’re definitely better off thinking about your own needs, and investigating feature matrices and more targeted tear-downs to work out what’s going to fit. That said, we’re more than happy to give you a quick taste of what we found when we started looking into the available options for SQL CI, looking at similar factors in each case.
Team Foundation Server (TFS) –it’s Microsoft’s own CI server, so naturally it plays nicely with .NET. It’s also pretty scriptable and flexible. For example, TFS calls msbuild to build Microsoft projects, but its flexible in that the workflow is customisable, and you can download and implement custom arbitrary workflow tasks. If you’re looking at getting your SQL database into your CI pipeline, we supply sample build scripts and provide a .PROJ file to build databases from msbuild. In terms of the bigger development / deployment picture, TFS supports git and its own TFS version control, has an inbuilt bug tracking system, and links to Release Management - Microsoft’s deployment tooling - although that last step is a new one for Microsoft.
TeamCity – this tool from JetBrains actually has some similarities to Jenkins and Atlassian’s Bamboo – although perhaps it would be more accurate to say that they are all equally good tools with relatively convergent feature-sets. TeamCity has a very mature plugin framework, and Red Gate actually already have a plugin that allows you to integration your SQL Server into your CI process – hurrah! It has a pretty broad support for various languages and development frameworks, as well as a thriving community (plus a paid support option if you need it). As a bonus, there’s also a free edition of TeamCity, if you only need a small number of build agents.
Jenkins- an Open Source option, which may be a point for or against it, depending on your perspective. It also has an extensive plugin ecosystem, although you can get an awful lot done using just the command line. From the perspective of SQL CI, at the moment the only option is to call the build and test steps via said command line, but that can be scripted. Jenkins has a pretty sizeable community around it, which means that there’s plenty of documentation and tutorials, but it also means that you’ll be relying on community support if something goes wrong – no paid options here.
Bamboo – Atlassian’s tool is actually similar to Jenkins from our perspective, in that it’s possible to set up SQL CI via the command line, although there is a pretty solid plugin ecosystem which you would do well to check out. What makes Bamboo particularly interesting is that it has its own built-in release management tools, and obviously integrates tightly with JIRA bug-tracking software and the rest of the Atlassian development suite.
These are just some of the more common options out there, and they’re the ones that we’ve looked at from the perspective of SQL database continuous integration – rather than worry about which tool to go with, we’d definitely suggest you think about your plans and needs, and then start Googling once you know what you’re looking for!
Getting the most out of your CI Server
Getting your application compiling and database building automatically is just the first (already incredibly valuable) step – albeit one that may well take a lot of work – the more time you invest in continuous integration, the more value you can get from it.
Testing is essential
If you have an automated test suite, run it as part of your build (and, obviously, your suite should cover the whole stack.) Most CI servers include reporting on the number of tests that pass and fail, and can help link test failures to source code changes in order to more quickly identify issues.
By running tests on a regular, repeatable basis, you can start tracking other metrics as well – for example, is your test coverage improving over time? One team at Red Gate actually fails builds if new code causes the test coverage to drop compared to the previous build – a great example of using monitoring to encourage the development behaviours your team is aiming for!
Of course, you should be selective and realistic about which metrics you track and why. Simply tracking test coverage for the sake of it may well lead to behaviour and results that fulfil the metric over what you actually want, which is high quality maintainable code to drive a releasable product.
Maximise your Source Control
Your source control system is not just a place to store your code when you’re not working on it – it’s a way for your teams to communicate, and to make sure your application and databases are always ready to release. Continuous integration can work hand-in-hand with your source control system, allowing you to better work on multiple independent features or fixes at the same time. More importantly, it can also minimise the risk of bugs making it into production by allowing you to build and test continuously, on a known environment, away from the variability that comes with building on a developer’s desktop machine.
That all said, continuous integration is only the beginning – by extending your CI solution further into deployment, you can push new builds that pass your rigorous test suite right out to customers with a single click once you’ve signed-off the release – a practice commonly referred to as Continuous Delivery.
Continuous Delivery may sound challenging, and indeed it is – if you try and leapfrog over all the preparation, automation, and test orchestration that needs to first go into Continuous Integration. If you’re looking at these ideas from the perspective of manually intensive build, test and release processes, then of course Continuous Delivery looks hard, painful and risky. From where you are now, it is. However, once you’ve got your application and database code continuously integrating, the next steps are surprisingly easy to take.
By automating the whole process it becomes faster, more repeatable, and less risk or error-prone. If frees up everyone on the team to do more of the things that humans do better than machines – like solve problems, design database schemas and architect applications - allowing you to deliver better quality software to your customers and get all the value your team is generating out of source control and into production as fast as possible, with as little risk as possible.
But first, you need to set up the right CI server so, now that we’ve glanced over the fundamentals, take a look at these extra resources to dive deeper, and start planning!
- Comparison of CI Servers – StackOverflow
- Comparison of Continuous Integration Software - Wikipedia
- Database Continuous Integration with Bamboo – Richard Morris & Sarah Goff Dupont
- Database Build and Release with Jenkins – Ernest Hwang
- Continuous Integration for SQL Server using Red Gate Tools, Atlassian Bamboo, and Github –Ben Rees
- Working with Continuous Integration in a BI Environment using Red Gate Tools and TFS – Grahaeme Ross