My last article detailed the first, tentative steps we’d taken at Farm Credit Services of America (FCSA), in order to introduce DevOps workflows into our organization. I described the need to define a common DevOps goal, which in our case was as follows:
“What needs to change in our processes in order to be able to deploy to pre-production every 45 minutes?”
With a goal in hand, we needed a strategy to achieve it, and then to break down this strategy into a series of manageable tasks that involved all of the teams.
It’s very tempting to strike fast while enthusiasm is high, to restructure teams, dissolve barriers, dive into accomplishing your strategy. The reality in most organizations is that progress comes in smaller steps. It’s slower, more mundane, but necessary.
This article describes the next, modest phase of our DevOps program at FCSA. It involved setting up a DevOps workgroup and holding numerous discussions that attempted to refine the teams’ initial ideas into a concrete set of short- and longer-term goals.
During these early stages, be prepared for goals about which you’re passionate, such as being able to deploy every 45 minutes, to be challenged by others, and perhaps to change as a result. That is perfectly okay. The end goal is really to get the conversations started, then sustain them and keep the project moving forward. And boy, did they start.
Why a new Workgroup instead of a Reorg?
My previous article spoke of our desire to form a new “DevOps” workgroup out of the existing Octopus Deploy workgroup. The latter formed as a natural consequence of our earlier adoption of Visual Studio Team Services (VSTS) as our build system and Octopus Deploy as the deployment system, but we now had a new, broader focus.
In our research into how DevOps is adopted at other companies, we found that the typical approach was often a little more drastic, and involved dissolving the separate operations and application development teams and forming new teams that combined operations staff and developers.
That is a great approach, but is not feasible at this time at Farm Credit Services of America. We have over ten development teams but only five web admins and five DBAs. United States farmers (our customers) are facing some tough times in 2017. It is a tough sell in this economy to ask to hire ten additional people. Even if we didn’t hire anyone new, a complete team reorganization costs a lot of time and money, and of course, at this stage there is no telling whether or not this DevOps initiative will actually work!
Rather than focus on what we couldn’t do, we turned our focus towards what we could do. We could still adopt some DevOps principles; we just had to work within the constraints we had.
Kicking off the DevOps Workgroup
For the inaugural meeting of our new “DevOps Workgroup”, we invited all the members from the original Octopus Deploy Workgroup, but also encouraged each member to invite one or more people who they felt should be involved. We ended up with quite a full room: 25 people in all. Everyone agreed that this was too many people. In the past, we have found that workgroups work best with around ten people. However, there was a lot of passion across the many application development teams, and we definitely wanted to find a way to include those individuals.
One of the issues we’d found with the previous Octopus Deploy Workgroup was that it was a little ‘insular’; often, only members of the Octopus Deploy Workgroup worked on solutions. For example, they would work on a solution to make SSIS package management easier, perhaps because that was a particular pain point in their team. Developers in other teams, who were not workgroup members, often suffered similar issues and had good ideas about how to solve them, but felt they had limited opportunity to contribute to the chosen solution.
This was never done intentionally. If anything, it was simple human nature to not want to “step on other people’s toes”. Nevertheless, we wanted to fix this. We wanted to ensure that all developers with a passion for solving problems, across all the development teams, had opportunities to contribute.
Flexible Workgroup Membership
As it turned out, we had already stumbled, almost by accident, on what now felt like a much better and more inclusive model for the new workgroup. A couple of months previously, three developers had approached the Octopus Deploy Workgroup to ask if we could automate SSIS deployments. Due to other priorities, the majority of the members of the workgroup could not help out. As a result, those three developers took the initiative and started working on the solution. Members of the Octopus Deploy Workgroup made themselves available to answer any questions or offer any pointers. While working on their solution, the three developers attended the Octopus Deploy Workgroup meetings to give status reports and demonstrate their progress. When they finished, they wrote up the necessary documentation, and communicated their new solution to the development teams, and then, for the time being at least, stopped attending the Workgroup meetings.
This formed the basis of our new working model. The DevOps Workgroup would retain a manageable number of core members (ten), but would work in a more inclusive way. The core workgroup would decide which goals to work on next and help identify any areas needing extra focus. Having identified the next goal, the Workgroup would communicate that to developers and operations, asking for help.
Anyone who volunteers, with a passion to help solve this particular problem, will join the workgroup meetings, and help implement the solution. Once the project is complete, they will stop attending the core workgroup meetings.
The makeup of the Core DevOps Workgroup
The job roles for the core DevOps Workgroup are:
- One Leader
- Two Lead Developers
- One DBA
- Two Web Admins
- Two Developers
- One Database Developer
From the looks of it, conspicuously absent from the workgroup is any representation from the QA and Infrastructure teams. At first, there was some shock about this seeming exclusion. I got a couple of emails within ten minutes of the announcement of the DevOps Workgroup.
This is temporary and deliberate. For starters, there are a couple of other workgroups which require QA resources, and we do not want to stretch them too thin. Secondly, as a workgroup, we have identified a lot of goals primarily around database and web deployments. We want to get solutions for those and bring in the QA and Infrastructure teams as and when it makes sense. We will bring in QA people once we start focusing on our testing goals. Infrastructure will be included once we start focusing on infrastructure as a service. We want to bring in the right people at the right time and not waste anyone’s time.
Refining our DevOps Goals
“What needs to change in our processes in order to be able to deploy to pre-production every 45 minutes?”
That was our conversation starter, and it led to a lot of great ideas on what we needed to change, in terms of application, database, testing, infrastructure, and organizational education.
Using Octopus Deploy, we’d already made significant improvements to our application deployment process, but it still wasn’t ideal. For example, when deploying new code we still first had to stop the application pools on all the servers, deploy, then restart them all. What we really wanted was the ability to perform rolling deployments, with zero downtime.
This meant a move, over time, towards smaller, more modular services, which would allow us to deploy a single component without having to shut down the entire application. In the immediate term, we wanted to evaluate Service Fabric, the technology that underpins many Azure services, since it seemed to provide a lot of the functionality we needed, in terms of scalability and ability to support rolling deployments with zero downtime.
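To make the contrast with the stop-everything approach concrete, here is a minimal Python sketch of a rolling deployment. It is an illustration only, not how Octopus Deploy or Service Fabric implement it: the `Server` class, the `rolling_deploy` function, and the `health_check` callback are all hypothetical names, and "deploying" is reduced to changing a version string. The point is the ordering: take a small batch out of rotation, upgrade it, verify it, put it back, and only then move on.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Server:
    """Hypothetical stand-in for one web server behind a load balancer."""
    name: str
    version: str
    in_rotation: bool = True


def rolling_deploy(
    servers: List[Server],
    new_version: str,
    health_check: Callable[[Server], bool],
    batch_size: int = 1,
) -> int:
    """Upgrade servers one batch at a time, leaving the rest in rotation.

    Returns the lowest number of servers that were serving traffic at any
    point during the rollout. Raises RuntimeError on a failed health check,
    leaving the failed server out of rotation and later batches untouched.
    """
    if batch_size < 1 or batch_size >= len(servers):
        # A batch covering every server is the "stop everything" approach.
        raise ValueError("batch_size must leave at least one server in rotation")

    min_serving = len(servers)
    for start in range(0, len(servers), batch_size):
        batch = servers[start:start + batch_size]

        # 1. Take the batch out of rotation so it receives no traffic.
        for server in batch:
            server.in_rotation = False
        min_serving = min(min_serving, sum(s.in_rotation for s in servers))

        # 2. Deploy, verify, and only then return each server to rotation.
        for server in batch:
            server.version = new_version
            if not health_check(server):
                raise RuntimeError(
                    f"{server.name} failed its health check; halting rollout"
                )
            server.in_rotation = True

    return min_serving
```

With four servers and the default batch size of one, at least three servers keep serving traffic throughout. If a health check fails, the rollout halts with the remaining servers still running the old version, which is also what makes an automated rollback a plausible next step.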
On the database side, a lot of manual database steps needed to be automated, streamlined or removed in order for us to get anywhere near being able to deploy every 45 minutes. We had all sorts of ideas for what we needed to tackle in the short and longer term, including better configuration management, database change script verification, more effective load testing, and more.
Finally, we recognized the need for a lot of cross-team education. Removing manual control from these processes is a mind shift for a lot of people and can be daunting. For this to be successful, we needed much more effective monitoring tools in place, which could help detect issues before they become problems. Operations and developers need to cross-educate one another, with operations teaching developers about the tooling, and developers teaching operations the domain knowledge necessary to use the tools effectively.
In addition, some of the new tools we’d be adopting, like Service Fabric, would fundamentally change how applications are architected, from the code to the database. Lessons learned by one team would need to be communicated to other teams, via email and blog posts, as well as more structured “lunchtime learning” sessions.
Over the course of a number of meetings, we resolved all of this thinking into short, manageable lists of short and longer-term goals.
Short Term Goals
Our short-term goals had both to help us towards our broader goals and, more pragmatically, to be tasks that we could prioritize and work on immediately.
- Automate Configuration Settings Deployments – being “Dogfooded” right now
- Automate SSIS Deployments – ready for adoption, a holdover from the Octopus Deploy Workgroup
- Improve Delta Script verification – completed (described in a separate article)
- Automate Database Integrity Checks – getting started
- Evaluate Service Fabric – in progress
- New Relic alerting – getting started. A recent article on my website covers our first set of alerts.
- Automated Server Verification – not yet started
- Support for multiple Service versions – getting started
- Automate deployment processes for all Critical Applications and their components – this would be using Octopus Deploy and Redgate DLM Automation
Long Term Goals
Our long-term goals were as follows. Some of them had dependencies on the short-term goals.
- Load Testing – See what the database can handle
- Automate Smoke Tests – ability to deploy to Service Fabric with automated smoke testing
- Automate Business Verification Tests – requires collaboration with business owners regarding the required post deployment tests
- Automated Rollback on Failure – Service Fabric can support this, hopefully
- Load Testing – seeing what the web servers can handle
- Proactive Performance Monitoring – ideally we can make use of New Relic
- Automated deployment for all FCSA Applications and components
- Writing code to support multiple database versions – requirements and practicalities
- Automate Disaster Recovery – requires all Critical Applications and their components using Octopus Deploy
- Automatic Server Provisioning – implement infrastructure as a service
Helmuth von Moltke once said, “No battle plan survives contact with the enemy.” That is something to keep in mind as we move forward with our DevOps workgroup. Almost everything described in this article will change as we learn more. That is a given. We want to be flexible enough to adapt, so that the DevOps initiative doesn’t slow down, or even worse, come to a complete stop and get abandoned. I’ve been encouraged by the work done so far.