Richard Morris interviewed Michael Stoop, a database developer at Calvi, Europe's leading provider of Telecom Invoice Management software. The discussion focused on how Calvi transformed their database delivery process to accommodate massive database growth, statutory regulations, and developments in their application. Here's their story.
When an organization faces the challenge of keeping a database-driven application current in a wide range of versions in a number of sites, the solution isn’t so much the acquisition of a magical tool as a cultural change that allows them to exploit a number of third-party tools as well as the features in SQL Server itself. This change has meant that there are no longer any unexpected differences in the data model between what the developer is working with, what the tester is working with, and what will eventually occur on a production database - as was the case with one organization in particular, Calvi.
The reason for these changes were that Calvi were finding it increasingly hard to keep pace with changes in statutory regulations and developments in their application as their success grew, and they realized that they needed to rethink the way that their development process was organized. The company faced unusually-rapid changes because they specialize in invoice management for Telecom operators as well as Telecom Expense Management for enterprises. They have six developers in the Netherlands and an additional team of eight in India. Between the two development teams, they work on three major web applications as well as a variety of interfaces and processing engines. They host the majority of their customers’ software themselves and the remainder is on local customer-based installations.
In the past, the high volume of changes and customer restrictions meant they found it hard to keep pace and ensure that each customer was running the latest version of the applications and database. This situation often led to problems when upgrading their applications to a new version since they couldn’t always rely on the database schema being at a particular known version.
In the past, updates to the data model were often performed as SQL scripts in a certain version of the application. The subsequent deployment of these changes to all copies of the database meant a lot of error-prone manual scripting which in turn led to having to rely on extra support from developers.
A “small change”, carefully performed as an ‘uncontrolled release’ by support or a developer under time pressure, could have disastrous consequences later on. If, for example, a consultant might add an index to one particular database on site in order to solve a quick performance problem for a customer-specific export, but this change was not recorded. A while later, a data type might need to be expanded from an INT to a BIGINT, because a field that records the size of data in Kbytes now needs to record the size of more than 2TB of data. An update / migration script would be created based on the current database as scripted out as the production version, but it wouldn’t contain that index. The customer would either lose their performance-gain from the index, or worse, the entire migration script might fail because of the existence of related objects.
Without a lot of care and vigilance, minor changes by one team could break a feature for another.
Upgrades were made more difficult by the size of the database. The production database contains around 325 tables and 3300 columns. A handful of those tables are in the multi-hundred-million rows, but tended to range from 50 million to a few hundred million records. When possible, the size of the database is reduced by horizontal partitioning – spreading customers over different databases when no global reporting is needed. This allows the databases to still be manageable and maintainable within a short timeframe.
They realized that they needed a precise database versioning system, and to change the way they developed new versions so as to minimize these problems.
Introducing database versioning
The biggest challenges they faced were in transforming a multitude of different and un-versioned databases to a single design where the database version was identifiable by Calvi’s applications so they could always ensure compatibility
The first step towards solving this was to start working with a ‘model’ version of the database, so all new copies of that database had at least the first base. Next was the task of migrating the installed databases from their current unknown version to one that was at a numbered version. Red Gate’s SQL Compare played a huge role in this by helping to synchronize the code.
From then on, each and every database change has been scripted out, carefully filtered, and placed in a numbered migration script that makes the change from a particular version to another, and finally, as the last step, updates a version table within the system to hold the current version of the database.
This is a very safe way to ensure database version reliability, but requires disciplined work. It requires some effort to maintain 100%.
Adopting source control and Agile processes
Calvi now use a Scrum process supported by TFS for their software changes. The database changes for a sprint are scripted near the end of each sprint as a pre-release script (or set of scripts).
The QA department run these consolidated scripts on the latest version of the database and do their testing on that version. The development teams also get this latest version so they integrate their database with changes from the other teams.
Currently, with many of the teams working on overlapping table changes, master / domain data and stored procedures, it has become increasingly important to have help from the right tools to save time and improve quality.
Calvi use SQL Source Control to maintain the development model database and generate final scripts from rapid development and performance tuning. In the current process it’s not 100% implemented in their development teams yet. The final functional check is used in manually checking the incoming scripts on each sprint release and performing those on the version database under source control.
Michael Stoop believes that this type of tool is much easier to work with than those source control systems that merely check in SQL scripts or compare static data models:
‘For a proper SQL database versioning process, there’s really no alternative to SQL Source Control.
You need a dynamic system that works without bothering you with problems on checking in. It’s frustrating for developers to have to script each and every change.
Developers tend to occasionally make table changes by drop-creating them, which is very risky on a production system. Fortunately that’s something SQL Source Control eliminates by only working with the delta of the final form a table gets.’
With the new development culture in place, Calvi have found it much easier to exploit the value of third-party tools. For example, they now use SQL Doc 2 to generate documentations on new parts of the data model, so their developers and quality teams have documentation available to them quickly, without having to spend weeks on writing it up or worse – developers having to do all sorts of assumptions on functionality based on table and column names.
SQL Compare has been a lifesaver for them in going from unknown to known versions of the many installed databases. There had previously been some manual work attached to it, but nowadays the process is pretty much seamless. In a sense, it has become less essential now because its use, as part of the development process, has more or less been replaced by the disciplined use of a SQL versioning process though it still gets used occasionally when a database version has been “corrupted”; in other words where someone made changes and forgot to make scripts. Other tools get well-used: SQL Dependency Tracker 2: helps quickly investigate what a change to a table, UDF of stored procedure would imply for the other teams and developers: SQL Data Generator 2 is used to load up tables for heavy performance tuning and load testing. Their system admins and DBAs also use SQL Backup Pro.
As Calvi has grown as an organization, they’ve established a professional attitude towards SQL version management. They are planning to make more changes to this process to aid the QA department, testers and web developers with database changes, as well as support them with continuous integration. SQL Deployment Manager will probably be used here.
Michael Stoop, a database developer at Calvi has the final word:
‘By having a more effective database development process, we decrease the number of iterations needed for development and testing. Additionally, with a solid database development process – supported by the Red Gate tools, we’ve drastically cut down on the number of bugs in the datamodel. This allows the development team to focus on development of new and better products.’