Note: this article was updated by the author on April 9, 2015.
This article describes how to integrate SQL Server databases into an existing build automation process, such as continuous integration, and how Redgate tools can be used to automate the process.
Continuous integration relies on managing changes in a version control system, such as TFS, Subversion, or Git, using SQL Source Control. SQL Source Control is an add-in to SQL Server Management Studio that source-controls your database schema and static data. When changes are checked into source control, the continuous integration process is triggered.
This process can do any of the following:
- Build a database package deployment artifact from source control
- Keep a database in sync with your latest changes
- Run automated regression tests
- Automate the generation of database documentation
- Publish the database package for use in a Release Management process
I will not discuss in this article how you build your application code, or how to set up or configure the application that you’re developing. At the end of the article I will provide an example using plugins for JetBrains TeamCity, but rest assured that any CI tool can be used in conjunction with the Redgate SQL CI command lines.
Why continuous integration?
Continuous integration (CI) is the process of ensuring that all code and related resources in a development project are integrated regularly and tested by an automated build system. Code changes are checked into source control, triggering an automated build with unit tests and early feedback in the form of errors returned. A stable current build is consistently available, and if a build fails, it can be fixed rapidly and re-tested.
A CI server uses a build script to execute a series of commands that build an application. Generally, these commands will clean directories, run a compiler on source code, and execute unit tests. However, for applications that rely on a database back-end, build scripts can be extended to perform additional tasks such as creating, testing, and updating a database.
The following diagram illustrates a typical integration process. The automated continuous integration process begins each time the server detects a change that has been committed to source control by the development team. Continuous integration ensures that if at any stage a process fails, the ‘build’ is deemed broken and developers are alerted immediately.
“Continuous Integration is a practice designed to ensure that your software is always working, and that you get comprehensive feedback in a few minutes as to whether any given change to your system has broken it.”
For many software projects, this will include a database. The automatic build process should include retrieving all the database components from source control, demonstrating that they can be built into a working database, and storing a validated deployment artifact (in our case a database package) that can later be used for further deployments. However, this is not always simple, which is why this article seeks to clarify the process of integrating databases into an existing automatic continuous integration process.
How are databases different?
Database code should, in principle, be treated in the same way as your application code. The principal difficulty underlying continuous integration for databases, however, lies in keeping a database in source control and deploying it to a target server.
The database is unlike application code inasmuch as it contains state, in the form of table-based data, that needs to be preserved after an upgrade. Also, a database often needs components other than the database DDL, such as jobs, alerts, replication settings, and Agent-based PowerShell or SSIS scripts, and may rely on server-wide settings. Where a production database already exists, DML and DDL queries modify the existing state of a database, and unlike application code, there is no source code to compile. Deployment therefore relies on creating upgrade scripts specifically for that purpose.
Although it is easy to store the source of the component database objects such as tables, views, and routines, it is not so easy to script all the components needed to build a working database on a freshly built SQL Server instance.
Migration scripts may contain ALTER and UPDATE statements to update the target version of the database with the development version; alternatively, the scripts may create a new database. Where changes are deployed to an existing database, all differences and dependencies must be accounted for. In some production deployments, this involves multiple targets with different schemas and data. In either case, the manual process is time consuming, prone to errors, and one that should not be left unresolved until the end of the project cycle.
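A minimal upgrade script of this kind might look as follows (the table and column names are purely illustrative):

```sql
-- Illustrative upgrade script: moves an existing database one version forward.
-- Unlike application code, the existing data must survive the change.
ALTER TABLE dbo.Customer
    ADD PhoneNumber varchar(20) NULL;  -- new column, nullable so existing rows remain valid
GO
UPDATE dbo.Customer
SET PhoneNumber = 'unknown'
WHERE PhoneNumber IS NULL;             -- back-fill state that a fresh CREATE would not need
GO
ALTER TABLE dbo.Customer
    ALTER COLUMN PhoneNumber varchar(20) NOT NULL;  -- tighten the constraint once data is in place
```

A fresh CREATE script for the same table would be a single statement; the upgrade path has to negotiate the data already in place, which is exactly the dependency and ordering problem described above.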
Object creation scripts can be generated relatively simply (for example using Microsoft SQL Server Management Studio), but referential integrity is difficult to maintain. Objects and data must be created and populated in the correct order, and as dependency chains can be complex, third party tools are often required. Data migration and test data creation are tedious and time consuming operations when performed manually.
Database Lifecycle Management
Database CI is an important part of Database Lifecycle Management (DLM). Continuous integration builds and tests a database, validating the change scripts that will eventually be the constituent parts of the overall deployment script supplied to the DBA for deployment. Once deployed to the production server, monitoring should be put in place to ensure that not only the performance of the database remains acceptable, but also that the end user is getting value from the changes.
In the event that a change is made directly to the production database bypassing the formal development and test processes, database drift monitoring tools such as DLM Dashboard ensure that the DBA and development manager are notified. The new change can either be undone by the DBA, or if approved, copied into the development environment.
Database continuous integration
Building a database
The SQL CI Build step serves two main purposes. Firstly, to create a database package artifact that is to be used in subsequent CI tasks and later in deployment, and secondly, to run a validation test on this artifact by using it to “build” a new database.
One might question the purpose of building an entirely new database when, in practice, only updates to existing databases occur. In fact, the act of attempting to build a new database provides an important validation test in itself. Database objects can often get into invalid states, for example when objects reference others that have since been renamed or even dropped. By generating a creation script and attempting to run it to build a database from scratch, such issues are quickly identified. The build step also serves to validate that the database creation script successfully builds a database that includes all the expected objects. This is particularly important to combat human error, for example when hand-crafted scripts have been included as part of your database using the Migration Scripts feature of SQL Source Control and SQL Compare.
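As a concrete sketch of such an invalid state (with hypothetical object names), a view can linger in an existing database long after the column it references has been dropped; it only fails when next queried, but a from-scratch build flushes the problem out immediately:

```sql
CREATE TABLE dbo.Orders (OrderId int, LegacyCode char(4));
GO
CREATE VIEW dbo.vOrderCodes AS
    SELECT OrderId, LegacyCode FROM dbo.Orders;
GO
-- Later in development, the column is dropped; the existing view
-- silently becomes invalid in the live development database...
ALTER TABLE dbo.Orders DROP COLUMN LegacyCode;
-- ...but a from-scratch build of the same scripts fails at CREATE VIEW,
-- surfacing the broken reference during CI rather than at run time.
```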
Keeping a database in sync
Databases may figure in your CI process simply because the application code requires there to be a database present to function correctly. The database schema version corresponds to an analogous application code version. Any changes to the application code or the database structure could in theory break the system and should consequently trigger the CI process.
The SQL CI Sync task is able to maintain an up-to-date database from the database source files to accompany the application build.
Equally, if you already have internal test databases that need to match the latest changes in the development environment, you can keep them up-to-date in a similar fashion.
The Sync task also serves to test the upgrade process as each incremental change is committed to source control. Unlike application code, upgrading a production database isn’t a simple case of replacing it with a fresh copy. Databases have mission-critical state that needs to be preserved.
This doesn’t replace the recommended practice of running formal upgrade tests, for example against a staging database, but these “micro upgrades” will catch the majority of issues far earlier in the project lifecycle. A pro tip is to run this task with the “/AbortOnWarnings:High” SQL Compare option configured, ensuring that the build aborts on any high-level warnings, which include operations that result in data loss, such as dropping tables and columns and truncating fields. Intentional data-loss changes are instead made by way of migration scripts.
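For instance, splitting a name column in two would trigger a data-loss warning if left to automatic script generation, because the comparison engine would simply drop the old column. A hand-written migration script (again with illustrative names) makes the intent explicit and preserves the data:

```sql
-- Illustrative migration script: replaces a column without losing its contents.
ALTER TABLE dbo.Person ADD FirstName nvarchar(50) NULL, LastName nvarchar(50) NULL;
GO
-- Copy the existing data into the new columns before touching the old one.
UPDATE dbo.Person
SET FirstName = LEFT(FullName, CHARINDEX(' ', FullName + ' ') - 1),
    LastName  = STUFF(FullName, 1, CHARINDEX(' ', FullName + ' '), '');
GO
-- Only once the data has been carried over is the old column dropped;
-- an automatic comparison would have generated the DROP alone.
ALTER TABLE dbo.Person DROP COLUMN FullName;
```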
Unit testing database code
Although it is best practice to test application code as part of a CI process, database code and its accompanying business logic is often overlooked. Database code comes in the form of stored procedures, functions, views, triggers, and CLR objects. If it is deemed important to test application code as part of CI, the same must apply to the database code.
Fortunately there are open source frameworks that can be used, for example NUnit for .NET unit testing and tSQLt for SQL Server testing. tSQLt unit tests can be easily developed, run, and managed with Redgate’s SQL Test, a SQL Server Management Studio add-in. The SQL CI Test task will execute any tSQLt tests associated with your database, outputting a JUnit XML or .trx results file, standard formats that are supported by all good CI tools.
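A tSQLt test is itself just T-SQL checked in alongside the schema, so it travels through source control and CI like any other database object. As a sketch, assuming a hypothetical dbo.OrderTotal function under test:

```sql
EXEC tSQLt.NewTestClass 'OrderTests';
GO
CREATE PROCEDURE OrderTests.[test order total sums line prices]
AS
BEGIN
    -- Replace the real table with an empty, constraint-free copy
    EXEC tSQLt.FakeTable 'dbo.OrderLines';
    INSERT INTO dbo.OrderLines (OrderId, Price) VALUES (1, 10.00), (1, 2.50);

    DECLARE @total money;
    SELECT @total = dbo.OrderTotal(1);   -- hypothetical function under test

    EXEC tSQLt.AssertEquals 12.50, @total;
END;
GO
EXEC tSQLt.Run 'OrderTests';   -- the SQL CI Test task runs these for you in CI
```

Because FakeTable isolates the test from real data and constraints, the same test runs identically against the temporary CI database and a developer’s local copy.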
Populating the database with test data for the unit tests is also a very useful feature. Redgate’s SQL Data Generator can be used as part of the build process to fill the test database with sample data for the unit tests to run against. Simply save a SQL Data Generator project file and include it in the VCS root that the CI tool checks out, so that it can be referenced from the SQL CI Test task.
Generating database documentation
Development databases undergo frequent changes. To keep track of the state of the database, it can be useful to document the schema. The CI process is an effective way of generating database documentation files as a build artifact. This is achieved by using a SQL Doc project file checked into source control as described in the following article: How to Automate your Database Documentation.
Publishing a database package
Deploying different versions of a database to different environments and keeping track of changes can be difficult. The CI process can optionally be configured to make the validated database packages available to your Release Management tool. How this is achieved is very much dependent on which Release Management tool you use. For example, market-leading Release Management tool, Octopus Deploy, uses a NuGet feed as its package repository, which can be published to using TeamCity’s built-in NuGet Publish task: http://confluence.jetbrains.com/display/TCD8/NuGet+Publish.
Managing the database deployment
Once validated in a CI environment, the database (and application) changes need to go through a release process.
In many ways continuous integration can be considered a dry run for production deployment. But although the CI environment often mirrors the production environment as closely as possible for the application, this is rarely the case for the database.
Production databases can be huge, and it is therefore impractical to restore a production backup to the CI environment simply for testing purposes. Test environments rarely benefit from the same storage capacity as production and pre-production environments, and lengthy restore times make recreating the CI database impractical.
It is therefore prudent to push these changes through some final “realistic” test phases using staging and UAT environments.
A database package contains the validated deployment package, which includes the source schema state, allowing for a post-validation check once deployment has taken place. The deployment script itself is generated from the database package as part of the release process, validated, tested, and optionally modified by the DBA before being tested on a staging environment prior to production deployment. As part of the release process, it is important to check that the target databases haven’t “drifted” from their expected state since the deployment script was validated.
How Redgate tools help
SQL Source Control
- SQL Source Control is a SQL Server Management Studio add-in that helps maintain database schema and data in your existing version control system
- Allows for the creation of custom migration scripts which are saved to source control and used to supplement the deployment script generation process
SQL CI
- The SQL CI and SQL Release command lines and build tool plugins set up continuous integration for databases
SQL Test
- Allows developers to easily create, run, and manage tSQLt unit tests on databases
SQL Data Generator
- Generates realistic test data tailored to your existing schema
- SQL Data Generator project files are referenced from your CI process
SQL Doc
- Automates the generation of database documentation
- The SQL Doc command line and project files can be used as part of your CI process
DLM Dashboard
- A free tool that alerts selected users when database deployments occur and drift is detected
- Maintains a historical audit of all schema changes
- Ensures that your production database performance is as expected
- Using a custom alert, be notified when schema changes are detected
The TeamCity Plugin makes configuring CI simple. It eliminates the need to build complicated command line scripts by hand. When it’s installed in TeamCity, you can select Redgate SQL CI tasks as the runner types for your build steps.
Redgate SQL CI Build
Add a build step of Runner Type Redgate SQL CI Build:
Then specify an appropriate Package ID of your choice, and supply the connection details for a SQL Server upon which the validation check will be performed.
To preserve the Database Package as an artifact in TeamCity, simply specify *.nupkg in your Build Configuration Artifact Paths.
Redgate SQL CI Sync
Add a build step of Runner Type Redgate SQL CI Sync:
Again, specify the same Package ID used in the previous build step, and the connection details for the database to keep in sync.
Redgate SQL CI Test
Add a build step of Runner Type Redgate SQL CI Test:
Again, specify the Package ID used previously, the server details where the temporary test database is to be created (this is torn down after each run), and optionally a SQL Data Generator project file.
To integrate the test results with TeamCity, simply add an XML Report Processing build feature specifying the Ant JUnit type with *junit.xml as the Monitoring Rule.
For each of the above SQL CI tasks it is possible to specify SQL Compare command line switches to customize the comparison settings.
Further reading and resources
You can read more about continuous integration in the DLM Patterns and Practices for continuous delivery of databases on Simple-Talk. You can also download a whitepaper about continuous integration using Redgate tools.
This article has outlined some best practices and worked examples for implementing databases as part of your CI development process. As with application code, database code is managed in version control using SQL Source Control, and automatically deployed to a CI environment. Redgate command line tools handle the scripting and deployment process, removing the hurdle that has previously obstructed CI and source control for databases. A database package can be created and validated as part of the CI process and later, with minimal effort, deployed through your dev, staging, and production environments using a Release Management tool.