Click here to monitor SSC
  • Av rating:
  • Total votes: 10
  • Total comments: 1
Grant Fritchey

Why Put Your Database into Source Control?

08 July 2014
Database Lifecycle Management Patterns & Practices Library Stage 2

DATABASE LIFECYCLE MANAGEMENT PATTERNS & PRACTICES LIBRARY

Stage 2: Version Control

Checking program code into source control is a daily ritual for most developers, but versioning database code is less well-understood. Grant Fritchey argues that getting your databases under source control is not only vital for the stability of development and deployment, but it will make your life easier when something does go wrong.

Software development is a difficult and demanding discipline. It gets even more difficult when you bring teams of developers together on a single project. One of the fundamental issues arises around the code itself. Who is responsible for what piece of it? What changes have they made? How do you get those changes from one member of the team to the next? How do you keep a history of each change, in case it causes problems later that you need to track down and fix? The answer to all these questions, and many more, is to manage your code through a source control system.

Source control systems, also called version control systems (VCS) or revision control systems, date back to the birth of modern computing. One of the first be developed was in 1975 when there was finally enough disk space to store a second copy of the program, just in case. Since then, getting application code into source control has gone beyond being an industry standard practice to simply a part of programming, like writing a function or an IF clause. Yes, there are some shops or individual developers who don't use source control for their code, but those are the glaring exceptions that prove the almost universal rule that code goes into source control.

But databases are different.

The history of databases diverges from the history of code. At some point developers were no longer responsible for databases. Instead they moved into the hands of system administrators or dedicated database administrators, who, frankly, looked at the world quite a bit differently than developers. They spent time worrying about backups, availability, integrity and performance.

In many cases, database development work moved into the realm of the DBA. While many DBAs came from the ranks of developers, they spent more time worrying about all those administration tasks than the development tasks, and some of the best practices and methods created for managing code just weren't applied to databases. But they should be.

Backups

DBAs are very good at putting in place backup schemes that will protect the production data. However, when working with the code of a database, and the SQL that defines data structures and stored procedures is nothing but code, a full database backup is an unwieldy device by which to maintain copies of the schema, for retrieval of changes and historical tracking. For example, to find out what changed between the previous and current versions of a stored procedure a DBA would be forced to use a third party tool that could directly compare to a backup, or to run a full restore of the database to a secondary location, and then extract the stored procedure definition. This isn't always possible, it's frequently impractical and it's certainly going to be slow.

Getting a database into source control provides a much more efficient mechanism for backing up the SQL code for your database. Retrieving a previous version of a stored procedure from entails simply inspecting the history of changes within your VCS. Retrieval is nearly instantaneous.

Once you realize that your SQL is code, it immediately makes sense to use the same backup mechanisms that code uses, which is a VCS.

Auditing

Within most database management systems, it's possible to find out when an object was created or last modified, and which login performed that action. However, there is usually no historical record of any previous modifications to that object. Further, depending on the security mechanism within the database, you may simply see that a system administrator or database owner made the change, with no indication as to the actual identity of the person working within that role.

If you have your database in a VCS, and use that VCS as a fundamental part of your development and deployment mechanisms, then it will provide exactly that type of tracking. All changes originate in the VCS and are not made directly against the production system outside the process around your VCS. You'll know who made what change and when it was made.

Many organizations have to comply with legal requirements for change auditing, such as those mandated by Sarbanes-Oxley. Implementing a VCS could be the quickest and easiest way to provide the required level of historical tracking of all changes so that for every change to the database you know who did it and when.

Integration

As soon as we enter a new code file into the VCS, it assigns it a version. Each time we commit a change to that file, the version increments, and we have access to the current version and all previous versions of the file. When we put a database into the VCS, this means that every database object (table, view stored procedure and so on) in the VCS has a version number. We can also create labels, or tags, that allow us to assign meaningful "build number" to the set of files that comprise a particular version of a database.

Furthermore, having the database in source control directly alongside the application will integrate the database changes with the application code changes, so that you'll always know that the version of the database being deployed directly corresponds to the version of the application being deployed. This direct integration helps to ensure better coordination between teams and it can help when troubleshooting issues.

Automating Deployments

If all changes needed for a production system are in a development database somewhere instead of inside a VCS, deployments are necessarily going to be a manual affair. You will need a process that will generate changes from your development database in order to make the production database mirror the newer design. There are third party products that can help, but how do you differentiate between objects that are meant to go out with one version and objects that are meant to go out with a different version of your code? From within a development database this is frequently impossible since there is no clear and easy methods for differentiating object changes within that database.

Once you start generating your deployment scripts from source control, a number of opportunities open up. You'll be able to differentiate the database objects into known versions which will allow you to control what is getting deployed. Once you can control what is being deployed, you can bring automation to bear on the deployment process. You'll be able to take advantage of continuous integration and other automated deployment and testing mechanisms that application code already uses. Automated deployments also means more testing and validation of those deployments which can help to ensure the final deployment to production is successful.

Conclusion

Since the SQL that defines a database is code, it just makes sense to take advantage of the existing and long-established mechanisms for managing that code. You'll have a better way to backup that code, and, more importantly, retrieve previous versions of that code. Source control for the database provides an audit trail to help with troubleshooting and legal compliance. You'll get better integration with your application code through shared source management. Finally you'll be able to automate your deployments. All these reasons makes putting databases into source control a smart move to help improve management of systems within your organization.

If you'd like to learn more about database source control, Red Gate has put together a free whitepaper "5 Common Barriers to Database Source Control, and How You Can Get Around Them".

Download the whitepaper
 

This article is part of our Database Lifecycle Management patterns & practices.

Return to the library to find more articles on DLM, or visit red-gate.com/dlm for more information about SQL Server products for database lifecycle management.

Grant Fritchey

Author profile:

Grant Fritchey is a SQL Server MVP with over 20 years’ experience in IT including time spent in support and development. Grant has worked with SQL Server since version 6.0 back in 1995. He has developed in VB, VB.Net, C# and Java. Grant has authored books for Apress and Simple-Talk, and joined Red Gate as a Product Evangelist in January 2011. Find Grant on Twitter @GFritchey or on his blog.

Search for other articles by Grant Fritchey

Rate this article:   Avg rating: from a total of 10 votes.


Poor

OK

Good

Great

Must read
Have Your Say
Do you have an opinion on this article? Then add your comment below:
You must be logged in to post to this forum

Click here to log in.


Subject: One other consideration for auditing
Posted by: paschott (view profile)
Posted on: Tuesday, July 15, 2014 at 12:12 PM
Message: DDL triggers work really well for auditing and can also help catch if someone makes a change to an object outside of VCS. That's saved me some trouble in the past if we did have a change made for any reason that wasn't checked in. I know that they work in SQL Server for 2005+, but not sure how or if this is implemented in other DBMS platforms.

Great article on why to get your DB in source control and I can only hope people take this to heart and do it.

 
Simple-Talk Database Delivery

DLM
Patterns & Practices Library

Visit our patterns and practices library to learn more about database lifecycle management.

Find out how to automate the process of building, testing and deploying your database changes to reduce risk and make rapid releases possible.

Get started

Phil Factor
A Start with Automating Database Configuration Management

For a number of reasons, it pays to have the up-to-date source of all the databases and servers that you're... Read more...

 View the blog

Top Rated

Painless Refactoring of SQL Server Database Objects
 Refactoring a database object can often cause unexpected behavior in the code that accesses that... Read more...

Identifying and Solving Index Scan Problems
 When you're developing database applications, it pays to check for index scans in the SQL Server query... Read more...

A Start with Automating Database Configuration Management
 For a number of reasons, it pays to have the up-to-date source of all the databases and servers that... Read more...

Archiving Hierarchical, Deleted Transactions Using XML
 When you delete a business transaction from the database, there are times when you might want to keep a... Read more...

The Mindset of the Enterprise DBA: Harnessing the Power of Automation
 After you have done the necessary groundwork of standardizing and centralizing your database... Read more...

Most Viewed

Beginning SQL Server 2005 Reporting Services Part 1
 Steve Joubert begins an in-depth tour of SQL Server 2005 Reporting Services with a step-by-step guide... Read more...

Ten Common Database Design Mistakes
 If database design is done right, then the development, deployment and subsequent performance in... Read more...

SQL Server Index Basics
 Given the fundamental importance of indexes in databases, it always comes as a surprise how often the... Read more...

Concatenating Row Values in Transact-SQL
 It is an interesting problem in Transact SQL, for which there are a number of solutions and... Read more...

Reading and Writing Files in SQL Server using T-SQL
 SQL Server provides several "standard" techniques by which to read and write to files but, just... Read more...

Why Join

Over 400,000 Microsoft professionals subscribe to the Simple-Talk technical journal. Join today, it's fast, simple, free and secure.