As part of our long-running series of articles where we ask working database developers how database source control improves their work within development teams, we made the mistake of asking Hugh Bin-Haad, Database dev and relational theorist.
So you want me to use source control for my database? OK, let's consider what you're proposing to do by putting my database code in source control. We have a multi-user system, and we're taking the code from it, and putting it into another system so that several developers can work on the code independently? What part of the term 'multi-user' don't you understand?
Any developer who's entitled to work on the development database can do so. We don't have to build and integrate the database either in order to test it, because it is already built. It is even cleverer than source control because we can give permissions on the individual schemas or even objects to control the access of the individual developers. We can even specify the type of permission
Oh. So I shouldn't be using a 'shared' database for development? Each developer should use their own development database, put their code in source control and then build a database from source control? So, instead of building a multi-user system in a multi-user way, we should pretend it isn't a multi-user system? If we do that , we then have to devise a way of resolving the differences when the different contributions are merged.
Ah, so if the database is in source control, then people working on the same bit of code can have their code merged? Wow. So if two people simultaneously add a column to a table, or a parameter to a stored procedure, they both get put in? Hmm. Do you believe that? Well, I don't, since a VCS doesn't even attempt to detect a semantic difference, just a textual difference. It can’t even tell whether a change is to code or to a comment! Are you sure that it is better to attempt a merge than simply bin the work of one of the developers? In SQL Server, if two people work on the same table and both of them insert a column using ALTER TABLE, , then by the miracle of ACID, both columns get inserted. Why are two developers working on the same schema at once, anyway? The schema is there to allow a partitioning of large databases, which means databases large enough to require more than one person working on them, by logical areas of functionality. On any sensible SQL Server development system, all changes are recorded, anyway.
So, source code will be lost if we allow developers to work together on a development database? Nope. Code won't get trashed unless we behave like lunatics. If the worst happens, then it is still there. It is a database after all and we take scheduled backups of it just like any other database. What's that? You really think your VCS backups are as reliable as database backups? I thought not. So why export the code from a database into a "database" that isn't as good?
Ah, we don't know who made what alteration to the database unless we use source control? Wrong again. All my development servers have server triggers that record all changes and save them to a table in
master. It even contains the DDL, so I always know what changes were made, to which database, when and by whom. Almost always, I'll know what code was there before because my trigger will have recorded it when the code changed previously. Don't believe me? Here's the trigger…
Create TRIGGER [RecordDatabaseChanges]
ON all server FOR DDL_DATABASE_LEVEL_EVENTS
SET NOCOUNT ON
IF OBJECT_ID( N'master.dbo.ChangeLog', N'U' ) IS NULL
BEGIN --if it isn't there then create it.
create table master.dbo.ChangeLog
Changelog_ID int identity primary key,
[Event Type] nvarchar(30) not null,
[When] datetime not null default getdate(),
[Database] nvarchar(128) not null,
[UserName] nvarchar(128) not null,
[SPID] nvarchar(128) not null,
[Schema] nvarchar(128) not null,
[Name] nvarchar(128) not null,
[New Name] nvarchar(128) not null,
[ObjectType] nvarchar(128) not null,
[New ObjectType] nvarchar(128) not null,
[SQL Used] nvarchar(max) not null,
NTUserName nvarchar(128) not NULL default system_user,
HostName nvarchar(128) not null default HOST_NAME(),
SessionLoginName nvarchar(128) not NULL default SUSER_NAME()
DECLARE @data XML;
SET @data = EVENTDATA();
insert into ChangeLog ([Event Type], [Database], UserName, SPID, [Schema], Name, [New Name], ObjectType, [New ObjectType], [SQL Used])
coalesce(@data.value('(/EVENT_INSTANCE/EventType)','nvarchar(30)'),'') as [Event Type],
coalesce(@data.value('(/EVENT_INSTANCE/DatabaseName)','nvarchar(128)'),'') as [Database],
coalesce(@data.value('(/EVENT_INSTANCE/UserName)','nvarchar(128)'),'') as [UserName],
coalesce(@data.value('(/EVENT_INSTANCE/SPID)','nvarchar(128)'),'') as [SPID],
coalesce(@data.value('(/EVENT_INSTANCE/SchemaName)','nvarchar(128)'),'') as [Schema],
coalesce(@data.value('(/EVENT_INSTANCE/ObjectName)','nvarchar(128)'),'') as [Name],
coalesce(@data.value('(/EVENT_INSTANCE/TargetObjectName)','nvarchar(128)'),'') as [New Name],
coalesce(@data.value('(/EVENT_INSTANCE/ObjectType)','nvarchar(128)'),'') as [ObjectType],
coalesce(@data.value('(/EVENT_INSTANCE/TargetObjectType)','nvarchar(128)'),'') as [New ObjectType],
coalesce(@data.value('(/EVENT_INSTANCE/TSQLCommand/CommandText)','nvarchar(max)'),'') as [SQL Used]
ENABLE TRIGGER [RecordDatabaseChanges] ON ALL SERVER
Ah. So we need to save the database build script in source control so that we can be sure that the application is at the same revision level as the application. I can understand the need to do that if there is only one application accessing the database. What if there are two or more? Do we save the database in two source control archives, simultaneously? I think not. What about the typical corporate systems which involve several applications accessing several databases? Into which of the applications' source control archive do the databases get stored?
So you're saying that the only way we can get a good builds is to 'version' the database, and the only way to do that is source control. We need to save the database changes along with the code changes so they can be built and deployed together. Firstly, we don't necessarily have to version the database, but it is far more important to 'version' the individual interfaces with the applications that interact with it. By saying that we need to version the entire database, I think you're really saying you want unfettered access to all the base tables of the database. This is fine for the database of your DVD collection, but how does that fit in with the security requirements of any large organization? Surely a better way is to 'version' the interface with the application. It isn't hard to 'attach a 'version' to a schema using extended properties. You don't need source control to enable you to do this.
You've got to remember that SQL Server is a relational database, and so it is good at keeping track of data, even when the data is the source of the database that you're developing. All the database objects such as tables, procedures and functions, and their source where relevant, are stored in a highly efficient relational way. We always know when they were created or last modified, what their dependencies are, and what attributes they have. A simple SQL statement will tell you a great deal about the object and its context within the database. Don't fight it, use it!
So let's sum up. By developing as a team on a development server, with a realistically-large data set and hardware, we don't have to continually build or integrate the databases because they're already there. We don't lose what we've done because we back up, save our work, and audit every change on the development server. We can even have an automated process to lodge changes in a source control system at the end of every day so as to keep management happy. We can attach version numbers to our work, at schema or database level, script out, or script-in application interfaces to suit the progress of individual applications. We can generate migration scripts between database versions by using SQL Compare between snapshots. We're happy, we're working in a multi-user way and we don't knock heads together. So why are database developers cast as the retrograde luddites of IT development?