Click here to monitor SSC
  • Av rating:
  • Total votes: 14
  • Total comments: 0
Andrew Clarke

SQL Source Control: The Development Story

22 July 2010

Often, there is a huge difference between software being easy to use, and easy to develop. When your pilot users tell you "by week three of any development project you've thrown out anything that takes time or just gets in your way", then it is time to be serious about usability, as well as quality. We decided to find out from some of those at the sharp end of the development of Red Gate's SQL Source Control.

Before SQL Source Control was released, we chatted to Stephanie Herr, the project manager for SQL Source Control, and David Simner, one of the developers, about all the work that had gone into bringing the product to reality. It was a fascinating, frank, discussion about making a software tool as simple as possible to use, the pursuit of quality, and the technical challenges that an ambitious project like this is likely to encounter and have to overcome. This is the first of two articles based on the interview

Interfacing with Management Studio. ‘The World of Pain’.

What was the most difficult part of writing SQL Source Control would you say?
Plugging into Management Studio was a great effort, but it had to be done. Management Studio is an application we've plugged into before for SQL Prompt, and also for SQL Refactor, so we knew how to do quite a lot of it, but what we didn't know how to do was the green indicators for the databases in the source control and the blue indicators that show that there is a change to this object. So the problem was in working out how to change the icons.

Also we fixed a lot of the tree, so when you do a ‘get’ through SQL Source Control we update the Object Explorer tree to show those objects that you've just got. And if you're in a script window and you do a CREATE TABLE SQL Script and press F5, then, without SQL Source Control installed, it won't show up in the Object Explorer unless you refresh it by hand, whereas with SQL Source Control installed, because we're polling the SQL Server we spot the change that this table has been created.

So we automatically update the Object Explorer for you to show that the creation of the table has happened. So plugging into the Object Explorer has been very difficult because it’s not a defined interface. It changes between different versions we support, 2005 RTM right through to SP3, 2008 RTM, SP1 and 2008 R2, so we support seven different versions. Lots of it is very ‘internal’ stuff.
So you probably know more about interfacing with SSMS than Microsoft does?
That particular bit, yes. And Reflector Pro has been awesome in letting us do this, but even with Reflector Pro letting us step through Microsoft's code it still took several months to get it right. We also had the problem with polling the database for changes. When we first did it, users commented that it had really bad performance so we optimised the scripts until it worked really fast. We started by using the default trace to track everything but soon moved over to a hybrid approach that used the System Views as well. It's gone through several different time-consuming design iterations and took several months to get perfect.
Taking the blue indicators as an example: even after getting it to work in all versions of SSMS, we had to optimise it. Then in testing and early-access usage, a whole lot of other issues came up, such as the complications that are introduced with the different ways that you can connect to a server and, with having multiple connections to the same server in the same SSMS session.
And then you have to test for a whole range of usage in SSMS: such as connecting to snapshots, to Analysis Services and Reporting Services: Because we’re hooking into the Object Explorer tree we have to make sure that we don't inadvertently break these other uses of SSMS.
Microsoft actually says ‘We don't support add ins to Management Studio’. This is quite unlike Visual Studio where third-party suppliers are actually encouraged to build add-ins. Management Studio is supposed to be ‘no add-ins’. Our initial usability studies demonstrated strongly that it was really important to be completely integrated into Management Studio. SSMS is, after all, where database developers are doing their work. They don’t want to be using two separate tools and they don't want an extra process or an extra step that isn’t already in their current way of working. If they were obliged to do so, then they wouldn't do it then because it wouldn’t be streamlined enough if it's not completely integrated.
There's a great quote from one of our users which is ‘ by week three of any development project, you've thrown out anything that takes time or just gets in your way’, so therefore it is really important for us not to get in people's way so being integrated into Management Studio was important; so that you can commit everything to Source Control in about three clicks and just not have to worry about it.
Not having to leave your working environment, not having to script anything into a file, not having to open it from a file to work on it, you just keep working on your database.
You've got to support the way that the database developer works at the moment.
Because otherwise they just won't find the tool useful and they’ll just stop using it.
They'll find ways of subverting Source Control, even if they are required to in their job to do it. Believe me, it's happened to me many times. Developers have good reasons to work in a particular way, and you may think it’s mystical but they find it really hard to have to do anything different.
We talked about how technically complex it is to integrate it into the Object Explorer and how people were saying things like ‘Don't do it! You just can't do it, it's not possible! So we actually had a challenge last summer that we opened up to all of Red Gate and we said ’This is what we need to do and if you can do this you'll win £500!’. This was the thing that we were told is technically impossible, and by the end of the same day we had two viable solutions.

One solution started with the assumption that this was just a static Windows tree view, and it was possible to change icons in tree views. The problem with that solution is that SSMS can refresh the tree view at any point. A second solution was slightly different. This involved changing Management Studio’s internal model. Management Studio knows what icon needs to go in every single place and then it basically copies that to the tree view, which then gets shown on the screen. The first solution involved changing the tree view directly and the second one was changing Management Studio's model which means that you get away from a lot of the refreshing issues and so forth. The solution we actually chose to implement was one level deeper than that, so whenever anything in that Object Explorer gets built we are the people who are responsible for building it, so therefore, whenever it is refreshing it’s us that is doing it. Most of the time, we just defer to SSMS and get SSMS to do it. We then do a second pass through the tree after that, just to update all the icons and then return control back to SSMS.

Because we are responsible for building the actual tree it's a really nice way of plugging in.

It avoids a lot of potential complications because SSMS calls our code to build the tree and we build the tree, we just happen to change the icons at the same time. If we make it sound simple, believe me it wasn’t.
Well, there must be a good reason why nobody else has actually done it.

Designing SQL Source Control

We've had a UX designer from the beginning of the project. He has done incredible designs for us. We’ve then taken all those designs we did and used them in paper prototype usability sessions with real potential users. As soon as we had something working in the product, we've had usability sessions with about five to six users, two users from TFS, two users from SubVersion, and two users from Visual SourceSafe as well. We have had them use the tool over a remote desktop connection, seeing what areas of the tool we’re struggling with and constantly improving the design and the user's experience with it.
How similar is it to the early sketches?
We did a separate technical prototype that was designed to prove the requirement that it should be integrated into Management Studio. That was a big learning factor because people just said they wouldn't buy it if it was outside SSMS, and if the tool required them to change the way they worked.

But the original designs contained the essential idea of the three tabs (Commit Changes, Get Latest, Setup) that we still have within Management Studio. This means that the current design is actually very similar to the early UX prototypes if you look back to them.

But there have been so many minor tweaks. Designs tend to be focussed on the straight path through, but there are plenty of edge cases that weren't thought about in the original designs, but which soon emerge in the extensive tests with real users. When you look at the original designs you’ll see that there were some major features that we had to cut out, but we plan to add these in later versions.

For instance, we really want to help people link their database to SQL Source Control. Currently it's just a text field and we try to populate it with your existing Tortoise SubVersion connections or your Team Foundation Server registered servers. The initial designs actually allow you to browse your repository, being able to create folders directly from within SQL Source Control but again, because of time, these nice-to-haves must wait. After all, the most important thing was to enable you to link your database easily to SQL Source Control.

It is much better to get feedback from our users to allow us to make that judgement as to which areas we should focus on. In the meantime we are concentrating on making sure that there won’t be any showstopper issues where a user is completely blocked from using the tool.

It would be great to browse the repository, but it’s a onetime setup thing for a database, so we had to prioritise that a little bit lower, even though it's really nice to have, and something that we hope to put in soon.

The pursuit of perfection?

Was there anything that you actually wanted to put in Version 1.0 that you've had to put to one side for the time being?
We’d love this product to be perfect in every way, so, obviously, there's a very long list.
(to David) I think I know what you're going to say.
There are bugs which we ‘futured’ which we should…
‘Futured’! That's a lovely way of dealing with bugs!
Yes, but there has to be a compromise. We don't really handle branching at the moment. You won't notice this to start with, but once you've been using source control for a while one of the things that, typically, you want to do is to have branches. We don't really support that at the moment, but it’ll be there soon.
Yeah, that’s a shame. Was there a technical difficulty there, or was it just an issue of time and complexity?
Just time. We have a workaround, so if users want to do it, then they can; but it isn’t elegant. The other feature is ‘red indicators’. You get the blue indicators to show you local changes, but you don't get to see remote changes in the Object Explorer tree. At the moment, you have no way of knowing that your team mate has just changed something without him telling you. We actually have that information and it would be nice to display it without you having to go to the ‘Get Latest’ tab to see it. On the ‘Get Latest’ tab we show you what objects have changed that you haven't got yet.

So one of the ideas for improving this was to poll the source control system and see if anything has been checked in, as well as polling the database for blue indicators. Then you can have another indicator in the Object Explorer tree to show you the remote changes that other people have made.
Yes. But presumably it is quite rare for database developers to both be working on the same schema within a database remotely? I have never come across the practice myself.
Yes, but this is one of the things that helps to prevent you from doing that, so if you know that your colleague has changed something, you’ll avoid changing objects in the same schema.
Can't you merge the two changes in SubVersion?
If different database developers are working on their own schemas or logical areas of the database, then these conflicts aren’t normal, but if they do happen we currently will identify these conflicts.

This conflict will occur when another database developer committed something and you make a change to that same object, but you didn't get the latest version that he committed. So in that case there’s a conflict and you only really have two options when you’re in SQL Source Control right now, and those two options are either keep your version, so keep the changes you made and overwrite what's in the source control, or take what's in the source control and overwrite your database and then you'd have to reapply your changes. One of the nice things we have is the ability to view the differences at the bottom of the commit or get latest tab, so you can see exactly where the conflicts are.

So for simple changes, let's say somebody added another column to a table, but you made a whole bunch of changes to that table, you might want to commit yours and then go back and re-add the column, or you could actually just update your database to contain the name of that column that is in the source control system. That way we don't think it's a conflict anymore because you already have that field that somebody else committed to source control.
Can't you just cut and paste from their saved version In the diff panel?
Sure, you can copy out, but then you have to go into your table designer or query window to actually execute the DDL to do the edit.
Of course, yes.
So you can't apply the edit within the SQL Source Control tab currently. But if there is a really complex merge, you could do that outside of the tool using SubVersion’s merge or a Team Foundation Server's merge feature. You'd have to be really careful because it is so easy at that point to produce invalid SQL syntax. We’re not going to be able to parse that and then you could run into other issues.
Ah, is the problem there because a general-purpose merge feature doesn't know anything about SQL syntax?
SQL Source Control is pretty good at giving you an error if it fails to parse the SQL, so it won't fail silently, you will at least know it's gone wrong, it will give you the filename that's gone wrong, it will say the line that has gone wrong so you can just go in to fix it yourself afterwards, but it's generally not ideal.
How do you get over the difficulty that it's only a pseudo-compilation? When you alter a stored procedure it doesn't actually compile it until runtime. This can cause problems if, for example, a table it's accessing has changed in the meantime. It won't flag that until the first time it runs, so it won't tell you that there is an error in the stored procedure even though it’s been successfully checked in to source control.
Yes. We deal with that by tracking dependencies. So let's say you've made a change to a stored procedure to reference a new table, so when you go to the commit tab and you want to check that stored procedure into source control, we will use SQL Compare’s dependency parsing feature. We’ll look at that stored procedure and tell you that it depends on this table and recommend that you check that table in too. So the idea is that you are encouraged to check the lot into source control and then when someone gets the stored procedure we do the same check and say you probably want to be get that table from source control as well, because the stored procedure relies on this table to compile and run.

One of the hardest parts of applying source control to database work is that source control is generic, and deals with text. SQL Source Control understands that we are dealing with SQL and it's not just a text file. This enables us to prevent an awful lot of that type of conflict by tracking dependencies properly, By default the check box to do this is ticked, but if you want to un-tick it, you can.
Yes, I think most people will leave that option checked by default.

When you are committing your changes you want to commit all the changes together as one logical unit. In your example, if the stored procedure relies on the table they should be checked in together so that the user can provide a comment at the same time about why they made these changes. Was it done in order to fix an issue? Was it to add a new feature?

The design of the application lets us be pretty confident that what is in source control will provide a build that works. If you're changing your stored procedure, it is important that you make sure you know all the things it's referencing,  and that’s helped by one of our other tools, SQL Search, which is a great tool that has also been added into Management Studio. And it's free!

One of the new features in SQL Prompt 5.0  is all about finding invalid objects. By this, we mean objects that were valid at one point but then, because of re-factoring the database, changing the database, or updating it, it no longer references the right things. We're hoping, with tools like this,  to try to help developers to catch that sooner in development than when, in the test environment, you’re executing the stored procedure and it fails to execute!
One of the traditional complaints database developers have is that source control systems don’t allow sandboxing. Database developers will want to try things out with an existing dataset, without restrictions, and without any wish to save the results for posterity.

Imagine that you have your dev database and you're trying out all sorts of wild and wacky things. You want to just be able to say ‘look, don't record this, just ignore this’. Can you do that at the moment? Or if not can you envisage how you will accommodate this feature?
If you are trying things out, then you can make all the changes you want, but not commit these changes to source control. Then you can undo these changes to get back to a good state. It would only become an issue if you're trying to work on two things in parallel. If, for example, you’re trying something out for one thing that you’re working on, whilst at the same time working on something else, then you are going to be coming to our commit tab to check the something else in, but you don't want to check the other bit of work that you’re working on, and in that case, you would need to uncheck the changes you're trying out so that they don't get commited.

The commit tab will list all of the objects, there's no way of specifying ‘ignore all of this for now and select the other objects by default’. But if you're only working on one thing at a time then you won't be coming to the commit tab until you've finished and at the point at which you finish then all those temporary objects can just go away.
I think that will fit in with the way that most people work. Often, they’re attempting to re-factor something and trying various solutions in order to see which one works fastest or takes less IO and then they want to use that.
Well yeah, this is one of the things we've heard back from users about. They are suggesting that our commit tab could do with some kind of filtering or grouping or excluding or whatever.
Well, the obvious way to do it is to separate the database into logical development-areas by schema.
At the top of the commit tab is a list of all the objects you've changed in your database and need to commit to SQL Source Control.

And we list the type of change, the name of the object, the type of the object, and what schema it’s in. All those columns are sortable, so that, if you click on the column, it can arrange the order by schema. There’s a tick box at the top of the table that unselects everything, and then you can highlight just those things in that schema that you want to preserve, and check them in one go. In this way, you can limit what you are committing but again, as David said, it's not remembering this selection for the next time. This means that, if you are only working within a particular schema, then every time you go to that commit tab you’d have to sort by that schema and then unselect everything. It isn’t as slick as we’d like but it you can still do it.
My feeling is that is probably enough until you've got feedback from the way people are actually using it.
An advantage with this approach is that it encourages you to check everything in. I’m like most developers, in that I assume that the source control server is fully backed up but the developer’s desktop is not. Therefore, I want to be encouraged to check everything in because once it is checked in, it gets backed up. If my machine dies then it's all fine. There are lots of good reasons for encouraging people to commit everything, but I’d agree that it's not ideal for everybody.
It isn’t ideal, but we need to be clear about the ramifications before we provide the capability to filter that commit list. Maybe if your tables don't change a lot and it's usually your stored procedures that you're working on, then you’d probably just want to source control your stored procedures, so maybe we should just display certain objects on the commit list. We also have talked about putting in other filtering devices to allow you to exclude all those objects that start with ‘tmp’ and therefore that way those objects would never show up in your commit list so you don't have to worry about unselecting them. There's some applications that create temporary tables and those aren't things that you want to source control, you want to source control your main tables that you're working on.

So it's definitely something we want to provide. Version 1.0 is just focused on robust, reliable, source control of your schema objects. We want to make sure that the process of getting them into source control and out again is absolutely right. We also provide the facility to undo your changes in Version 1.0. But there's a lot more that we want to do and we have plans to continue on after Version 1.0.
It is better to wait until you've assimilated all the feedback from the people using Version 1.0, because before then you can make lots of guesses as to the way people work, and they won’t all be right.

In the next article, David and Stephanie go into more detail about the ways that SQL Source Control can accommodate different methods of database development work. They explain the ramifications of SQL Source Control's support for different database development methodologies such as the use of local, or shared, databases.  The next article goes into more detail about the product's support for SubVersion, and how it can fit in with existing configurations of Subversion.

Andrew Clarke

Author profile:

Andrew Clarke has been developing software applications for over 35 years. He is a database expert, but also has had many years of experience in designing and programming applications and software tools. He has a particular interest in website-publishing and ECommerce. Before joining Redgate, he worked with the first two winners of the 'Apprentice' program at Amstrad, creating various business applications, He is the editor of Simple-Talk.

Search for other articles by Andrew Clarke

Rate this article:   Avg rating: from a total of 14 votes.





Must read
Have Your Say
Do you have an opinion on this article? Then add your comment below:
You must be logged in to post to this forum

Click here to log in.
Simple-Talk Database Delivery

Patterns & Practices Library

Visit our patterns and practices library to learn more about database lifecycle management.

Find out how to automate the process of building, testing and deploying your database changes to reduce risk and make rapid releases possible.

Get started

Phil Factor
How to Build and Deploy a Database from Object-Level Source in a VCS

It is easy for someone who is developing a database to shrug and say 'if only my budget would extend to buying fancy... Read more...

 View the blog

Top Rated

Clone, Sweet Clone: Database Provisioning Made Easy?
 One of the difficulties of designing a completely different type of development tool such as SQL Clone... Read more...

Database Lifecycle Management: Deployment and Release
 So often, the unexpected delays in delivering database code are more likely to happen after the... Read more...

SQL Server Security Audit Basics
 SQL Server Server Audit has grown in functionality over the years but it can be tricky to maintain and... Read more...

The PoSh DBA: Assigning Data to Variables Via PowerShell Common Parameters
 Sometimes, it is the small improvements in a language that can make a real difference. PowerShell is... Read more...

Issue Tracking for Databases
 Any database development project will be hard to manage without a system for reporting bugs in the... Read more...

Most Viewed

Beginning SQL Server 2005 Reporting Services Part 1
 Steve Joubert begins an in-depth tour of SQL Server 2005 Reporting Services with a step-by-step guide... Read more...

Ten Common Database Design Mistakes
 If database design is done right, then the development, deployment and subsequent performance in... Read more...

Temporary Tables in SQL Server
 Temporary tables are used by every DB developer, but they're not likely to be too adventurous with... Read more...

Concatenating Row Values in Transact-SQL
 It is an interesting problem in Transact SQL, for which there are a number of solutions and... Read more...

SQL Server Index Basics
 Given the fundamental importance of indexes in databases, it always comes as a surprise how often the... Read more...

Why Join

Over 400,000 Microsoft professionals subscribe to the Simple-Talk technical journal. Join today, it's fast, simple, free and secure.