András Belokosztolszki

Source Control and Databases

29 May 2007

Andras's article on Source Control shows a lot of the thinking among the developers at Red Gate at that time that eventually crystallized into SQL Source Control. It identified several problems that had to be overcome before the task could be done properly. Andras's suggestions for the use of SQL Compare are now built into SQL Source Control, but the article remains a fascinating explanation of the complications along the way.

Many DBAs must have made and heard the request for a source control mechanism for SQL Server databases. It sounds like a brilliant idea. Once implemented, one could easily go back to a specific database schema version, compare it against the current version, and have plenty of information for auditing, or for reverting a database object to the previous version.

But how does one even start introducing source control for a SQL Server database? Such a system has particular challenges to overcome, depending on which of the following database development techniques you adopt:

  1. Storing the database schema in the database
  2. Storing the database schema as SQL creation statements in files which, when executed, build up the database.

This article compares and contrasts these two development scenarios and then assesses some source control and team development challenges, as they apply to each.

The majority of available source control systems work with files, and consequently do not work with SQL Server databases directly. The database schema can be stored in files for the purpose of source control, but the need to translate schema objects to files then introduces the question: is the schema in the database the same as the one represented in the files?

Storing the schema in the database

Microsoft SQL Server DBAs are spoilt for choice in terms of graphical tools that allow the creation of a database, a table, and any other database schema object. Management Studio makes database development accessible to everyone: with a few mouse clicks and some drag-and-drop one can easily create a database, a few tables, and other dependent objects. Figure 1 shows a screenshot of the view designer, which aids the creation of views in a visual way.

When using the GUI to add database objects to the database, no SQL statements are required from the user. These statements are generated by the GUI and discarded after their execution. While this makes the process very simple, it effectively leaves the user with nothing to check into source control. One can use the scripting functionality of the GUI tools to generate the SQL statements, but identifying and replacing the scripts for altered database objects can become a very messy task.

Once a database schema is produced in this way, it can be compared, using schema comparison tools, with other development databases. All of the differences can be merged into a single staging or reference database. Regular backups would ensure that it is possible to go back to previous versions, and usually offline documentation would keep a more or less accurate history of who has done what and why.

NOTE: An alternative to offline documentation would be to use extended properties on the database objects. However, many of the graphical tools lose such extended properties when an object is altered.
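As a minimal sketch of this alternative, one could record the purpose of a column as an extended property. This assumes the dbo.Employees table that appears later in this article; MS_Description is the property name that most Microsoft tools display (on SQL Server 2000, the level0 type would be N'USER' rather than N'SCHEMA'):

EXEC sp_addextendedproperty
    @name = N'MS_Description',
    @value = N'The ID that we use in HR',
    @level0type = N'SCHEMA', @level0name = N'dbo',
    @level1type = N'TABLE',  @level1name = N'Employees',
    @level2type = N'COLUMN', @level2name = N'EmployeeID';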

Source control is possible using generated creation scripts, or third-party tools that can create a snapshot of the schema, but this approach might not be very granular.

In this development approach the database schema evolves in the database. Unfortunately, if not done carefully, it may evolve into something horrible. It is likely to contain "stale" objects: stored procedures whose purpose no one knows, or stored procedures that do the same thing as ten others. But since any object in such a database is a "public API", heaven knows who is depending on them.

Overall, this development approach allows for rapid prototyping without significant knowledge of SQL, and it's perhaps no surprise that Management Studio is a popular "database design/development" tool for many. The approach works well with Agile development and, with proper unit testing, refactoring and self-discipline, it can prove a very powerful means of development. It is, however, a very different approach from an almost idealised methodology, in which a database schema is properly designed and, after the design is final, the applications are built on this database design. Many developers find it hard to reconcile this "ideal methodology" with the need to accommodate seemingly ever-changing requirements.

Storing the schema in creation SQL files

The other popular database development technique stores the schema in creation scripts. Generally, for every database schema object there is a file that contains the SQL commands that will create that object. These files are the basis for development and for source control.

The development process is similar to the approach taken when using the database as the home for the schema. The important difference is that one does not make changes by executing SQL on a live database instance, but instead updates a set of SQL scripts. For every database schema modification the creation SQL files are modified and, once they are deemed ready for deployment, they are executed to create a reference database. This database can be compared against a live database, and then the migration/deployment stage starts.

The main reason for the popularity of this approach is that each creation SQL file can be stored in a source control system. In this way the history of every modification to the database schema is available. Using the locking mechanism of source control, large development teams can work on the same database in parallel. This approach to database development is therefore more popular with larger organizations, where development is performed by a larger team that needs to cooperate.

Source control and team development challenges

Both of the above approaches to database development are popular, although they are generally used in different types of organizations. The fact that there are at least two approaches introduces some problems, because sometimes the two are mixed. It is obviously faster to fix a database performance problem by changing an index on the live database, even though the schema might be stored in creation SQL files. Of course it would be better to make the change on the script side first. But is this true if it would have meant an hour of extra time at 2 a.m.?

Personally, I prefer to store the database schema in creation SQL files, but both of these approaches undoubtedly have shortcomings with regard to ease of use, tracking changes in source control, and team development.

One of the challenges we face is that the source control system stores files, but what we want to see in source control is not the files but the database schema itself. In SQL Server, the schema is a set of objects that may have different textual representations. If one uses creation SQL files, it is therefore difficult to see whether these files actually represent the schema in the database. Equally, given the database schema, there are many different ways to generate the files that are stored in the source control system.

Textual and non-textual database objects

Textual database objects, such as stored procedures, views and functions, are stored by SQL Server as meta information and as text. For example, consider the following function:

/* addTwo function, 2007-05-15 */
CREATE FUNCTION addTwo ( @a INT, @b INT )
RETURNS INT
AS BEGIN
    RETURN @a + @b -- The sum of the two numbers
    END

SQL Server will store the above text, with all the whitespace and comments, in a system table. You can view this definition by calling sp_helptext, by querying the syscomments system table on SQL Server 2000, or by querying the sys.sql_modules system view on SQL Server 2005.
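For instance, either of the following (a minimal sketch, assuming the addTwo function above was created in the dbo schema) returns the stored text verbatim:

EXEC sp_helptext 'dbo.addTwo'

-- SQL Server 2005 only
SELECT definition
FROM sys.sql_modules
WHERE object_id = OBJECT_ID('dbo.addTwo')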

In addition to this textual definition, SQL Server will store meta information about the function, such as the parameters it accepts, its name, and the owning schema or user, in other system tables. Some of these are accessible via sys.objects or sysobjects, sys.parameters or syscolumns, and so on.
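As a quick sketch of what that meta information looks like on SQL Server 2005, again assuming the addTwo function from above:

SELECT o.name AS object_name,
       p.name AS parameter_name,
       TYPE_NAME(p.user_type_id) AS parameter_type
FROM sys.objects AS o
JOIN sys.parameters AS p ON p.object_id = o.object_id
WHERE o.object_id = OBJECT_ID('dbo.addTwo')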

Unfortunately, SQL Server is not perfect, and the more modifications one makes to such textual objects, the more likely it is that the system tables that contain the meta information will get out of sync with the object definition. This inconsistency in system tables is an unfortunate side effect of using the database as the storage location of the schema. It is not a problem if the schema is stored in scripts and created every time, in the right order, without alteration statements.

Nevertheless, both schema storage approaches described in this article benefit from the fact that textual database objects are stored by SQL Server exactly as presented to it. Such database objects can be retrieved from the database verbatim, and can therefore be compared with the files that are stored in the source control system. This also works the other way around: if one needs to know whether such a textual object in source control is the same as the database version, one can once again just do a textual comparison.

The advantage of SQL Server storing such textual objects without modification is that formatting and comments are preserved, and can be retrieved from the database schema. These textual objects, particularly stored procedures, functions and views may contain complex business logic that needs to be quickly understood by people who want to make changes to them. One way to aid this process is to format the code in a readable way, follow consistent variable naming and add plenty of comments. Many such aids are even standardised within organizations.

Sadly, efforts to make creation scripts readable and compliant with organizational policies cannot always be preserved by SQL Server. Non-textual database objects, for example tables, are stored as meta information only. Comments and formatting for such objects are not preserved in the database. For example, one may have a script like this:

CREATE TABLE dbo.Employees
    (
      EmployeeID int IDENTITY(1, 1) -- The ID that we use in HR
                     NOT NULL
    , LastName nvarchar(20) NOT NULL
    , FirstName nvarchar(10) NOT NULL
    , Title nvarchar(30) NULL
    )

This code contains an important comment that gives information about the EmployeeID. Also, every column is specified on a new line, with the comma at the beginning of the line. If this statement is executed, the formatting information and the important comment are not stored anywhere in the database schema. If there is a need to preserve this information, one has to store it outside the database. This can become tricky if one uses a GUI for certain database modifications. If you add a new column to a table via the GUI, then the table creation script will need to be updated; if you then script the altered database table via the GUI, it will not contain the comments.

This makes non-textual objects less ideal for source control systems, because it is not obvious whether the script file in the source control system represents the same object that is in the database schema. If one scripts the database schema object from the database, the textual representation of the object could be significantly different from the script. For the above table, SQL Server's Management Studio will create the following:

CREATE TABLE [dbo].[Employees](
      [EmployeeID] [int] IDENTITY(1,1) NOT NULL,
      [LastName] [nvarchar](20) NOT NULL,
      [FirstName] [nvarchar](10) NOT NULL,
      [Title] [nvarchar](30) NULL
) ON [PRIMARY]

Are the two tables the same?

The alternative is to create a toy database, execute the object creation script from the source control system, and use third-party tools to compare the two database objects.

Processed database objects

Some objects are processed by SQL Server. In other words, the creation SQL bears no resemblance to what SQL Server stores internally. For example, check constraints and DRI defaults are often completely rewritten internally.

For example, if a DRI default is set to "2", SQL Server 2005 will turn it into "((2))". This is not a major change, but it does differ from the behaviour in SQL Server 2000. Some examples include the following:

Input              SQL Server 2000     SQL Server 2005
2                  (2)                 ((2))
1+2+3              (1+2+3)             (((1)+(2))+3)
cast(3 as int)     (convert(int,3))    (CONVERT([int],(3),0))

This is not a problem when the schema is stored in the database. But if one uses scripts, one would expect to be able to validate that what is in the scripts is the same as what is in the database. One will certainly start to wonder whether the scripts are in sync with the database on finding a cast statement in the creation script and a convert statement in the system tables. Some of the rewrites are even more difficult to compare. For example, in SQL Server 2005, an IN statement is usually rewritten as a disjunction, with a lot of extra parentheses.

When one is using creation scripts to store the database schema, it may be worthwhile to create a DRI default or check constraint in SQL Server first, and then copy the processed form of it to the creation scripts. In this case the creation script will resemble the scripts that can be created by GUIs from the database, making it easier to see differences between the database and the creation scripts.
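As a minimal sketch of retrieving the processed form on SQL Server 2005 (the table and constraint names here are illustrative):

CREATE TABLE dbo.DefaultDemo
    ( Flag int CONSTRAINT DF_DefaultDemo_Flag DEFAULT 2 )

-- Returns ((2)), the processed form, rather than the 2 we wrote
SELECT definition
FROM sys.default_constraints
WHERE name = 'DF_DefaultDemo_Flag'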

Renaming objects

Sometimes, database objects need to be renamed. A stored procedure may need to be extended to include more functionality, or a stored procedure that is no longer required will need to be replaced with a new one.

Renaming database objects is rather simple to do in creation scripts, since it involves nothing more than a search and replace operation. But with the database as the schema source, there are a few things to watch out for. The sp_rename procedure, or the Management Studio object rename, will not do a good job. First of all, it will not modify all the system objects: it will update the sys.objects or sysobjects table, but not syscomments or sys.sql_modules.

NOTE: For more details see my blog, A reason to avoid sp_rename.

You should also be aware that if you generate a creation script for a renamed stored procedure using Query Analyzer or sp_helptext, then the generated script will not create the stored procedure that is in the database. It will create one that has the original name.
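A minimal sketch of the problem, using the addTwo function from earlier (the new name is illustrative):

EXEC sp_rename 'dbo.addTwo', 'addTwoNumbers'

-- The retrieved text still begins CREATE FUNCTION addTwo ...
EXEC sp_helptext 'dbo.addTwoNumbers'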

Furthermore, neither sp_rename nor Management Studio will rename the references to this object. This is one of the shortcomings we sought to overcome with the object renaming feature of SQL Refactor.

When one tries to check whether the source control version of a renamed stored procedure is the same as the one in the database, one may be surprised to find that, while the textual representation of the stored procedure is the same in both the source control system and the live database, the name by which the stored procedure is accessed in the database is different.

If one is altering the schema in the database and using source control by scripting the database objects, then renaming an object also means updating the corresponding script file. Usually these script files are named after the contained database object, so a renamed database object would be scripted to a file that reflects its new name. It is important to remove the old version of a renamed object by deleting the corresponding old script file.

Constraint names

Foreign keys, DRI defaults, primary keys, indexes and check constraints are parts of a table definition. They are also individual database objects, and can be referenced via a name. This name can be specified by the user when the object is created. For example, the following statement will create a foreign key named FK_Employees_Employees:

ALTER TABLE [dbo].[Employees] WITH NOCHECK
      ADD CONSTRAINT [FK_Employees_Employees] FOREIGN KEY ([ReportsTo])
        REFERENCES [dbo].[Employees] ([EmployeeID])

However, if the name for the foreign key is not specified, a name will be generated automatically. For example, if one executes:

ALTER TABLE [dbo].[Employees] WITH NOCHECK
      ADD FOREIGN KEY ([ReportsTo])
        REFERENCES [dbo].[Employees] ([EmployeeID])

This will create a foreign key with a generated name, such as FK__Employees__Repor__0CBAE877. This name is different every time the script is executed. If one is using creation scripts to store the schema, it is good practice to name such constraints explicitly, rather than spending time matching such objects based on their content when exploring differences between the scripts and the live database.
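On SQL Server 2005 one can, as a rough sketch, list the foreign keys whose names were auto-generated and are therefore liable to differ between builds:

SELECT name
FROM sys.foreign_keys
WHERE is_system_named = 1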

Object dependencies

Stored procedures may depend on other stored procedures; views depend on tables. For tracking dependencies, some people make use of the sysdepends table in SQL Server. This table stores dependency information for some database objects. For example, if a stored procedure A calls stored procedure B, one would expect an entry to reflect this in sysdepends.

In practice, the entry is sometimes there and sometimes not. If stored procedure B was created before stored procedure A, then the entry will be there. However, the sysdepends table can easily get out of sync with reality: it is enough to alter a stored procedure, and the dependencies are not updated. The same applies to the SQL Server 2005 equivalent, sys.sql_dependencies.
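As a sketch, this is how one would ask SQL Server 2005 what it currently records as the dependencies of a stored procedure (dbo.A is illustrative):

SELECT DISTINCT OBJECT_NAME(referenced_major_id) AS depends_on
FROM sys.sql_dependencies
WHERE object_id = OBJECT_ID('dbo.A')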

Having all the database objects in textual form in creation scripts enables one to use text operations: find, search and replace, and so on. This reduces finding a referencing stored procedure to a simple text search, and one does not need to rely on possibly inconsistent sysdepends tables.

However, if your schema is stored in the database, one way to avoid corrupting the sysdepends table is to avoid altering textual database objects. For example, instead of altering a stored procedure with an ALTER statement, one can drop the old stored procedure and create the new one. This ensures that the sysdepends table is updated. However, dropping and recreating a stored procedure deletes all the extended properties and permissions on that object, so these will need to be set again. Also, it is important to create all referenced objects first, since when the sysdepends table is modified, the referenced objects must exist.
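A minimal sketch of the drop-and-recreate pattern on SQL Server 2005 (dbo.A and dbo.B are illustrative; permissions and extended properties would need to be re-applied afterwards):

IF OBJECT_ID('dbo.A', 'P') IS NOT NULL
    DROP PROCEDURE dbo.A
GO
-- dbo.B must already exist at this point for the dependency to be recorded
CREATE PROCEDURE dbo.A
AS
    EXEC dbo.B
GO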

Often, especially when using schema binding, one must execute the scripts that create the database in a specific order. This is a common problem with source control for databases because, for granularity reasons, it is good practice to use a single file for each database object; however, to create the database from the SQL files one needs to create and execute a single script.
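Schema binding is the clearest example of this ordering constraint. A view such as the following (a sketch against the Employees table from earlier) simply cannot be created before the table it references exists:

CREATE VIEW dbo.vEmployeeNames
WITH SCHEMABINDING
AS
    SELECT EmployeeID, LastName, FirstName
    FROM dbo.Employees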

One solution I've seen in practice was to concatenate all the scripts, execute them, look at the error messages, reorder, execute, look at the error messages, reorder, execute, and so on. Clearly this is a horrible solution and, of course, it will very likely mess up the sysdepends table.

Luckily, there are tools that can read in creation scripts and sort them into the right dependency order automatically. This allows the creation of databases that have a consistent sysdepends table, and will help other DBAs in exploring the database. For more details see How to order creation SQL files.

Tools that help to address these issues

It is not the intent of this article to discuss the nuts and bolts of implementing a source control system for SQL Server. However, it is worth noting briefly that there are a couple of tools that will help to address the issues involved in using source control with SQL Server databases.

Editor's note: this list is now out of date, as much of the thinking Andras expresses in this article is now embodied in SQL Source Control.

Microsoft Visual Studio Team Edition for Database Professionals (MSVSTEDBP) is biased towards storing the schema in creation scripts. Although it is a pricey solution, it contains many powerful means of aiding the database development process. In MSVSTEDBP one works on SQL scripts and can unit test the database schema, but the tool also imposes restrictions on the source control systems that can be used for the database scripts.

Another tool, available at the end of June 2007, is Red Gate's SQL Compare 6.0. Its primary purpose is to compare and synchronize two database schemas. The major addition in version 6.0 is that a set of creation scripts can now act as a database source. These scripts can be created, read in as if they were a database, and synchronized towards a live database as well as back from a live database to the script files. Also, when SQL Compare reads in scripts from files, it can synchronize partial changes to another data source, such as a live database. It will do this by creating a script that is ordered based on the dependencies between the database objects.

Summary

Setting up source control for a SQL Server database can be tricky. There are two main approaches to database development, and both can be adapted for use with source control. However, neither approach is perfect, and there are issues one must look out for to ensure smooth source control operation. The main challenge is that in order to use source control one needs to use files, but it is not a trivial task to convert the database schema to files that can easily be compared back to the database schema itself. In this article I explored some of the associated problems, such as preserving documentation and formatting for textual and non-textual objects, working with processed database objects like check constraints, and coping with inconsistent system tables.

András Belokosztolszki

Author profile:

András Belokosztolszki is a software architect at Red Gate Software Ltd. He is a frequent speaker at many UK user groups and events (VBUG, NxtGen, Developer's Group, SQLBits). He is primarily interested in database internals and database change management. At Red Gate he has designed and led the development of many database tools that compare database schemata and enable source control for databases (SQL Compare versions 4 to 7), refactor databases (SQL Refactor) and show the history of databases by analyzing the transaction log (SQL Log Rescue). András has a PhD from Cambridge and an MSc and BSc from ELTE, Hungary. He is also an MCSD and MCPD Enterprise. See his blogs on Simple-Talk.



Subject: so
Posted by: Efrain (not signed in)
Posted on: Wednesday, May 30, 2007 at 1:07 PM
Message: So, will SQL Compare 6.0 be a panacea for database source control?

Subject: Good Job
Posted by: Granted (view profile)
Posted on: Wednesday, May 30, 2007 at 1:51 PM
Message: Nice summary of the problem. If you have that well defined the solution you're working on should be pretty good. I can't wait to see it.

Subject: What a Punchline
Posted by: Watchdog (not signed in)
Posted on: Thursday, May 31, 2007 at 1:43 AM
Message: Well done for waiting until the last few paragraphs before tying it up nicely: "Buy Red Gate SQL Compare 6.0". There's nothing like a bit of independent commentary and this was nothing like a bit of........

At least the blog entitled "SQL Compare 6.0 beta - read and sync to and from SQL scripts" is more honest: it just repeats the phrase "SQL Compare 6.0" many times in a short space. Then a bunch of simple-talk hangers-on start a "spontaneous" conversation about the product.

All very disappointing. No credibility.

Subject: Source Control and Databases
Posted by: The Real G'Day! (not signed in)
Posted on: Thursday, May 31, 2007 at 1:43 AM
Message: This is an excellent overview of working with databases in source control. I wonder why you didn't mention the use of a fully automated change control tool, such as DB Ghost?

Subject: Good Job
Posted by: Tom Powell (not signed in)
Posted on: Thursday, May 31, 2007 at 7:59 AM
Message: I was particularly interested since I'm at a client site where we have both Redgate tools and MS TFS. I just completed putting the DB I'm responsible for into TFS and am very impressed with the ability to manage all source control, scripting, and deployment from the Visual Studio interface. I guess the big bucks get you something. It will be interesting to see how your product compares.

BTW, to those complaining because Andras is in favor of his own product, get real! He spent (I assume) several years of his life and plenty of hard work on it -- he should be proud of it. I think he gave a good overview and he even pointed out his competition first. In case you didn't notice the top of this page says "Simple Talk is a service from Red Gate Software". It's not like he tried to keep his interest a secret. Lighten up...

Subject: Open source tool for generating scripts
Posted by: Jesse (not signed in)
Posted on: Thursday, May 31, 2007 at 8:40 AM
Message: I wrote a cmd line tool hosted on codeplex that will generate scripts for all objects in any 2000 or 2005 database using SMO. It generates a separate file for each object, in a directory tree structure similar to Object Explorer in SSMS.

Best of all it's open source, so you can tweak it if you want.

I wrote it just for this purpose: to get all the databases in my current org under source control.

Check it out here: http://www.codeplex.com/scriptdb

hope it's useful to some of you!

Subject: Interesting Article
Posted by: Dave (view profile)
Posted on: Thursday, May 31, 2007 at 8:43 AM
Message: Source Control for databases is an issue we have encountered but not really resolved or even tried to resolve. We are therefore interested in an efficient, effective and relatively easy to use solution. We'll wait and see what SQL Compare 6.0 has to offer as we feel that we might be storing up potential troubles on a variety of fronts by not implementing a solution.

About Andras being in favour of his own product and "Simple Talk is a service from Red Gate Software": that seems fair enough, until you see the "about" at the top of the page, which says "Independent, accurate technical content for SQL Server and .NET professionals, all peer-reviewed and professionally edited to the highest quality". I don't think you can be independent and at the same time recommend products from the people who own/run the website, who just happen to be your employers. Sorry guys.

Subject: Dave
Posted by: Tom Powell (not signed in)
Posted on: Thursday, May 31, 2007 at 9:24 AM
Message: My point is simply that there doesn't appear to be any attempt to deceive. It's not wrong to be in favor of your own product. Did you feel he was unfairly biased or that he did not provide quality information?

BTW, I am not affiliated with Redgate in any way. As a matter of fact I am a big MS TFS fan, but I think Andras gave a fair and reasonable overview of the issues involved.

I realize the TFS price tag is ridiculous but I think they have a 180-day trial, you might find it worth playing with. I am a recent convert but I'm pretty impressed.

Subject: Source Control
Posted by: Steve (view profile)
Posted on: Thursday, May 31, 2007 at 9:48 AM
Message: ...missed a parenthesis ([ReportsTo]

Nice work on a subject that gets little discussion.

Subject: Punchline, independence etc.
Posted by: Tony Davis (view profile)
Posted on: Thursday, May 31, 2007 at 10:13 AM
Message: The article was an assessment of the problems associated with source control for databases. It ended with a brief mention of the two main products that may solve these issues for you. Yes, one was a Red Gate tool but DB-Pro was cited as well.

As editor of the site I do have editorial independence. I am never told what I should and should not publish. But yes, the site is run by Red Gate and I work for Red Gate so I do take Dave's point.

I want Simple-Talk to be a useful technical resource for the community and I try to make every article I run on the site technically interesting. If you take a look around the site, I think you'll find 90+% of the articles don't mention Red Gate at all. But, yes, there are some that do feature RG tools.

If people think the RG association should be clearer, or we should find a way of more clearly distinguishing articles that feature a RG tool, then I'll definitely consider that feedback seriously.

Cheers,

Tony.



Subject: SQL Server Source Control
Posted by: GSquared (view profile)
Posted on: Thursday, May 31, 2007 at 10:18 AM
Message: I don't have a total solution for database source control, but what I do does seem to work well enough.

When we start a new development project, as simple as a new web page or as complex as a new product line, I create a new virtual server with a copy of all production databases affected by the project.

Changes needed for that project are then made on that server only.

When a project is ready for check-in, I run comparisons between that database and the production database: both structure comparisons, and data comparisons that are directly related to the project being checked in. (For example, new columns added to existing tables will obviously need to be synchronized when the changes go live.)

(I use ApexSQL Diff for the comparisons, not the Red Gate product, but that's because I have a "lifetime free upgrades license" for the Apex product. I haven't used the Red Gate product, but I'm sure it would get the same/similar results.)

Then the database(s) involved is/are compared to other branch databases in the same manner, to make sure there aren't any conflicts between them (wouldn't want project A to go live, and then have one of its procs overwritten a month later when project B goes live). Conflicts are resolved, and changes can be made to both dev databases as needed.

From all this, a script is created that can be used to make all needed changes to the live database.

A fresh copy is made of the production database(s) affected, and the script is run in those copies to make sure the script runs without error. Tests are done on those copies to make sure that nothing will be broken by running the script, and that all desired results of the script will work. Web pages are tested on that copy of the database, etc.

Once we can run the script on a fresh copy of the production database(s) without error and with everything fully functional and nothing broken, then we schedule a back-up of the production database(s) affected, and run the script immediately after the backup is done.

It sounds complex, but in many cases, simple changes can be made in under an hour, guaranteeing that they don't break anything in the production database(s) and that everything works as desired. No down-time, no "oops, dropping that proc broke three live pages", no "what do you mean that column was still in use till we dropped it and lost all the data". It's pretty smooth.

It allows us to keep multiple projects going without having to worry that projects X and Y both need changes to the database, and that said changes might be conflicting.

Not perfect, mainly because it involves a lot of overhead for tiny projects sometimes, but it works really well for the major releases (new product lines, etc.).

Subject: Addendum
Posted by: GSquared (view profile)
Posted on: Thursday, May 31, 2007 at 10:25 AM
Message: It also helps that the largest database I have to deal with is less than 200 Gig, so having multiple copies in multiple virtual servers isn't that big a deal. With multi-terabyte databases, I'm not so sure this would be a workable solution. I might have to work with structurally identical dev databases with most of the actual data removed, which would give slightly less certainty of catching errors, but wouldn't require a server farm for each dev project.

Subject: GSquared Comments
Posted by: Tom Powell (not signed in)
Posted on: Thursday, May 31, 2007 at 10:40 AM
Message: Do you use a source control system in your process? If so, exactly what is stored? Or do you keep a copy of the virtual as a "version"?

Sorry, I don't mean to be dense but it happens sometimes...

Subject: No Big Bucks, No Data Dude
Posted by: Randy In Marin (view profile)
Posted on: Thursday, May 31, 2007 at 1:41 PM
Message: We can't afford VSTS DB Pro. We do want to keep the database schema in source control. However, an object like a table, with both structure and data, is a little more difficult to handle. Migrations include both alter and update scripts. Is the point to keep just the structure in source control and use a compare tool to help with migration? Is there value in storing the alter and update scripts from each migration in source control?

Subject: Source Control Branching
Posted by: Chuck (view profile)
Posted on: Thursday, May 31, 2007 at 2:22 PM
Message: One of the biggest problems we have is that we are forced to use Visual SourceSafe for our source control software. Effective and safe branching is the biggest challenge we face: how to branch, how to roll back to a specific build, make fixes, then apply those fixes to the current development and test builds. Very challenging. If anyone out there is doing this well with VSS, I'd like to hear how you are doing it.

Also, what about the issue of data? How do you roll back to a specific build of the database if the data structures have changed, and the current data conforms to the current structure and not the old structure?

Thanks

Subject: Jesse's scriptdb tool
Posted by: Travis (not signed in)
Posted on: Thursday, May 31, 2007 at 3:19 PM
Message: Database modification/evolution has definitely been a challenge. For me, it is like rebuilding an entire house, foundation and all, with the people still living in it. Luckily, the people are very cooperative ;)

I appreciate seeing Jesse's comments on this list and I have to vouch for his tool... the issues that are addressed in the article have been well met in my development approach by starting with Jesse's tool, scriptdb [http://www.codeplex.com/scriptdb], and importing the data into Subversion [http://subversion.tigris.org]. Then, modifying Jesse's setup to our needs has allowed a script-based development cycle of alter/deploy that is both agile and acts more along the lines of compilation than db modification to manage the database. It allows forward and back rolling of the database for testing and deployment. It is working well for us. And it is all open-source.

Essentially, Jesse's solution allows a total backup of a database into script form, including data if desired, which we use as a baseline build; then I have a parallel development build that is tested against the baseline. Since we are using Subversion, any number of people can be working in parallel on the database evolution and all the changes are merged together for a final modification deployment. After a check is done against the latest backup of the live database, and any bugs handled, the dev scripts are executed without data restoration, and all the modifications morph the database into the shape we are aiming for, with the data still intact, and the database only out of use during the time it takes to execute all of the scripts, which in our case is under 5 min.

Once that is complete a new baseline is created from the current modified production database, and we go from there. Lather, rinse, repeat. Since all of the change scripts are stored in Subversion, we have baseline tags and modification tags for every step of the evolution of the database from the beginning of modification. So the database can be restored from any point in time, to any form desired since the inception of modification under this system.


So, I have to say, thanks again Jesse for the scripting tool, it was a great starting point to build the tools that we needed to modify the database the way we hoped, and has allowed me to transform a database that was an inherited, disorganized wreck into something that is really supporting the business needs at the organization.

Great work on that software Jesse, and the people on the Subversion team.

Subject: Re: Generating scripts
Posted by: Phil Factor (view profile)
Posted on: Friday, June 01, 2007 at 4:03 AM
Message: We have already, some months ago, published a TSQL script here on Simple Talk that will script out a database either as individual objects or as a build script. There are quite a large number of options which are built into DMO to do this. It is here at More Database Administration and Development Automation using DMO and the stored procedure is spScriptObjects. There is a text version here (spScriptObjects). I'm sure most DBAs have something similar but, combined with Subversion, it provides quite a neat system. I'm not sure what this has to do with the problem Andras is describing though; as Andras is, I'm sure, assuming that you already have such a tool.
The reason I'm so interested in Andras's line of thought is the idea that one can compare the structural differences between builds of the database rather than, with Subversion, merely the code itself. The idea of preserving comment blocks, especially in views and tables (which Andras is still working on), is very interesting.

Subject: Wow, what took you guys so long?
Posted by: Jim McCary (not signed in)
Posted on: Saturday, June 02, 2007 at 2:44 PM
Message: Yawn, we've been using DB Ghost (www.dbghost.com) for about four years now to manage all SQL changes via source control and it handles the job perfectly fine. I thought Microsoft were somewhat late with TFS4DBPro (whilst being fantastically expensive!) but Red Gate are clearly playing catch up. I'll have a look at SQL Compare 6 but I doubt it's gonna beat what I already have...

Subject: old rope, and a bit frayed at that
Posted by: FatherJack (not signed in)
Posted on: Monday, June 04, 2007 at 3:18 AM
Message: why have a comment in a create script that describes the use of a column?
...
EmployeeID int IDENTITY(1, 1) -- The ID that we use in HR
...

use the extended properties of the table and the description is then available to you via SSMS, just where you expect it when you need it. Far better than having to run off and get a script from your repository and seeing how the column was intended to be used.

Subject: not complete
Posted by: Fridthjof-G Eriksen (not signed in)
Posted on: Monday, June 04, 2007 at 4:18 AM
Message: I understand that any article discussing source control and databases will never be "complete", but in this instance I believe the article at hand barely touches the very beginning of the issues involved.

Use of a SCM (be it sourcesafe, cvs, subversion, mks, perforce, clearcase, etc, etc) is not simply a matter of storing a bunch of generated ddl's within the SCM Repository.

Consider the use of labeling, automatic builds, continuous integration, branching, etc., and the picture becomes a lot more complex.

In particular, most systems we design and build also contain reference data/configuration in the database. Is a database built using simple auto-generated DDLs to be considered complete? I say not. Second, what do we do with foreign key constraints, etc., when one table is created at a later time (or as part of a patch), and a table already existing from previous revisions needs to be altered to reflect the change?

One must also consider build order, upgrades, patches, bug fixes, etc, etc..

Proper use of an SCM tool and thorough procedures for handling database change mgmt are IMHO really important, and many of the sometimes tedious processes involved (generate base DDL, generate build script, generate delta update/patch/bugfix/rollback script, generate any required reference data, any metadata updates) can thankfully be at least partially automated. But it requires more than a single tool; it is also a question of process. That is something I keep seeing missing in these types of discussion.

No SCM is going to fix a faulty process for you.

Note that I do not discuss the merits of datadude, red-gate, db ghost (or for that matter, the abilities of most enterprise-class ER modelling tools, which could do this for quite some time already). I've used, bought, recommended or at least evaluated many a tool, but I keep seeing people thinking a single tool is all you need.

Ah well, I guess I could go on and on about this (admittedly) huge subject. It can quickly become a can of worms.

FGE



Subject: Reference Data is important
Posted by: David Atkinson (view profile)
Posted on: Tuesday, June 05, 2007 at 8:39 AM
Message: I would definitely agree that schema isn't everything. Ideally, reference data (lookup/static) should be treated in the same way, and deployed alongside the schema. How to achieve this is the big question. Does anyone have any ideas or suggestions? Red Gate has a tool called SQL Data Compare that will sync data, but this requires the source to be a live database, so it doesn't work in a source control model.

Subject: Using build scripts from SQL Compare
Posted by: Ken3 (view profile)
Posted on: Wednesday, June 06, 2007 at 10:23 AM
Message: A trick I use for getting build scripts for placing into Source control is to create it by comparing the database with a blank database using SQL Compare. The advantage is that the SQL Compare build script is atomic, in that it rolls back completely if an error occurs, and I get a progress report of the build. The objects are also in the correct order so I don't get the 'sysdepends' dependency warnings.

Subject: Reply to: Tom Powell
Posted by: GSquared (view profile)
Posted on: Wednesday, June 06, 2007 at 3:59 PM
Message: I don't have a commercial version-control package tracking the individual virtual server files. I suppose I could, but it hasn't been needful yet. I'll have to look into that.

Subject: Strong version control and reliable roll-out
Posted by: Shandy (view profile)
Posted on: Thursday, June 07, 2007 at 7:37 PM
Message: We store the database version as a stored procedure that never gets called, e.g.:

CREATE PROCEDURE [dbo].[p_VER_SchemaVersion_0_1_75_0] AS RETURN 0

We can query the version of the database using the following stored proc:

CREATE PROCEDURE [dbo].[p_VER_GetSchemaVersion]
    @Version varchar(50) = '' OUTPUT
AS
    SET @Version = 'Version Error'
    IF (SELECT COUNT(*) FROM sysobjects
        WHERE xtype = 'P' AND name LIKE 'p_VER_SchemaVersion_%') = 1
    BEGIN
        SELECT @Version = REPLACE(SUBSTRING(name, 21, 50), '_', '.')
        FROM sysobjects
        WHERE xtype = 'P' AND name LIKE 'p_VER_SchemaVersion_%'
    END
    PRINT @Version

We have a tool that can use this information, along with an SQL-based script file, to safely upgrade a production database, instantly rolling back should an error occur during the upgrade (rare). We can always test the upgrade on a recent copy of the production database beforehand. Total system downtime during upgrade: 30 seconds (40 seconds if the upgrade fails).

Subject: Apologies
Posted by: Shandy (view profile)
Posted on: Friday, June 08, 2007 at 12:58 AM
Message: I think I clicked Back then Forward in my browser and it reposted the form text.

Subject: Reply to: Shandy
Posted by: GSquared (view profile)
Posted on: Thursday, June 14, 2007 at 1:13 PM
Message: Why not just have:

CREATE PROC p_VER_SchemaVersion
    (@Version varchar(50) OUTPUT)
AS
    SELECT @Version = '0_1_75_0'

Then just update the version number in the body of the proc. That's going to be a lot easier to call than what you have, and doesn't involve anywhere near as many calculations, string manipulations, etc.

Or am I missing something in this?

Subject: Comments
Posted by: Anonymous (not signed in)
Posted on: Wednesday, July 23, 2008 at 7:55 AM
Message: useless

Subject: discussion
Posted by: robertg (view profile)
Posted on: Monday, June 29, 2009 at 4:18 PM
Message: Gentlemen,
Great discussion. It has helped me tremendously. Thanks.

 
