Click here to monitor SSC
  • Av rating:
  • Total votes: 7
  • Total comments: 0
Grant Fritchey

SQL Virtual Restore and the Sandbox Development Environment

15 July 2011

It is a problem to  provide enough data for database development work, especially when it has to be managed, and has to comply with the conditions for using live data. Added to that are the constraints of disk space and the problem of providing risk-free deployment. Grant explains how three Red Gate tools and work together to provide a neat solution to providing the ideal sandbox development environment.

It isn't easy to get enough meaningful data to work with, for developing databases. As a DBA, I would prefer developers to use very small databases consisting of their own data or a well-defined set of test data. By having  smaller data sets, it is easier for the DBA  to manage the myriad development and QA environments. However, the testers and developers would rather not spend time creating dummy data when perfectly good production data is available. They have a very good point, because the best data to develop against and test with is the data that the users of the application will see. How do you resolve this disparity in approach?

Many, perhaps most, organizations, just go with using production data in their development environments. It speeds up the development process and makes testing more accurate. But there are snags. First, you can’t expose all the company data to everyone in the company, or risk exposing it to anyone outside the company: This means that you must put mechanisms in place to remove or modify the sensitive data. Second, frankly, it takes up a lot of space. Sometimes, in my experience, more (much more) space was dedicated to development and QA than was dedicated to production. This was because they had many copies of the production database for development, multiplying the storage requirements.

There can be another problem: You can hit trouble if you don’t have an established and thoroughly documented methodology for deploying your database. It’s very easy to set up a deployment for a development sandbox machine when you’re building it from scratch and supplying your own data. However, when you introduce the complication of bringing the production database and its data into your development sandbox, you must come up with a much more sophisticated approach.

Given these problems, what can you do to make a difference?

SQL Virtual Restore and SQL Source Control

There are two core problems: you need to have a mechanism to get at production data without having to have terabytes of storage and you need a mechanism for deployment that updates production database schema from source control. Red Gate’s SQL Virtual Restore and SQL Source Control is a match made in heaven for the development sandbox, answering both questions directly. These tools will also help to solve the more detailed problems such as cleaning up sensitive data.

In this article, I’ll show several ways that these tools can be used together to define a sandbox server and deploy to it as part of a development process. For simplicity, we’ll imagine a database that has been under development for some time. It’s checked into source control through SQL Source Control and has been deployed to production. Now we want to move the data and structure from production back down to our sandbox environment, in order to begin a new phase of development against the database. The first step is to set up a database in SQL Source Control.

I’m starting with a blank database, BookInventory. Since I like to work directly out of source control at all times, I’m going to link to it right away. This is the initial SQL Source Control screen, prior to linking the database:

Initial SQl Source Control Screen

If I click on the “Link database to source control…” link, highlighted in blue on the page above, it will open the “Link to Source Control” window. In this window, I can select different source control systems to hook up to. Red Gate even provides a version of SVN that can run for you during evaluation, so you can try the system out. The built-in systems are SVN, TFS and Vault. If you have a source control system other than these, and it has a command line interface, you can configure it to work with SQL Source Control.

On the “Link to Source Control” window we’re also going to pick whether or not this database is a shared resource, or a dedicated database. Since we’re looking at setting up a sandbox machine, it’s a dedicated database. Clicking on the OK button will create the link. The SQL Source Control screen should now look like this:

Setting up the dedicated database

Although this database is linked, there is nothing to pull out of source control in this instance, because I’ve just added a blank database. So I’ll add a couple of tables to my database like this:

CREATE TABLE Book

    (

     BookId INT NOT NULL,

     BookName VARCHAR(150) NOT NULL,

     CONSTRAINT BookPk PRIMARY KEY CLUSTERED (BookId)

    ) ;

 

CREATE TABLE Author

    (

     AuthorId INT NOT NULL,

     AuthorName VARCHAR(150) NOT NULL,

     CONSTRAINT AuthorPk PRIMARY KEY CLUSTERED (AuthorId)

    ) ;

 

CREATE TABLE AuthorBook

    (

     BookId INT NOT NULL,

     AuthorId INT NOT NULL,

     CONSTRAINT AuthorBookPk PRIMARY KEY CLUSTERED (BookId, AuthorId)

    ) ;

 

ALTER TABLE AuthorBook  WITH CHECK ADD CONSTRAINT AuthorBookFKAuthor FOREIGN KEY(AuthorId)

REFERENCES Author (AuthorId) ;

 

ALTER TABLE AuthorBook  WITH CHECK ADD CONSTRAINT AuthorBookFKBook FOREIGN KEY(BookId)

REFERENCES Book (BookId) ;

Once the objects needed for production are complete, I’ll check them into source control:

Database with tables

I’m going to go ahead and load some data into the database using SQL Data Generator, just to provide something to query against, and so that the database is larger than the default. I inserted just 15,000 rows into the tables and grew the full size of the database out to 15mb.

This database now gets deployed to production and, as the application it supports gets used, it becomes quite large, so that copying it into our development environment will be far too expensive. Instead, I’m going to apply SQL Virtual Restore to a recent backup of the production database to create a virtual database that will become the new development database. The restore operation is very simple to show in T-SQL:

RESTORE DATABASE [BookInventory_Virtual]
FROM DISK = ‘d:\bu\BookInventory.bak’
WITH MOVE N’BookInventory’ TO N’C:\Data\BookInventory_Virtual.vmdf’,
MOVE N’BookInventory_log’ TO N’C:\Data\BookInventory_Log_Virtual.vldf’

It’s basically a normal restore statement, but the addition of the letter "v" to the file names acts as a trigger for the HyperBac Service - the Windows service through which SQL Virtual Restore runs - to create this as a virtual database. We’ll call it BookInventory_Virtual, SQL Virtual Restore’s default suggestion. When the restore is completed, the amount of space used by the database itself is only 1mb, 14mb less than the original. This saving becomes even more dramatic as the size of the original database goes up. In my own testing I’ve seen a database of over 100gb get created as a 3mb virtual database. The space saving comes about because we’re not actually creating a new database. Instead, we’re creating a virtual database which just keeps small file stubs for the virtual database. All the data is still in the backup file. As you make changes to data, the changes are stored in the file stubs, so that over time these may grow beyond their initial small size.

Once the virtual database is created in our sandbox environment, we will need to attach it to the existing database in our source control system through SQL Source Control. The linking process is exactly the same as in the initial example. At this point in our tests, there are no differences between what we’re referring to as production, our backup of the original database, and the new virtual database just created. This means that after linking our database to source control we won’t need to move anything into or out of source control.

Now then, let’s assume that we need to add a stored procedure to the database. Here’s a good one:

CREATE PROC ListBookAndAuthor

AS

    SELECT  a.AuthorName,

            b.BookName

    FROM    dbo.Author AS a

            JOIN dbo.AuthorBook AS ab ON a.AuthorId = ab.AuthorId

            JOIN dbo.Book AS b ON ab.BookId = b.BookId;

And of course this is created in our sandbox environment, BookInventory_Virtual. Once it’s created, we’ll need to check it in through SQL Source Control. Let’s assume that another developer is working on the same system. He’s going to go through the same process, creating a virtual database and then hooking his database up to source control through SQL Source Control. Once he’s configured, he’ll get an alert that there are things available in the Get Latest window:

The get Latest Window

This other developer can click on the 'Get Latest' button and just move forward from there. Clearly, in a normal development environment, this sort of thing can go round and round for a bit. But what’s going on with the backup file in the meantime?

The backup file itself is not changing. However, the very small placeholders created by the HyperBac Service, which define your virtual database, will be growing as you add objects or modify data. These will store the information for all system modifications over time, and that’s something to keep an eye on. You wouldn’t want to keep a virtual in place too long on your system, because these files will grow to the size of the original, and more, depending on what you are doing with your database.

At some point, we’ll want to get a new copy of the database from production. This could be because of changes to the production data that you want to have available for testing, or because your virtual files have started to grow. If we simply rerun the restore operation that created our virtual in the first place, and this time, replace the existing database, you’ll get a new virtual database in its place. But what happens to SQL Source Control when we do this? Nothing at all. It recognizes that it’s still the same database that was attached previously, but, the next time we go to that database with the SQL Source Control window open, we will see all outstanding changes waiting for us in the Get Latest window.

Conclusion

Using SQL Virtual Restore and SQL Source Control together to create and maintain a development environment based on your production database is a match made in heaven. You will have the ability to quickly and easily get a copy of production on your development machine, using very little disk space. Then, you can develop and deploy your database directly out of source control, so that you achieve tight integration with your development teams.

To download the recording of Grant's webinar 'A Sandbox Development Process', click here.

Grant Fritchey

Author profile:

Grant Fritchey, SQL Server MVP, works for Red Gate Software as Product Evangelist. In his time as a DBA and developer, he has worked at three failed dot–coms, a major consulting company, a global bank and an international insurance & engineering company. Grant volunteers for the Professional Association of SQL Server Users (PASS). He is the author of the books SQL Server Execution Plans (Simple-Talk) and SQL Server 2008 Query Performance Tuning Distilled (Apress). He is one of the founding officers of the Southern New England SQL Server Users Group (SNESSUG) and it’s current president. He earned the nickname “The Scary DBA.” He even has an official name plate, and displays it proudly.

Search for other articles by Grant Fritchey

Rate this article:   Avg rating: from a total of 7 votes.


Poor

OK

Good

Great

Must read
Have Your Say
Do you have an opinion on this article? Then add your comment below:
You must be logged in to post to this forum

Click here to log in.
 

Phil Factor
Searching for Strings in SQL Server Databases

Sometimes, you just want to do a search in a SQL Server database as if you were using a search engine like Google.... Read more...

 View the blog

Top Rated

SQL Server XML Questions You Were Too Shy To Ask
 Sometimes, XML seems a bewildering convention that offers solutions to problems that the average... Read more...

Continuous Delivery and the Database
 Continuous Delivery is fairly generally understood to be an effective way of tackling the problems of... Read more...

The SQL Server Sqlio Utility
 If, before deployment, you need to push the limits of your disk subsystem in order to determine whether... Read more...

The PoSh DBA - Reading and Filtering Errors
 DBAs regularly need to keep an eye on the error logs of all their SQL Servers, and the event logs of... Read more...

MySQL Compare: The Manual That Time Forgot, Part 1
 Although SQL Compare, for SQL Server, is one of Red Gate's best-known products, there are also 'sister'... Read more...

Most Viewed

Beginning SQL Server 2005 Reporting Services Part 1
 Steve Joubert begins an in-depth tour of SQL Server 2005 Reporting Services with a step-by-step guide... Read more...

Ten Common Database Design Mistakes
 If database design is done right, then the development, deployment and subsequent performance in... Read more...

SQL Server Index Basics
 Given the fundamental importance of indexes in databases, it always comes as a surprise how often the... Read more...

Reading and Writing Files in SQL Server using T-SQL
 SQL Server provides several "standard" techniques by which to read and write to files but, just... Read more...

Concatenating Row Values in Transact-SQL
 It is an interesting problem in Transact SQL, for which there are a number of solutions and... Read more...

Why Join

Over 400,000 Microsoft professionals subscribe to the Simple-Talk technical journal. Join today, it's fast, simple, free and secure.