Ben Hall, one of the team who wrote SQL Data Generator, has gone on to develop ways of using it to provide test data for a wide variety of Unit Testing requirements. Whether you use NUnit, XUnit, MbUnit or MSTest, you can now lay your hands on as much data as you need.
SQL Data Generator
You can create large amounts of test data quickly and easily with Red Gate’s SQL Data Generator. You can now integrate this data generation facility into your unit and integration testing
I’d like to show you how I would go about using SQL Data Generator (SDG) within my unit tests, not just for testing the Data Access Layer (DAL), but also the Business Layer (BU). SQL Data Generator can be used for testing a wide variety of code in different languages.
With the code I provide in this article, along with SQL Data Generator, you can test the scalability and performance of any of your .NET code that processes data.
Automating Data Generation
After installing SQL Data Generator, you have the option to generate data via the UI or the command line. The UI can create, configure, and generate data with the ability to save all of the settings as a project file for later use. The project file is XML and contains a list of all the columns for each table within the database. For each column in the table, we store all of the related settings, such as which generator was assigned and the properties for the generator, for example the expression used on a RegEx generator.
Installed alongside the UI is a command line application. The application generates data based on a saved project. However, the application doesn’t include an API. This could cause problems when you attempt to include it as part of an automated process.
As a possible solution to this, I have created a C# wrapper around the console application which you can get by downloading it from Codeplex (including source code) (http://www.codeplex.com/SDGGenerators). This wrapper allows you to integrate SQL Data Generator into your unit tests, with the added advantage of the wrapper being framework agnostic, meaning it will work with all versions of NUnit, XUnit, MbUnit or even MSTest.
To start using this framework, you will need to download the SDG.UnitTesting.dll assembly from CodePlex. In my solution, I have two projects, SDG.UnitTesting.SimpleTalk and SDG.UnitTesting.SimpleTalk.Tests; the first project will be my ‘production’ code which I want to test, while the second project contains my unit tests. Within SDG.UnitTesting.SimpleTalk.Tests, I add a reference using Visual Studio to the SDG.UnitTesting.SimpleTalk project so that I can create the objects to be tested. Within SDG.UnitTesting.SimpleTalk.Tests I also add a reference to the downloaded SDG.UnitTesting.dll assembly.
At this point, I have two possible approaches for executing SQL Data Generator.
- The attribute based approach – Decorate your test methods with attributes to specify which SQL Data Generator project to execute before running the test.
- The POCO Wrapper approach – This allows you to create an object within your code and call the Execute() method to generate the data.
The Attribute Based approach
The first approach is to decorate the TestFixtures and Test methods within your system with our attributes, SDG and SDGExecute. The advantage of this approach is that the data generation is expressed as an attribute rather than being mixed into your test method. This makes it much easier to read and understand the test, and what it is actually testing.
To demonstrate this, I will show how to create and test a DataAccess class. The class uses ADO.net but the basic principal is the same for Linq to SQL, or any other access framework. We’re just issuing commands to the database in order to return data.
The first task, following the Test Driven Development approach, is to create a DataAccessTests class within our test solution. The first test within this class simply ensures that if we delete all the data from the database, then a DataTable is returned with 0 rows.
[ExecuteSql(ExecuteWhen = TestSequence.Before, //Taken from Testing Times Ahead: Extending NUnit
Script = "DELETE FROM [Order Details]; DELETE FROM [Orders]; DELETE FROM CustomerCustomerDemo; DELETE FROM Customers",
Connection = "Server=(local);Database=Northwind;Integrated Security=True")]
public void GetAllCustomers_EmptyDatabase_ZeroRows()
DataTable dt = DataAccess.GetAllCustomers();
The test uses the ExecuteSql attribute, which will execute SQL statements against the server
I showed how to create the ExecuteSql attribute in my previous Extending NUnit article (Testing Times Ahead: Extending NUnit) so you can get the code from there.
This is a great example of how using attributes can improve the readability of code: the logic to setup the test has been moved into an attribute, leaving the actual test code just as test code.
To pass the test, we need to implement our DataAccess class. The DataAccess.GetAllCustomers() method is shown below and it simply executes a SELECT * to return all the customers in the table.
public static DataTable GetAllCustomers()
SqlConnection sqlConn = new SqlConnection(ConfigurationManager.ConnectionStrings["NorthwindConnection"].ConnectionString);
SqlCommand sqlCmd = new SqlCommand("SELECT * FROM Customers", sqlConn);
SqlDataAdapter sqlDA = new SqlDataAdapter(sqlCmd);
DataTable dt = new DataTable();
The test passes successfully. Next we want to test that data is being returned correctly, this is where SQL Data Generator can be beneficial.
In this next test, we want to ensure that the DataAccess object does actually return a populated DataTable with the contents from the database. First, we need to setup our TestFixture class in order to be able to use the new attributes. All we need to do is add a SDGAttribute to the TestFixture and inherit from SDGContext.
These attributes are located in the SDG.UnitTesting assembly; the SDGAttribute tells our extension that the class contains a method which we should be interested in as it should contain test methods that requires SDG to execute. The SDGContext class allows the extension to hook into the class and intercept the method calls, if the method has the attribute SDGExecute, see below, then we know to execute SQL Data Generator.
//[SDG("D:\Red Gate\SQL Data Generator\SQLDataGenerator.exe")] //Overload to set console application location.
On SDGAttribute, we have an overload where you can specify a different installed location for the command line application.
With this in place, we can execute SDG as part of our tests. The following test ensures that the DataTable is populated with 1000 rows and we can be confident that this will always pass because we are executing SDG before the test. Before the test is executed, the SDGContext intercepts the call to GetAllCustomers_SDGProject_1000Rows(), as the method has a SDGExecute attribute, the framework knows to execute SDG. The attribute has an argument for a filename, this is the project file created using the UI which we want to use. Under the covers, the command line application is executed using the project name as a parameter.
public void GetAllCustomers_SDGProject_1000Rows()
DataTable dt = DataAccess.GetAllCustomers();
I find this approach very clean and very effective. We no longer have to worry about what or how the data is generated.
Another advantage of SDG is that it uses seedable data. The means that the same data will be produced each time, making it good to test against. The next test below ensures that the GetCustomerByID() method works correctly; we pass in an ID for a customer generated by SDG and verify that the data is being returned as we expect.
public void GetCustomerByID_SDGProject_1000Rows()
DataTable dt = DataAccess.GetCustomerByID("00040");
Assert.AreEqual("Suppebentor Holdings ", dt.Rows["CompanyName"]);
Assert.AreEqual("Bobbi Yates", dt.Rows["ContactName"]);
Assert.AreEqual("564 White Old Freeway", dt.Rows["Address"]);
Again, the readability of the test is not harmed by the fact we are generating data, unlike other techniques such as inserting data manually using ADO.net
By using attributes, we have an easy and flexible way to generate different data for each test, simply by supplying SDG with a different project file to execute.
However, sometimes we might want to generate data once for the entire fixture.
The POCO Wrapper approach
This wrapper can be called in a similar fashion to the attributes. However it is just a Plain Old CLR Object (POCO) which you create. You can then call the method Execute() anywhere in your code.
At the moment, my DataAccess tests execute the same project three times. However for one test (GetAllCustomers_EmptyDatabase_ZeroRows()) we don’t execute it at all. Therefore, it would make sense to separate these tests into two fixtures, DataAccessTestsV2 and DataAccessV2_WithoutSDG, and executing the project once at the beginning would make the most sense – this is what I have done for my second set of Data Access tests.
In the test fixture DataAccessV2_WithoutSDG, I have all the code which doesn’t require data (GetAllEmployees_EmptyDatabase_ZeroRows) while in my second fixture, DataAccessTestsV2, I have all the code which does require data. Without using any attributes, in my TestFixtureSetup I have this code:
public void TestFixtureSetup()
SDGConsoleWrapper sdgConsoleWrapper = new SDGConsoleWrapper(@"Database\NorthwindEmployees_Orders.sqlgen");
This runs the data generation before any tests are executed within DataAccessTestsV2. The result is that all of my tests have the data required, and I only execute the generation once, speeding up the tests.
During the trail period for SDG you will have a trail screen displayed for each test, as each test executes the console application. You will need to purchase a license to be able to execute without this screen.
Maintainability and coping with schema changes
We now have our data successfully being generated successfully. However database schemas change over time; columns are added and removed, as are tables. Ideally, we don’t want to have to go and modify all of our data generation scripts in order to be able to cope with minor changes. Luckily, SQL Data Generator is able to cope with changing schemas and doesn’t require any modifications to the project file!
When a project is opened in the UI or the console application, we check to see if any of the schema have changed for the tables. If we check for schema changes, such as new or removed columns, data type changes or changes to the relationships between tables, SDG then updates the project as required automatically matching generators to columns when possible.
The result of this is that you can make changes to your database without having to worry about keeping the projects up to date – they will just do it for themselves. Of course, if you want to make changes, then that is also possible by loading the project file in the UI and saving it back to disk.
Testing your Business Layer
While SQL Data Generator can make a great difference when testing against databases, it also has a number of usages while testing your business layer.
Testing the business layer has a number of different aspects, sometimes we just have a simple data model with a well defined set of test cases, such as the method GetTodaysDate() should return today’s date and nothing else. Other times, we have complex business rules and data models which have a large number of different possible combinations and ranges for the data. In order to test this effectively, we need a very good set of test data to work against. SQL Data Generator can help generate the data to meet these combinations and rules quickly and effectively. We can then export this data to use as part of our unit and integration tests.
For the next test, we need a method that will verify that an order is valid. A valid order must have an OrderID, CustomerID and an OrderDate. In normal test cases, we would have to create a series of valid and invalid Order objects and give them to a ValidateOrder method to see if it worked correctly for different combinations. For example, we would have a series of tests like this:
public void ValidateOrder_ValidOrder_ReturnsTrue()
Order o = new Order();
o.CustomerID = 1;
o.OrderID = 1;
o.OrderDate = DateTime.Now.Add(new TimeSpan(0, 1, 0));
However, after a number of different scenarios this would become difficult to manage and difficult to understand what is being tested, especially if we have a number of different tests with only slightly different requirements. Making a change to the code being tested would then break all of our tests, resulting in increased maintenance of the test suite.
One alternative is to use SQL Data Generator and combine it with some of the advanced testing frameworks already available. Both MbUnit (http://blog.benhall.me.uk/2007/04/mbunit-datafixture-data-driven-unit.html) and XUnit (http://blog.benhall.me.uk/2008/01/introduction-to-xunitnet-extensions.html) can use SQL Tables, CSV and XML files as the source of test data. We can then use the data SDG produces as the source of the test data for the unit tests allowing for large amounts of real data to be used in a maintainable fashion.
However, there is still a problem. How do we actually get data into the format we need in order to use the data for testing. Currently, SDG can only insert data into SQL Server 2000 and 2005 databases. However once the data is in there it doesn’t mean it has to stay there. SQL Server includes a tool called Bulk Copy Program (BCP) which can export data to a flat file, for example CSV or XML.
Executing the following command will export data from the table ValidateOrder in the database, TestData and write the data to the CSV file TestData_ValidateOrder.csv, using a comma as the delimiter. We can then include this file as an item within our solution and have it copied to the output directory when we build.
Exporting as XML is a little bit more difficult. Instead of giving a database and table to use as the source, we provide a query. This query selects all the data, uses the built in FOR XML AUTO clause to convert the data into XML elements and then wrap the results in a XML root element.
After executing this, all of our data (1000 rows) is created as XML elements within the XML file. There is no limit to the amount of data we can use. The data produced will look like this:
We can then use this data together with the MbUnit XmlDataProvider (http://blog.benhall.me.uk/2007/04/mbunit-datafixture-data-driven-unit.html) feature to introduce a great deal of flexibility into our unit tests. We can create a series of different xml files with different data scenarios for tests. We can then store them within a source control to share with the data. As they are not accessing the database, they will run faster and we don’t have to worry about having access to the database with the data inserted.
In the following test, we use the DataFixture Attribute with the filename for the xml and an XPath expression defining how to find the data items (we only have one root so we just use that). We stipulate that the test should be executed for each Xml Element of type TestData.dbo.ValidateOrder, passing in the current XML Node for the test.
We can then use the XMLNode to access the data we want to test against.
public class BusinessLayerTests_MbUnit_Valid
public void x(XmlNode node)
Order o = new Order();
//Not nice, but need to get around limitations of Xml and the Xml Serialiser
o.CustomerID = Convert.ToInt32(node.Attributes["CustomerID"] != null ? node.Attributes["CustomerID"].InnerText : "-1");
o.OrderID = Convert.ToInt32(node.Attributes["OrderID"] != null ? node.Attributes["OrderID"].InnerText : "-1");
o.OrderDate = Convert.ToDateTime(node.Attributes["OrderDate"] != null ? node.Attributes["OrderDate"].InnerText : "01/01/1970");
As a result, loading this test will produce 1000 different tests, each using data produced using SDG. This is excellent for integration and boundary tests as it is so quick and easy to produce huge amounts of data to test against.
Hopefully this article has demonstrated some of the possibilities when using SQL Data Generator with your unit testing. Download the wrapper from Codeplex (http://www.codeplex.com/SDGGenerators and experiment to see what kind of difference it makes to your unit tests. Also, keep in mind that the data produced doesn’t have to be used just with databases, it can be used as any kind of test data. When you combine this with the more powerful features of the unit testing frameworks, you can test code faster and more effectively.
Finally, we would love to hear your feedback about the approach taken in this article and SQL Data Generator in general.
The code for this article can be downladed from here, or from the 'speech-bubble' at the top of the article