Click here to monitor SSC
Ben Hall

Embedding IronRuby and the DLR into SQL Data Generator

23 April 2009

You can embed DLR-based languages such as IronPython and IronRuby into applications. This gives you instant scripting, and with it,  the potential for greatly simplifying the development process. You can even take advantage of existing applications that have a plug-in architecture, such as SQL Data Generator and NET Reflector. Be Hall takes SQL Data Generator and shows how this technique allows you to create simple generators specifically for your needs.

Embedding IronRuby and the DLR into SQL Data Generator

SQL Data Generator is already a very versatile application for developing databases. With a few clicks you can populate your database with realistic data. However, there is one problem – it is static.  The built-in generators do an amazing job, the SQL Statement generator, for example, will execute a SQL statement and use the results in the generation process. Yet, if you want to build a more complex structure and have more control over the data that you are generating, then you will have to write our own generator. Up to now, writing a plug-in generator meant doing a separate development that used a compiled .NET language in Visual Studio. Now, there is a neater way of doing this. Microsoft have released the Dynamic Language Runtime (DLR) which allows you to use dynamic languages together with the .Net platform. Part of this work includes Microsoft building implementations of dynamic languages including Python, known as IronPython, and Ruby, known as IronRuby. The result is that you can write Python or Ruby code while taking advantage of existing C# applications and the entire .Net framework.

This opens up an interesting opportunity for SQL Data Generator. By embedding a dynamic language into the application, we can give the users more control over the type of data that they want to generate. Users can now dynamically generate the type of data they wish based on their own requirements. They can still take advantage of the existing generators, but when required, they can write Ruby code in order to produce exactly the data they require.

In this article, I will discuss how to create a custom generator and how embedding the DLR and IronRuby language can open up new functionality. The compiled assembly and source code are attached to this article. Feel free to use the generator within your own projects and let us know what you think.

Creating the plug-in  Generator

The first task is to create the basic plug-in generator. These generators are simple to create and there are several examples that are already available on http://www.codeplex.com/SDGGenerators . First, we need a standard class library – in this case C#.  Creating a  generator in C#  is described in an article on our support pages.

Embedding the DLR Engine

With the basic plug-in generator in place, we can then start the fun task of embedding the DLR into the plug-in generator, and thereby into the SQL Data Generator application itself.  The DLR has a hosting API that is designed to allow languages to be embedded into C# applications. By making use of this, it is extremely simple to embed DLR-based languages into plug-in generators. The first task is to reference the DLR and IronRuby. The assemblies can be downloaded from www.codeplex.com/IronRuby. The first two assemblies are the IronRuby language implementation, while the second two is the DLR itself.

  • IronRuby.dll
    IronRuby.Libraries.dll
    Microsoft.Scripting.dll
    Microsoft.Scripting.Core.dll

In our constructor for the generator, we can now ask IronRuby to create us our DLR ScriptEngine.

private ScriptEngine Engine;

public DLRGenerator()

{

     Engine = IronRuby.Ruby.CreateEngine();
}

The script engine has a number of responsibilities, one of which is to maintain different levels of scope which hold variables and method declarations. The other is to execute the actual code.

Defining Scope

Once we have access to the language engine, we can create our scope. This allows us to create variables to store values or create new methods to use, so as to start writing code as you would in a normal application. Within the application, we could have different scopes based on different parts of the application. In the case of SQL Data Generator, we will just have a single scope.  As with the engine, creating the scope is simple – we call the method CreateScope on the engine.

ScriptScope Scope;
Scope = Engine.CreateScope();

Executing IronRuby code

Once we have created our scope, we can execute our code. Before we can execute code using the DLR, we need to create a ScriptSource object. This is responsible for the actual execution. In a similar fashion to our Scope object, the engine is responsible for the creation of the ScriptSource based on a string containing the code we wish to execute. Along with the code, we need to tell the engine what type of code we want to execute.

 

The DLR handles a number of dynamic languages, each with different ways to execute code. It can, for example, execute code as part of an interactive session, from a file or a standalone statement. From my experience, the most common way is by the Statements option.  If you’re not sure how the dynamic language is executing code, then there is an AutoDetect property to allow the Engine to determine the type. In this example, we will be executing statements.

ScriptSource source = Engine.CreateScriptSourceFromString(code, SourceCodeKind.Statements);

Once we have created our ScriptSource object we can call our Execute method. As an argument, we provide the appropriate Scope object to use during execution.

   object result = source.Execute(Scope);

Because we are using the same scope, we could execute different blocks of code at different points in time, each with knowledge of the previous execution. This allows us to define variables in one context while use them in a different setting.

 Our execution method is completed as below. 

   public string Execute(string code)

{

   ScriptSource source = Engine.CreateScriptSourceFromString(code, SourceCodeKind.Statements);

   object result = source.Execute(Scope);

 

   if (result != null)

      return result.ToString();

   else

      return string.Empty;

}

Within our application, we can now execute the IronRuby code and use the result. For example, given the execution below the result in the variable would be 2.

string execute = Execute("1+1");

However,  we are quite likely to mistype the code we want to execute. If we execute invalid IronRuby code, an exception will be thrown from our ScriptSource object. There are a number of different methods which could be thrown; MethodMissingException or SyntaxErrorException, for example. Because we want to return all the errors back to the user, I will simply catch all the exceptions. I know this is wrong and we should use an application like Exception Hunter to identify and catch the correct exceptions: However, for the project in this article we can catch and report everything. The exception message property contains the reason why the code failed.  Within our generator, we need to add them to a collection so as to report errors back to the user

catch (Exception ex)

         {

            errors.Add(new Warning("Code Error", ex.Message));

            return string.Empty;

         }

To be able to display the error on the UI we need to inherit from the IErrorsAndWarnings.

List<GeneratorMessage> errors = new List<GeneratorMessage>();

      public IEnumerable<GeneratorMessage> GetMessages()

      {

         return errors;

      }

Within this in place, if our code errors during execution then we receive a nice notification.

 We now have everything in place to allow us to execute IronRuby code from our C# application. This means we can use the built-in ruby functions, for example the Random number generator to obtain a value.

string execute = Execute("rand(1000)");

Within our GetEnumerator method we can use the response as the result to use during our data generation. But I will now show that we can be cleverer than this. The first part is handing control to the user.

Giving the users access

While it is fine to be able to execute hardcoded strings, the power comes by offering this functionality to the user. To add a custom UI onto our generator, we need to implement IGeneratorUIStyle and IGeneratorUI. In terms of implementation, I simply provide a textbox for the user to enter the code into. In order to know when to execute the code I check to see if a new line has been inserted, if it has, and the cursor location is at the end of the text box, then I execute my code.

Creating Ruby helper methods

We have now allowed our users to write and execute IronRuby code.  They  have the means to generate a value which we will use to populate the table. However, this isn’t very compelling. We have created an interesting concept but we have yet to take full advantage of it.

One of the ways to take advantage is to execute additional code on behalf of the user. For example, we could automatically assign a variable value for them via executing the Ruby code.

Execute("x = 1");

Another approach is to take advantage of our Scope to create and pre-define a variable available during execution.

Scope.SetVariable("x", 1);

We could even define Ruby methods which people can use when they are writing their own code.

Execute(@"def CallSecretRubyMethod

                        Return 'This is a secret ruby method to help users'

                  end");

While adding additional Ruby methods is great, it still leads to a disconnection between the C# world and the Ruby world within the application. We want the languages to be at one with each other.

Calling C# methods from Ruby

I want to take the approach of having additional methods available to my Ruby code but I want the methods to be written in C#.  The result is that users will be able to write more advanced code, while our methods will have more influence on how the data is generated.

To work the magic I use the Scope object to define a new variable. The value of the variable is the instance of generator itself.

         Scope.SetVariable("column", this);

In order to control the data being executed, I create a new method on my generator called set. This will set the data to use during the generation process. The method is shown below.

   public void set(IEnumerable data)

      {

         Data = data;

      }

This allows us to write code such as Execute(“column.set [1,2,3,4,5,6,7,8,9,0]”); When the IronRuby code is executed, it will call the set method on our generator object passing in the array of ints as an IEnumberable object.  

This means that in our GetEnumerate method we can use this data as the basis for the data being generated.

                Execute(Code);

               if (Data != null)

               {

                  foreach (var d in Data)

                  {

                     yield return d;

                  }

               } 

This is only the start of the power we can now provide. When GetEnumerator is called, a GenerationSession object is provided that has knowledge about the data within the other columns being generated for that particular row. Being able to access this information from a different column is extremely useful. In a similar approach to our column variable I created a second variable called table.  The reason I assigned the same object to two different variables is readability. We want the code being written in the generator UI to be as natural as possible – having appropriately named variables supports this. 

         Scope.SetVariable("table", this);

I then define a new method called row which takes in the column number. I also created an overload which takes in the column name and determines the index before calling the method.  This method accesses our Session object and returns the value in the row for that column.

    public object row(int i)

      {

         if (Session != null)

         {

            IDataRecord row = Session.OutputRow;

            return row.GetValue(i);

         }

         return DBNull.Value;

      } 

However, this leads to a problem. Instead of writing IronRuby code to generate data for our entire column we are now just generating data for that particular column in that particular row. As we are now using single objects I created an overload method to convert our object into an IEnumerable to use during generation.

    public void set(object data)

      {

         Data = new ArrayList { data };        

      }

Within our GetEnumerator we loop forever: You don’t need to worry about this because the SQL Data Generator Engine knows how and when to stop us. We execute the IronRuby code, provided by the user, multiple times to simulate the effect of generating it once per row. I’ve kept the loop around the Data in as this allows the ‘generate’ to return multiple values per execution. 

           while (true)

            {

               Execute(Code);

 

               if (Data != null)

               {

                  foreach (var d in Data)

                  {

                     if (d.Equals(DBNull.Value))

                        continue;

 

                     yield return d;

                  }

               }

               else

               {

                  yield return DBNull.Value;

               }

            }

If the code fails to execute, then the Data variable will be null; as such we only want DBNull to be used in order to alert the user to a problem.

The opportunities this opens

Once this generator is in place, it opens up some extremely exciting opportunities. It is simple to use Ruby to generate data. The real fun part comes when you start combining different sources and creating more interesting data from them.

Our RegEx generator allows you to combine the results of two columns. However, we can now combine our existing columns together, performing string manipulation in the process....

name = table.row("Title") << " " << table.row("FirstName").to_s[0,1] << " " << table.row("LastName")
column.set name

We can use the builtin Ruby options to generate data...

column.set "stressed".reverse #=> "desserts”
column.set "Ho! " * 3 #=> "Ho! Ho! Ho! "

We can also start to perform more complex logic on the data we are generating. Unfortunately, we need to insert the value into a string before converting it into a float because row will return a SqlMoney object in this case...

column.set(table.row("UnitPrice").to_s.to_f * 2.50)

We can also use conditional logic in order to generate our results. 

title = table.row("Title")
if(title == "Mr")
   column.set "M"
elsif(title == "Miss" or title == “Ms”)
  column.set "F"
else
  column.set "?"
end

Real World Examples

After completing the generator I decided to take a look on our support calls to find real problems problems the generator would solve. The first one I spotted was a standard use case.  The user had a table with a number of columns. The data within FileAs column was dependent on the IsCompany column.

CREATE TABLE Contact(id INT IDENTITY(1,1), FirstName NVARCHAR(50), LastName NVARCHAR(50), IsCompany BIT, CompanyName NVARCHAR(50), FileAs NVARCHAR(100))

If IsCompany is true, then it should populate the column with the value of the CompanyName column. Otherwise, it should be a combination of ‘LastName, FirstName’.  With our new generator this is a simple block of code.

if(table.row("IsCompany"))
   column.set table.row("CompanyName")
else
   column.set table.row("LastName") << ", " << table.row("FirstName")
end

The data required based on their own business requirements can now be generated.

A second example is a user who wants to populate each column within the table with slightly different combinations of data.  Each column within the table required a different set of data. Our built-in generators do not support being unable to combine different data sources. An example of the data they wanted to produce was this:

  • “First, add 1000 addresses of type "company address" and fill the columns A,B and C”
  • “Second, add 1000 addresses of type "private address" and fill the columns A,D and E”
  • “Third, add 1000 addresses of type "additional address" and fill the columns B,C and F”

By using our new generator, our user could easily produce data to match their requirements. By assigning each column with the IronRuby Generator we can control the data being produced. We could have our 1000 addresses as external text files and based on our requirements we could simply read the lines in using the standard ruby library. For example, this could be the code for column A.

f = File.new("C:\\company address.txt").readlines
f2 = File.new("C:\\private address.txt").readlines

While column B could be:

f = File.new("C:\\company address.txt").readlines
f2 = File.new("C:\\additional address.txt").readlines

Once we have read the lines in, we combine the two arrays as the data to use during population.

column.set f + f2

While some of this functionality is provided out the box, the ability to use IronRuby as part of the data generation process definitely opens up new opportunities for the application and our users.

NOTE: The DLR requires .Net 2.0 SP1. As such, you will need SP1 installed to use this generator.

NOTE: This Plug-in generator is provided ‘as-is’ and is completely unsupported

Ben Hall

Author profile:

Ben Hall MVP is a UK C# developer/tester who enjoys all aspects of the development lifecycle and technology in general. During the day Ben works for Red Gate Software as a Test Engineer. At night, Ben is a MbUnit Core Commit Member, helping out on the project whenever possible. Ben has also gained his Microsoft Certified Technology Specialist (MCTS) in Web Applications and is a Microsoft Certified Professional (MCP) in Administrating Windows XP. He blogs on http://blog.benhall.me.uk/index.html He was awarded an MVP on July 2nd 2008 for "extraordinary efforts in Visual C# technical communities during the past year."

Search for other articles by Ben Hall

Rate this article:   Avg rating: from a total of 13 votes.


Poor

OK

Good

Great

Must read
Have Your Say
Do you have an opinion on this article? Then add your comment below:
You must be logged in to post to this forum

Click here to log in.


Subject: Can you call other generators?
Posted by: almasmith (view profile)
Posted on: Tuesday, January 26, 2010 at 8:50 PM
Message: Ben, not sure if you are still monitoring this post, but I had a question for you. Can you call the native SDG generators from the custom generator?

In my current situation, I have a bit field and a description (IsOnHold and HoldReason). If IsOnHold is False I need to leave HoldReason NULL. However, if it is True, I would like to call the built in Description generator to fill in the HoldReason.

I'd like to do the same thing with other fields/generators as well. Is this possible? Thanks.

Subject: Calling other generators.
Posted by: almasmith (view profile)
Posted on: Thursday, January 28, 2010 at 11:38 AM
Message: Ok. I figured it out. For anyone who is interested, here's what I did (in my C# code for the generator):

1. Added reference to Redgate.SQLDataGenerator.DescriptionGenerator and using RedGate.SQLDataGenerator.DescriptionGenerators;

2. Created an enum named Generators to track initialized generators

3. Added class members:
GeneratorParameters Parameters
IEnumerator GeneratedData
List<Generators> InitializedGenerators

4. Set Parameters to the paramaters passed into the constructor.

5. Added the following method which can then be called from SQL Data Generator:

public void generatetext()
{
if (!InitializedGenerators.Contains(Generators.Text))
{
// Initialize the data
TextGenerator generator = new TextGenerator(Parameters);
generator.InputText = @"Id eudis quo linguens imaginator pars fecit. Et quad estis vobis homo, si quad fecit, non apparens vantis. Sed quad ut novum vobis regit, et nomen novum eggredior. Longam, e gravis et pladior venit. Tam quo, et bono quorum glavans e funem. Quad rarendum habitatio quoque plorum fecundio, et quis gravis delerium. Versus esset in dolorum cognitio, travissimantor quantare sed quartu manifestum egreddior estum. Multum gravum et plurissimum parte brevens, non quo plorum in volcans essit. Pro linguens non trepicandor si nomen transit.";
generator.MinLen = 100;
generator.MaxLen = 500;
generator.Seed = 1;
GeneratedData = generator.GetNonNullEnumerator(Session);

// Mark as initialized
InitializedGenerators.Add(Generators.Text);
}

GeneratedData.MoveNext();
set(GeneratedData.Current);
}

I embedded python in my generator. Here's an example of the usage:

if table.row("IsOnHold") == True:
column.generatetext()
else:
column.set(None)

Hope this helps.

 

Top Rated

Acceptance Testing with FitNesse: Multiplicities and Comparisons
 FitNesse is one of the most popular tools for unit testing since it is designed with a Wiki-style... Read more...

Acceptance Testing with FitNesse: Symbols, Variables and Code-behind Styles
 Although FitNesse can be used as a generic automated testing tool for both applications and databases,... Read more...

Acceptance Testing with FitNesse: Documentation and Infrastructure
 FitNesse is a popular general-purpose wiki-based framework for writing acceptance tests for software... Read more...

TortoiseSVN and Subversion Cookbook Part 11: Subversion and Oracle
 It is only recently that the tools have existed to make source-control easy for database developers.... Read more...

TortoiseSVN and Subversion Cookbook Part 10: Extending the reach of Subversion
 Subversion provides a good way of source-controlling a database, but many operations are best done from... Read more...

Most Viewed

A Complete URL Rewriting Solution for ASP.NET 2.0
 Ever wondered whether it's possible to create neater URLS, free of bulky Query String parameters?... Read more...

Visual Studio Setup - projects and custom actions
 This article describes the kinds of custom actions that can be used in your Visual Studio setup project. Read more...

.NET Application Architecture: the Data Access Layer
 Find out how to design a robust data access layer for your .NET applications. Read more...

Calling Cross Domain Web Services in AJAX
 The latest craze for mashups involves making cross-domain calls to Web Services from APIs made publicly... Read more...

Web Parts in ASP.NET 2.0
 Most Web Parts implementations allow users to create a single portal page where they can personalize... Read more...

Why Join

Over 400,000 Microsoft professionals subscribe to the Simple-Talk technical journal. Join today, it's fast, simple, free and secure.