Click here to monitor SSC
  • Av rating:
  • Total votes: 160
  • Total comments: 5
Robert Sheldon

Implementing Foreach Looping Logic in SSIS

27 March 2012

With SSIS, it is possible to implement looping logic into SSIS's control flow in order to define a repeating workflow in a package for each member of a collection of objects. Rob Sheldon explains how to use this valuable feature of SSIS

One of the handiest features in SQL Server Integration Services (SSIS) is the ability to implement looping logic within your control flow. By using the Foreach Loop container, you can create a looping structure that iterates through a collection of objects or data values (the members) and take actions specific to each of those members. The Foreach Loop container lets you define a repeating workflow, similar to how you would define a Foreach looping structure in a procedural programming language.

For example, suppose you have a folder that contains a set of text files, and each of those files contains data you want to load into a SQL Server database. You can set up a Foreach Loop container in your control flow and then add a Data Flow task to the container. The Data Flow task will retrieve the data from the files and load the data into the database. Because you’ve set up a looping structure, the Foreach Loop container will execute the Data Flow task multiple times, once for each file in the folder.

And the Foreach Loop container is not limited to text files. The container lets you enumerate a variety of collection types, such as rows in a table, SQL Server Management Objects (SMO), the result set of an XML Path Language (XPath) expression, or schema information about a data source.

The best way to learn how to use the Foreach Loop container is to see it in action. To this end, the rest of the article demonstrates how to retrieve data from a single SQL Server view and insert that data into multiple comma-separated values (CSV) files. The source data comes from the vEmployee view in the AdventureWorks2008R2 database. The goal is to create a file for each technician type returned by the view. The files will include the technicians’ first and last names and be saved to a local folder.

NOTE: You can download the SSIS package used for this example from the speech bubble at the top of the article.

Adding Your Variables

When you use the Foreach Loop container to loop through a collection, you need to define a variable that will provide a pointer to the current member, as you loop through the collection. You can define that variable in advance or when you configure the Foreach Loop container. In this case, I create the variable in advance so it’s ready when I need it. I assign the name JobTitle to the variable and configure it with the String data type. For its value, I use a set of quotation marks to represent an empty string; however, you can specify any initial value. If you’re going to implement breakpoints and set up watches to monitor variable values when you run the package, then you might want to assign a meaningful value to the JobTitle variable to provide a better milepost during the iterative process.

Next, I create a variable named JobTitles to hold the collection itself. You do not always need to create a second variable. It depends on the collection type. In this case, because I’ll be retrieving data from a view, I need a variable to hold the result set returned by my query, and that variable must be configured with the Object data type. However, I don’t need to assign an initial value to the variable. The value System.Object is automatically inserted, as shown in Figure 1.

Adding the JobTitle and JobTitles variables to your SSIS package

Figure 1: Adding the JobTitle and JobTitles variables to your SSIS package

Because I created the variables at a package scope, they’ll be available to all components in my control flow. I could have waited to create the JobTitle variable until after I added the Foreach Loop container, then I could have configured the variable at the scope of the container. I’ve seen it done both ways, and I’ve done it both ways. Keep in mind, however, if you plan to use the variable outside of the Foreach Loop container, make sure it has a package scope.

Configuring Your Control Flow

The first step in configuring the control flow is to add a connection manager to the AdventureWorks2008R2 database. In this case, I create a connection to the database on a local instance of SQL Server 2008 R2 and then name the connection manager AdventureWorks2008R2.

Next, I add an Execute SQL task to my control flow in order to retrieve a list of job titles from the vEmployee view. After I add the task, I open the task’s editor and update the value of the ResultSet property to Full result set. I use this setting because the task will return a result set that contains data from the vEmployee view. I then specify the AdventureWorks2008R2 connection manager in the Connection property, and assign the following Transact-SQL statement to the SQLStatement property:

SELECT DISTINCT JobTitle
FROM HumanResources.vEmployee
WHERE JobTitle LIKE '%technician%'

My goal is to return a list of unique job titles that include the word technician. Figure 2 shows the General page of the Execute SQL Task editor after I add the Select statement.

Configuring the General page of the Execute SQL Task editor

Figure 2: Configuring the General page of the Execute SQL Task editor

Because the Execute SQL task has been set up to return a result set, you need some place to put those results. That’s where the JobTitles variable comes in. The task will pass the result set to the variable as an ADO object, which is why the variable has to be configured with the Object data type. The variable can then be used to provide those results to the Foreach Loop container.

So the next step in configuring the Execute SQL task is to map the JobTitles variable to the result set, which I do on the Result Set page of the Execute SQL Task editor, shown in Figure 3.

Configuring the Result Set page of the Execute SQL Task editor

Figure 3: Configuring the Result Set page of the Execute SQL Task editor

To create the mapping, I click Add and then specify the JobTitles variable in the first row of the VariableName column. Notice in the figure that I include the User namespace, followed by two colons. I then set the value in the Result Name column to 0.

That’s all you need to do to configure the Execute SQL task. The next step is to add a Foreach Loop container and connect the precedence constraint from the Execute SQL task to the container. Then you can configure the container. When doing so, you must select an enumerator type. The enumerator type indicates the type of collection you’re working with, such as files in a folder or rows in a table. In this case, because the result set is stored in the JobTitles variable as an ADO object, I select the Foreach ADO enumerator, as shown in Figure 4.

Configuring the Collection page of the Foreach Loop editor

Figure 4: Configuring the Collection page of the Foreach Loop editor

The Foreach ADO enumerator lets you access rows of data in a variable configured with the Object data type. So once I select the enumerator type, I select the JobTitles variable from the ADO object source variable drop-down list. As for the Enumeration mode option, I leave that at its default setting, Rows in the first table, because there’s only one table (with only one column).

After you configure the Foreach Loop container with the collection, you must create a variable mapping that tells the container which variable to use to store the individual member during each loop. You configure variable mappings on the Variable Mappings page of the Foreach Loop editor, as shown in Figure 5.

Configuring the Variable Mappings page of the Foreach Loop editor

Figure 5: Configuring the Variable Mappings page of the Foreach Loop editor

For my example, I create a mapping to the JobTitle variable. To do this, I select the variable from the drop-down list in the first row of the Variable column, and set the index to 0. I use 0 because my collection is taken from the first column of the result set stored in the JobTitles variable. If there were more columns, the number would depend on the column position. The positions are based on a 0 cardinality, so the first column requires a 0 value in the Index column. If my result set included four columns and I was using the third column, my Index value would be 2.

That’s all there is to setting up the Foreach Loop container for this example. After I complete the setup, I add a Data Flow task to the container. My control flow now looks similar to the one shown in Figure 6.

Setting up the control flow in your SSIS package

Figure 6: Setting up the control flow in your SSIS package

When you run the package, the Execute SQL task will retrieve a list of technician-related job titles and save that list to the JobTitles variable. The Foreach Loop container will iterate through the values in the variable. During each loop, the current job title will be saved to the JobTitle variable, and the container will execute any tasks or containers within the Foreach Loop container. In this case, it’s the Data Flow task. That means, for each technician-related job title, the Data Flow task will be executed. So let’s look at how to configure that task.

Configuring Your Data Flow

As you probably know, you edit the Data Flow task on the Data Flow tab of SSIS designer. For this example, I start by adding an OLE DB source component and opening its editor, as shown in Figure 7.

Setting up the control flow in your SSIS package

Figure 7: Configuring the OLE DB source

The first thing to notice is that I specify the AdventureWorks2008R2 connection manager in the OLE DB Connection manager drop-down list. I then select SQL Command from the Data access mode drop-down list and add the following Select statement to the SQL command text box:

SELECT FirstName, LastName, JobTitle
FROM HumanResources.vEmployee
WHERE JobTitle = ?

The statement retrieves employee data from the vEmployee view. Notice that the WHERE clause includes a parameter placeholder (?) to indicate that a parameter value should be passed into the clause. Because I’ve included the parameter, I must now map it to a variable that can provide the necessary value to the WHERE condition. In this case, that variable is JobTitle, which will contain the job title associated with the current iteration of the Foreach Loop container.

NOTE: The query actually need only retrieve data from the FirstName and LastName columns. However, I also included that JobTitle column simply as a way to verify that the data populating the CSV files is the correct data.

To map the parameter to the variable, click the Parameters button on the Connection Manager page of the OLE DB Source editor. The button is to the right of where you add your Select statement. This launches the Set Query Parameters dialog box, shown in Figure 8.

Mapping the JobTitle variable to the parameter in the SELECT statement

Figure 8: Mapping the JobTitle variable to the parameter in the SELECT statement

All I need to do to map the variable to the parameter is to select the variable from the drop-down list in the Variables column in the first row. Once this is done, I’ve completed configuring the OLE DB source and can now add my next component to the data flow: a Flat File destination.

There is nothing at all to configuring the destination itself. I simply add it to the data flow and connect the data flow path from the OLE DB source to the destination component. I then open the destination’s editor and specify that a new Flat File connection manager be created. This launches the Flat File Connection Manager editor, shown in Figure 9.

Configuring a Flat File connection manager

Figure 9: Configuring a Flat File connection manager

I stick with all the default settings for the connection manager, except that I add the following file path to the File Name text box: C:\DataFiles\JobTitle.csv. I then verify that the columns are mapped correctly (on the Columns page). Once I’ve configured the connection manager, my package is about ready to go—except for one important step.

The way the Flat File connection manager is currently configured, it will try to insert all data into the JobTitle.csv file. That means, each time the Foreach Loop container runs the Data Flow task, the job titles from the previous iteration will be overwritten, and the file will contain only those technicians with the job title associated with the final loop. However, one of the goals of this package is to create a file for each job title. That means we need to modify the Flat File connection manager by creating a property expression that changes the filename with each loop, based on the current value of the JobTitle variable.

The easiest way to create the property expression is to open the Properties pane for the connection manager and add a property expression, as shown in Figure 10.

Defining a property expression on your Flat File connection manager

Figure 10: Defining a property expression on your Flat File connection manager

To create a unique file with each loop, I define a property expression for the ConnectionString property. The expression itself concatenates the file path with the JobTitle variable and the .csv file extension:

"C:\\DataFiles\\" + @[User::JobTitle] + ".csv"

Notice that I have to escape the backslashes in the file path by using an additional backslash for each one. Now when I run the package, the current value in the JobTitle variable provides the actual file name when that file is saved to the target folder, thus creating a file for each job title. My data flow is now complete and looks similar to the one shown in Figure 11.

Setting up the data flow in your SSIS package

Figure 11: Setting up the data flow in your SSIS package

If you’ve been creating your own SSIS package as you’ve been working through this article, that package should now be ready to run. At this point, you might find it handy to add a breakpoint to the control flow so you can monitor the JobTitle variable as its value changes with each loop. If you do this, be sure to set the breakpoint on the Data Flow task, not the Foreach Loop container itself. The container runs only once, but the task runs multiple times, so that’s where you’ll see the variable value changing.

Otherwise, you should now have a good sense of how to implement the Foreach Loop container in an SSIS package. Note, however, because I was able to demonstrate only one of the enumerator types available to the container, I was unable to show you features specific to the other types. If you’d like to learn more about them, check out the topics “Foreach Loop Container” and “Foreach Loop Editor (Collection Page)” in SQL Server Books Online. But for now, what I’ve provided here should give you the basics you need to start implementing looping logic in your SSIS packages, regardless of the enumerator type used. And once you get those basics down, you’ll find that the Foreach Loop container is a valuable addition to your arsenal of SSIS tools.

Robert Sheldon

Author profile:

After being dropped 35 feet from a helicopter and spending the next year recovering, Robert Sheldon left the Colorado Rockies and emergency rescue work to pursue safer and less painful interests—thus his entry into the world of technology. He is now a technical consultant and the author of numerous books, articles, and training material related to Microsoft Windows, various relational database management systems, and business intelligence design and implementation. He has also written news stories, feature articles, restaurant reviews, legal summaries, and the novel 'Dancing the River Lightly'. You can find more information at http://www.rhsheldon.com.

Search for other articles by Robert Sheldon

Rate this article:   Avg rating: from a total of 160 votes.


Poor

OK

Good

Great

Must read
Have Your Say
Do you have an opinion on this article? Then add your comment below:
You must be logged in to post to this forum

Click here to log in.


Subject: Get Dynamic File Name
Posted by: Saurav (view profile)
Posted on: Saturday, September 15, 2012 at 6:42 AM
Message: it is great solution..now if i want when the flat gets created it should create different files basis the input value provided from the variable jobtile(used in the above solution)..
ex: exctive.txt , manager.text where jobtitle variable provides the values to final final name...

I tired to add the variable into dynamic expression connection but it does not work somehow'

Can u help thanks

Subject: Error in Implementing Foreach Looping Logic in SSIS
Posted by: joedion (view profile)
Posted on: Tuesday, December 25, 2012 at 9:21 PM
Message: Thanks for the post, trying to figure out these for each loop flow items and this is one of the best I've found so far.

However, I think there is an error in the JobTitle variable. I had to remove the "" in the value in order to make it work. Once I removed the double quotes, worked as expected.

Thanks again, was very helpful, even if I had to do some troubleshooting.

Subject: urgent
Posted by: sullyf50 (view profile)
Posted on: Wednesday, July 17, 2013 at 1:55 PM
Message: am having a bit of issue with this can you give me some advice pls

Subject: for ado loop in ssis
Posted by: sullyf50 (view profile)
Posted on: Thursday, July 18, 2013 at 12:39 PM
Message: I having an issue with this I need some help

its bringing the right table name but wrong record count for tablename when I run the sql

Subject: SSIS and EXEC sp_makewebtask
Posted by: metroman (view profile)
Posted on: Thursday, October 10, 2013 at 12:30 PM
Message: How do I integrate EXEC sp_makewebtask in the above process.I was able to extract N number of text extracts for N number of distinct values in a column as suggested in this article .But now I would like to have formatted output files(HTML or Excel).Can the author ,Robert Sheldon or anyone please assist in integrating EXEC sp_makewebtask in this process.

 

Phil Factor
Searching for Strings in SQL Server Databases

Sometimes, you just want to do a search in a SQL Server database as if you were using a search engine like Google.... Read more...

 View the blog

Top Rated

Searching for Strings in SQL Server Databases
 Sometimes, you just want to do a search in a SQL Server database as if you were using a search engine... Read more...

The SQL Server Sqlio Utility
 If, before deployment, you need to push the limits of your disk subsystem in order to determine whether... Read more...

The PoSh DBA - Reading and Filtering Errors
 DBAs regularly need to keep an eye on the error logs of all their SQL Servers, and the event logs of... Read more...

MySQL Compare: The Manual That Time Forgot, Part 1
 Although SQL Compare, for SQL Server, is one of Red Gate's best-known products, there are also 'sister'... Read more...

Highway to Database Recovery
 Discover the best backup and recovery articles on Simple-Talk, all in one place. Read more...

Most Viewed

Beginning SQL Server 2005 Reporting Services Part 1
 Steve Joubert begins an in-depth tour of SQL Server 2005 Reporting Services with a step-by-step guide... Read more...

Ten Common Database Design Mistakes
 If database design is done right, then the development, deployment and subsequent performance in... Read more...

SQL Server Index Basics
 Given the fundamental importance of indexes in databases, it always comes as a surprise how often the... Read more...

Reading and Writing Files in SQL Server using T-SQL
 SQL Server provides several "standard" techniques by which to read and write to files but, just... Read more...

Concatenating Row Values in Transact-SQL
 It is an interesting problem in Transact SQL, for which there are a number of solutions and... Read more...

Why Join

Over 400,000 Microsoft professionals subscribe to the Simple-Talk technical journal. Join today, it's fast, simple, free and secure.