Av rating:
Total votes: 66
Total comments: 26


Anith Sen
Faking Arrays in Transact SQL
16 September 2008

It is a simple routine that we all need to use occasionally; parsing a delimited list of strings in TSQL. In a perfect relational world, it isn't necessary, but real-world data often comes in a form that requires one of the surprising variety of  routines that Anith Sen describes, along with sage advice about their use.

Introduction

Sometimes SQL programmers come up with a requirement to use multi-valued columns or variables in an SQL query, especially when the data comes from somewhere else. Since there is no built-in support for any such datatype as an array or list  in t-SQL, quite often folks use delimited VARCHAR strings to lump multiple values together.

We all know that having multiple values in a single column is against the fundamental tenet of relational model since an attribute in a relation can have only one value drawn from an appropriate type.

SQL tables, by their very nature, do not allow multiple values in its columns. This is a sensible stricture which, even before the XML datatype appeared, programmers have occasionally been defeating with varying degrees of success. Users can interpret a single string as a combination of smaller strings usually delimited by a character such as  a comma or a space. Therefore, by leveraging the existing string functions, programmers can extract such smaller parts from the concatenated string.

It is very important to note that some of the methods that we'll describe are kludgy, while others appear to violate the fundamentals of data independence altogether. Such methods are only suggested here only for the sake of completeness, and are not recommended for use in production systems. If you are using any methods that are undocumented in the product manual, use them with due caution and all relevant caveats apply. Basic recommendations for using proper datatype conversion techniques, avoiding positional significance for columns etc must be considered for stable production code.

Considerations

Most of the methods of parsing an array and using it for data manipulation are used to insert data as multiple rows into tables. There are other occasions where the programmer has the task of cleaning and querying  badly-designed data represented in a table.

The following sections illustrate a variety of methods one can employ to identify and enlist subsections of a string represented in a variable, parameter or even as a column value in a table. The examples use a comma separated list, commonly known as a CSV string where the value is represented as:
           'item-1,item-2,item-3....item-n'.
In practice, you can use any character including a space to delimit and improvise the methods accordingly.

For the examples below, a few customer identifiers are randomly chosen from the Customers table in the Northwind database. For each example below, we'll use this list.

DECLARE @p VARCHAR(50)
SET @p = 'ALFKI,LILAS,PERIC,HUNGC,SAVEA,SPLIR,LONEP,GROSR'

Northwind is a sample database in SQL Server 2000 default installations. You can download a copy from the Microsoft Downloads

Direct Data comparison

For simple comparisons, there is no need for complicated routines. The inherent pattern matching features in Transact SQL can be used directly in most cases. One prime example is to pass in a list of values and use it in the IN list of a WHERE clause. Here are some common methods:

Using the CHARINDEX function:

 

SELECT CustomerID, ContactName, CompanyName
      
FROM Northwind.dbo.Customers
    
WHERE CHARINDEX( ',' + CustomerID + ',', ',' + @p + ',' ) > 0 ;
    
/*
    CustomerID ContactName                    CompanyName                              
    ---------- ------------------------------ ------------------------------
    ALFKI      Maria Anders                   Alfreds Futterkiste
    GROSR      Manuel Pereira                 GROSELLA-Restaurante
    HUNGC      Yoshi Latimer                  Hungry Coyote Import Store
    LILAS      Carlos González                LILA-Supermercado
    LONEP      Fran Wilson                    Lonesome Pine Restaurant
    PERIC      Guillermo Fernández            Pericles Comidas clásicas
    SAVEA      Jose Pavarotti                 Save-a-lot Markets
    SPLIR      Art Braunschweiger             Split Rail Beer & Ale

    (8 row(s) affected)*/

Using pattern matching with PATINDEX:

    SELECT CustomerID, ContactName, CompanyName
      
FROM Northwind.dbo.Customers
    
WHERE PATINDEX( '%,' + CustomerID + ',%', ',' + @p + ',' ) > 0   ;

Using LIKE operator for pattern matching

    SELECT CustomerID, ContactName, CompanyName
      
FROM Northwind.dbo.Customers
    
WHERE ',' + @p + ',' LIKE '%,' + CustomerID + ',%' ;

String Parsing

In many cases, you may want to use the parsed list of values as a resultset that can be used in subsequent operations. For instance, for larger lists, it may be more effective to use a JOIN rather using an IN() clause. Another common scenario is the case of multi-row inserts where the list is parsed and the individual elements are inserted using a single INSERT statement.

Using the PARSENAME() function

The PARSENAME function returns a SYSNAME value (equivalent to NVARCHAR(128) and is generally used for identifying part of a database object like object name, owner name, database name, and server name. However it can also be used for parsing a string if the string is made up of less than five delimited values. Typical scenarios for applying this approach would be splitting up a full name, identifying parts of an IP address etc.

For example:

DECLARE @FullName NVARCHAR(500)

SET @FullName = N'Flintstone, Mr.Fred, Jr'

SELECT PARSENAME(FName, 3) AS "Title",

       PARSENAME(FName, 2) AS "FirstName",

       PARSENAME(FName, 4) AS "LastName",

       PARSENAME(FName, 1) AS "Prefix"

  FROM ( SELECT REPLACE(@FullName, ', ', '.')) D(FName)

Solutions using a table of numbers

In most cases with larger strings, the faster solutions are often the ones using a table of sequentially incrementing numbers. However, the performance assertions in general should be taken with a grain of salt since, without testing, it is almost impossible to conclude which method performs better than another.  Before we go through the various solutions that use a table of numbers we'll run through a few of the approaches to creating a number table.

Creating a Number (Sequentially incrementing values) table in T-SQL

A table of monotonically increasing numbers can be created in a variety of ways. Either a base table or a view or any expression that can create a sequence of numbers can be used in these scenarios. Some of the common methods to create a table of numbers  are detailed in the next section.

Though sequentially incrementing values can be generated as a part of the query, generally, it is recommended that you create a permanent base table and insert as many numbers as you need for various solutions. It is also advised to make sure the number column is set as a Primary Key to avoid any potential duplication of rows.

Identity based sequences

Using the default values for an Identity Column in a table, one can simply insert default rows to generate the sequence. One consideration in this approach is that it can be used only with permanent base tables. Note that without any arguments IDENTITY property uses the value 1 for both seed and increment.

    CREATE TABLE dbo.Nbrs(n INT NOT NULL IDENTITY) ;
    
GO
    
SET NOCOUNT ON ;
    
INSERT dbo.Nbrs DEFAULT VALUES ;
    
WHILE SCOPE_IDENTITY() < 500
        
INSERT dbo.Nbrs DEFAULT VALUES ;

With the IDENTITY() function we can use SELECT INTO a new table from any existing table. This example demonstrates using a CROSS JOIN between two existing tables.

    SELECT TOP 500 IDENTITY(INT) AS n
      
INTO dbo.Nbrs
      
FROM Northwind.dbo.Orders o1
    
CROSS JOIN Northwind.dbo.Orders o2 ;
Regular loops using WHILE

Here is a simple procedural loop with inserting one row at a time.

CREATE TABLE dbo.Nbrs(n INT NOT NULL PRIMARY KEY) ;
    
GO
    
SET NOCOUNT ON
    DECLARE
@Index INT ;
    
SET @Index = 1 ;
    
WHILE @Index <= 500 BEGIN
        INSERT
dbo.Nbrs (n) VALUES (@Index) ;
        
SET @Index = @Index + 1 ;
    
END

The efficiency of the above loop ca be improved using the following suggestion from a newsgroup posting by Itzik Ben-Gen.

    SET NOCOUNT ON
    DECLARE
@Index AS INT;
    
SET @Index = 1

    INSERT dbo.Nbrs SELECT 1;
    
WHILE @Index * 2 < 500 BEGIN
        INSERT
dbo.Nbrs
        
SELECT @Index + n
    
      FROM dbo.Nbrs
        
SET @Index = @Index + @@ROWCOUNT
    
END
    INSERT
dbo.Nbrs
    
SELECT @Index + n
      
FROM dbo.Nbrs
    
WHERE @Index + n <= 500
Sequences based on existing tables

Using an existing base table to view can be an easy way to generate a sequence of numbers, esp. when you don’t want to create a permanent table to support a single querty. One drawback to such mechanisms is that as the dataset gets larger, the comparative efficiency of the inserts can become an issue.

Here is an example using a correlated subquery based on the unique column.

      SELECT n
      
FROM ( SELECT ( SELECT COUNT(*)
                    
FROM Northwind.dbo.Orders o2
            
        WHERE o2.OrderId <= o1.OrderId ) AS "n"

          FROM Northwind.dbo.Orders o1 ) Nbrs (n)
   
WHERE n <= 500 ;

This is the same logic as above, but using a self join

    SELECT COUNT(*) AS "n"       FROM Northwind.dbo.Orders o1
    
INNER JOIN Northwind.dbo.Orders o2
        
ON o2.OrderId <= o1.OrderId
    
GROUP BY o1.OrderId
    
HAVING COUNT(*) <= 500 ;

Another quick way to generate the sequence is to use the ROW_NUMBER() function. This example uses a CROSS JOIN just to illustrate the ease of formulation

    WITH Nbrs ( n ) AS (
      
SELECT ROW_NUMBER() OVER (ORDER BY n) AS n
        
FROM ( SELECT s1.id
                
FROM Northwind.dbo.Orders o1
                
CROSS JOIN Northwind.dbo.Orders o2 ) D ( n )
        )
SELECT n
            
FROM Nbrs
          
WHERE n BETWEEN 1 AND 500 ;
With ROW_NUMBER()

For each row returned in a resultset, ROW_NUMBER function returns a sequential number, starting at 1. One can use any existing table, view or any resultset to generate a ROW_NUMBER() expression. Here is an example:  

    SELECT n
      
FROM ( SELECT ROW_NUMBER() OVER ( ORDER BY OrderId )
              
FROM Northwind.dbo.Orders ) D ( n )
    
WHERE n <= 500 ;

As a matter of fact, you can use any arbitrary expression or built-in functions like CURRENT_TIMESTAMP or NEW_ID() in the ORDER BY clause to generate the sequential number.

Approaches using Common table expressions

Common table expressions are aliased resultsets derived from a simple query and defined within the execution scope of a single DML statement. A CTE, with an alias, optional column list and a definition query, can help simplify complex queries and improve readability. In general, any query can be written as a CTE query, but there are certain instances where a query CTE construct can offer the best performing query.

Using a CTE, some of the above methods for generating a sequence of numbers can be simplified.

Here is a simple transformation of derived table of unit digits. The logic is pretty much self explanatory.

     WITH Units ( nbr ) AS (
        
SELECT 0 UNION SELECT 1 UNION SELECT 2 UNION      
        SELECT
3 UNION SELECT 4 UNION SELECT 5 UNION            
        SELECT
6 UNION SELECT 7 UNION SELECT 8 UNION
        SELECT
9 )
    
SELECT u3.nbr * 100 + u2.nbr * 10 + u1.nbr + 1
      
FROM Units u1, Units u2, Units u3
    
WHERE u3.nbr * 100 + u2.nbr * 10 + u1.nbr + 1 <= 500 ;

A Recursive CTE is common table expression can include references to itself. With an anchor member and a recursive member, a recursive CTE can generate a number listing pretty efficiently.  Also note the default recursion limit can be changed by using the OPTION (MAXRECURSION) to the desired limit (0 denotes unlimited).

     WITH Nbrs ( n ) AS (
        
SELECT 1 UNION ALL
        
SELECT 1 + n FROM Nbrs WHERE n < 500 )
    
SELECT n FROM Nbrs
    
OPTION ( MAXRECURSION 500 )

Here is a clever method again by Itzik Ben-Gen using a CROSS JOIN of nested CTE references. This is not only quick but can be used for generating larger datasets.

     WITH Nbrs_3( n ) AS ( SELECT 1 UNION SELECT 0 ),
          
Nbrs_2( n ) AS ( SELECT 1 FROM Nbrs_3 n1 CROSS JOIN Nbrs_3 n2 ),
          
Nbrs_1( n ) AS ( SELECT 1 FROM Nbrs_2 n1 CROSS JOIN Nbrs_2 n2 ),
          
Nbrs_0( n ) AS ( SELECT 1 FROM Nbrs_1 n1 CROSS JOIN Nbrs_1 n2 ),
          
Nbrs  ( n ) AS ( SELECT 1 FROM Nbrs_0 n1 CROSS JOIN Nbrs_0 n2 )
        
SELECT n
          
FROM ( SELECT ROW_NUMBER() OVER (ORDER BY n)
                      
FROM Nbrs ) D ( n )
        
WHERE n <= 500 ;

Parsing a delimited string using a Number Table

Now we have our number table, we can start using it a number of different ways to split fake arrays. The general idea here is to use the sequence value as the second argument for the SUBSTRING() function and then use the WHERE clause to limit the number of splits using a terminating criteria. Note than this general logic can be written in a variety of ways.

    SELECT SUBSTRING( ',' + @p + ',', n + 1,
                
CHARINDEX( ',', ',' + @p + ',', n + 1 ) - n - 1 ) AS "value"
      
FROM Nbrs
    
WHERE SUBSTRING( ',' + @p + ',', n, 1 ) = ','
      
AND n < LEN( ',' + @p + ',' ) ;

Similar logic from a query in a public newsgroup posting by Linda Wiezbecki.

    SELECT SUBSTRING( p, n + 1,
                
CHARINDEX( ',', p, n + 1 ) - ( n + 1 ) ) AS "value"
      
FROM Nbrs
      
JOIN ( SELECT ',' + @p + ',' ) D ( p )
        
ON SUBSTRING( p, n, LEN( p ) ) LIKE ',_%' ;

Here are a couple of methods that not only parses the delimited list, but returns the position of each item in the list as well. This one is using a self join and is popularized by Joe Celko in SQL newsgroups. postings.

    SELECT SUBSTRING( ',' + @p + ',', MAX( n1.n + 1 ),  
                        
n2.n - MAX( n1.n + 1 ) ) AS "value"
          
COUNT( n2.n ) AS "pos"
      
FROM Nbrs n1
      
JOIN Nbrs n2 ON n1.n < n2.n
      
AND n2.n <= LEN( ',' + @p + ',' ) + 1
    
WHERE SUBSTRING( ',' + @p + ',', n1.n, 1 ) = ','
      
AND SUBSTRING( ',' + @p + ',', n2.n, 1 ) = ','
    
GROUP BY n2.n ;

The following is from a posting by Umachandar Jayachandran that uses the REPLACE function to get the number of commas in the string and calculate the position of each item value.

    SELECT SUBSTRING( p, n + 1, CHARINDEX( ',', p, n + 1 ) - n - 1 ) AS "value"
          
LEN( SUBSTRING( p, 1, CHARINDEX( ',', p, n + 1 ) ) ) -
          
LEN( REPLACE( SUBSTRING( p, 1,
                    
CHARINDEX( ',', p, n + 1 ) ), ',', '' ))  - 1 AS "pos"
      
FROM Nbrs
      
JOIN ( SELECT ',' + @p + ',' ) D ( p )
        
ON n BETWEEN 1 AND LEN( p ) - 1
      
AND SUBSTRING( p, n , 1 ) = ',' ;

The following method also gives you the positional value and is simpler

    SELECT SUBSTRING( @p, n, CHARINDEX( ',', @p + ',', n ) - n ) AS "value"
           n
+ 1 - LEN( REPLACE( LEFT( @p, n ), ',', '' ) ) AS "pos"
      
FROM Nbrs  
    
WHERE SUBSTRING(',' + @p, n, 1) = ','
      
AND n < LEN(@p) + 1 ;

With the ROW_NUMBER() and RANK() functions introduced in SQL 2005, one could create the positional values much easily like:

    SELECT SUBSTRING( ',' + @p + ',', n + 1,
                
CHARINDEX( ',', ',' + @p + ',', n + 1 ) - n - 1 ) AS "value"
           ROW_NUMBER
() OVER ( ORDER BY n ) AS "pos"
      
FROM Nbrs
    
WHERE SUBSTRING( ',' + @p + ',', n, 1 ) = ','
      
AND n < LEN( ',' + @p + ',' ) ;

You can wrap any of these methods using a table of sequentially incrementing numbers into a table valued UDF or another stored procedure and make it more generic, reusable and handy.

With Common Table Expressions

Most of the methods described above can be re-written using common table expressions. Here is one way of using it to get the starting position and ending position of each element item and then using SUBSTRING to extract the value:

    WITH CTE ( pos_begin, pos_end ) AS (
        
SELECT n1.n, ( SELECT MIN( n2.n )FROM Nbrs n2
                        
WHERE SUBSTRING( ',' + @p + ',' , n2.n , 1 ) = ','
                          
AND n2.n > n1.n )
          
FROM Nbrs n1
        
WHERE n1.n <= LEN( ',' + @p + ',' ) - 1
          
AND SUBSTRING( ',' + @p + ',' , n1.n , 1 ) = ',' )
    
SELECT SUBSTRING( @p , pos_begin, pos_end - pos_begin - 1 ) AS "Value"
      
FROM CTE ;

Another concise approach with Common table expressions is to use a recursive CTE. It does not use a table of sequentially incrementing numbers, however it can be a bit inefficient for larger strings.

Here is an example.

    WITH CTE ( pos, pos_begin, pos_end ) AS (
      
SELECT 0, 1, CHARINDEX( ',', @p + ',' )
        
UNION ALL
      
SELECT pos + 1, pos_end + 1, CHARINDEX( ',', @p + ',', pos_end + 1 )
        
FROM CTE
        
WHERE CHARINDEX( ',', @p + ',', pos_end + 1 ) > 0 )
    
SELECT pos + 1, SUBSTRING( @p, pos_begin , pos_end - pos_begin ) AS "value"
      
FROM CTE
    
OPTION ( MAXRECURSION 0 ) ;

A similar approach is suggested by Hari Mohan, a systems analyst with BFL software:

;WITH cte (pos_begin, pos_end) AS

(   SELECT 0, 1

     UNION ALL

    SELECT pos_end, charindex(',', @p, pos_end) + 1

      FROM cte

     WHERE pos_end > pos_begin )

SELECT SUBSTRING(@p, pos_begin,

            CASE WHEN pos_end > 1

                THEN pos_end - pos_begin - 1

                ELSE LEN(@p) - pos_begin + 1

            END ) AS "value",

       RANK() OVER ( ORDER BY pos_begin )    

  FROM cte

 WHERE pos_begin > 0

OPTION ( MAXRECURSION 0 ) ;

As mentioned before, any of the above mentioned parsing routines can be written as a table valued user defined function or a view and can be used directly in the queries like:

    SELECT CustomerID, ContactName, CompanyName
      
FROM Northwind.dbo.Customers c
      
JOIN dbo.udf_parsed_list () p
        
ON c.CustomerID = p.value ;

Methods that replace the delimiters

Apart from the above mentioned approaches there are certain other tricks which can be used in Transact SQL. Basically the following methods uses Dynamic SQL, a bit different from traditional queries, but can be used as an approach for smaller string parsing requirements in certain cases.

Here is an example using Dynamic SQL with IN list in the WHERE clause

    DECLARE @SQLx NVARCHAR(4000)
    
SET @SQLx = N'
    SELECT CustomerID, ContactName, CompanyName
      FROM Northwind.dbo.Customers  
     WHERE CustomerID IN ( '''
+ REPLACE( @p, N',', N''',''' ) + N''' )' ;
    
EXEC sp_ExecuteSQL @SQLx ;

This is another illustration of replacing the element delimiters with ' UNION SELECT '. The results can be stored in a table variable or table variable and then can be used in subsequent operations.

    DECLARE @tbl TABLE ( val VARCHAR(10) NOT NULL PRIMARY KEY );
    
DECLARE @SQLx VARCHAR(8000)
    
SET @SQLx = 'SELECT ''' + REPLACE( @p, ',', ''' UNION SELECT ''') + ''''
    
INSERT @tbl EXEC( @SQLx ) ;
    
SELECT val FROM @tbl ;
 

A similar approach is proposed by Alejandro Mesa in a public newsgroup posting that uses separate INSERT statements as well. Here is an example where @t is a valid table created prior to execution:

    DECLARE @SQLx VARCHAR(8000)
    
SET @SQLx= 'INSERT ' + @t + ' VALUES (' +
                  
REPLACE( @p, ',', ' ) INSERT ' + @t + ' VALUES (') + ')'
    
EXEC ( @SQLx ) ;

XML solutions

OPENXML function

An approach that is getting much attention, is the OPENXML method. This method, thought may not be ideal for larger datasets, can be effectively used for relatively small number of items, especially if the XML string is generated from a client application. The basic idea is to pass the values as an XML document to the stored procedure instead of a CSV. Here is an example. Note that you can use the REPLACE function to change a CSV string to XML format, say within a stored procedure for instance.

    DECLARE @doc VARCHAR(500)
    
DECLARE @XMLDoc INT
    SET
@doc = '
    <ROOT>
    <Customer pos="1" id="ALFKI"></Customer>
    <Customer pos="2" id="LILAS"></Customer>
    <Customer pos="3" id="PERIC"></Customer>
    <Customer pos="4" id="HUNGC"></Customer>
    <Customer pos="5" id="SAVEA"></Customer>
    <Customer pos="6" id="SPLIR"></Customer>
    <Customer pos="7" id="LONEP"></Customer>
    <Customer pos="8" id="GROSR"></Customer>
    </ROOT>'          

 EXEC sp_xml_preparedocument @XMLDoc OUTPUT, @doc ;
    
SELECT pos, id
      
FROM OPENXML ( @XMLDoc , '/ROOT/Customer', 1 )
              
WITH ( pos INT, id VARCHAR(5) )  ;
    
EXEC sp_xml_removedocument @XMLDoc ;
 

One could generate the xml document from the csv variable with only the id values easily like:

    SELECT '
    <ROOT>
    <Customer id="'
+
            
REPLACE( @p, ',', '"></Customer>
    <Customer id="'
) + '"></Customer>
    </ROOT>'

The nodes() method

This is a feature that was introduced in SQL Server 2005 using the xml datatype. You can use the nodes() method to shred the data that can be mapped to a new row. The value() method can be applied as shown below to extract the individual elements:

    DECLARE @doc XML;
    
SET @doc = '
    <ROOT>
    <Customer pos="1" id="ALFKI"></Customer>
    <Customer pos="2" id="62;</Customer>
    <Customer pos="3" id="PERIC"></Customer>
    <Customer pos="4" id="HUNGC"></Customer>
    <Customer pos="5" id="SAVEA"></Customer>
    <Customer pos="6" id="SPLIR"></Customer>
    <Customer pos="7" id="LONEP"></Customer>
    <Customer pos="8" id="GROSR"></Customer>
    </ROOT>'          
    
SELECT D.element.value('@id', 'VARCHAR(5)'),
          
D.element.value('@pos', 'INT')
      
FROM @doc.nodes('/ROOT/Customer') AS D ( element )
 

Using a CLR function

With the popularity of CLR functions in SQL 2005 and beyond, string parsing has become much more versatie. It is much easier to create a simple CLR function, deploy the assembly and then use it as a function within t-SQL code. To implement this function you will have to enable the database option CLR enabled.Here is a C# function example that you can create in Visual Studio under the database project :  

using System;

using System.Data;

using System.Data.SqlClient;

using System.Data.SqlTypes;

using Microsoft.SqlServer.Server;

using System.Collections;

 

public class ParseList

{

    [SqlFunction(FillRowMethodName = "FillRow")]

 

    public static IEnumerable CLRParseString(SqlString csv)

    {

        SqlString Delim_ = ",";

        return csv.Value.Split(Delim_.Value.ToCharArray(0, 1));

    }

 

    public static void FillRow(object value, out string csv)

    {

        csv = (string)value;

    }

}

Once you build this function and created the .dll you can create an assembly and a table valued function that can be accessed from within t-SQL

CREATE ASSEMBLY ParseList FROM 'C:\ParseList.dll'

WITH PERMISSION_SET = SAFE

GO

CREATE FUNCTION CLRParseString ( @str NVARCHAR(4000))

RETURNS

TABLE ( val NVARCHAR(4000) )

AS EXTERNAL NAME ParseList.ParseList.CLRParseString;

GO

The usage is pretty straight forward like:

SELECT * FROM CLRParseString( @p )

You can find several comparable CLR functions in newsgroups and blogs with heavy discussions on the performance implications of using them. For some of the popular benchmarks I would recommend the article by Erland mentioned in the references section.

Simple WHILE loop

Another popular method is to use a procedural WHILE loop. Very popular among programmers by virtue of its simplicity, this method is not particularly efficient for larger datasets. Here is an example:

    WHILE LEN( @param ) > 0 BEGIN
       IF CHARINDEX
( ',', @param ) > 0
          
SELECT @val = LEFT( @param, CHARINDEX( ',', @param )  - 1 ) ,
                  
@param = RIGHT( @param, LEN( @param ) - CHARINDEX( ',', @param ) )
      
ELSE
           SELECT
@val = @param, @param = SPACE(0)
      
EXEC('INSERT tbl VALUES (' + @val + ')' )
    
END

Inserting parsed values into multiple columns

When the passed values are to be returned as column values in a single row rather than multiple rows, we can use the PIVOT function. Since we will always know the number of columns that will be in the resultset, formulating the query with a static PIVOT list will not be too difficult.

SELECT R.*

  FROM ( SELECT SUBSTRING( ',' + @p + ',', n + 1,

                CHARINDEX( ',', ',' + @p + ',', n + 1 ) - n - 1 ) AS "value",

                ROW_NUMBER() OVER ( ORDER BY n ) AS "pos"

           FROM Nbrs

          WHERE SUBSTRING( ',' + @p + ',', n, 1 ) = ','

            AND LEN( ',' + @p + ',' ) > n ) AS  parsed

PIVOT ( MAX("value") FOR pos IN ("1", "2", "3", "4", "5", "6", "7", "8") ) R

 

Conclusion

Be judicious and apply common sense while using any of the string parsing routines in t-SQL. Each of the methods described above has certain benefits in some scenarios and may not be appropriate for all situations. If you need additional detailed analysis of these and a few other methods including performance considerations, consider the articles in the references section.

References

Acknowledgements

Aaron Bertrand, Erland Sommarskog, Umachandar Jayachandran, Linda Wierzbecki, Joe Celko, Alejandro Mesa, Kristen Hodges and Itzik Ben-Gan.



This article has been viewed 12120 times.
Anith Sen

Author profile: Anith Sen

Anith S Larson is an independent consultant specializing in data management primarily using SQL Server. From the mid 90s, he has been working on a variety of database design and application developement projects for clients primarily in wealth management and financial services industry. Besides having a few IT certifications and a Masters degree in Computer Applications, Anith is currently working towards his MBA. He resides in Lakeland, TN with his family.

Search for other articles by Anith Sen

Rate this article:   Avg rating: from a total of 66 votes.


Poor

OK

Good

Great

Must read
 
Have Your Say
Do you have an opinion on this article? Then add your comment below:


Subject: Great Article - Thanks
Posted by: jerryol (view profile)
Posted on: Wednesday, September 17, 2008 at 12:26 AM
Message: Great article man, thanks.

Subject: Impressive!
Posted by: KeithFletcher (view profile)
Posted on: Wednesday, September 17, 2008 at 3:19 AM
Message: Thanks for this! I use one or two of the methods you've mentioned here, a lot more often than I'd like to admit, so it's nice to see a few extra alternatives.

Never thought to going the XMl route...

great article

Subject: Excellent
Posted by: Viswanath (not signed in)
Posted on: Wednesday, September 17, 2008 at 7:52 AM
Message: Excellent Study

Subject: Great article!
Posted by: Anonymous (not signed in)
Posted on: Wednesday, September 17, 2008 at 8:17 AM
Message: Thanks for the article and I just know that this information will be kept in my bag of tricks.

Subject: Sweet!
Posted by: Chad (view profile)
Posted on: Wednesday, September 17, 2008 at 9:44 AM
Message: There are several methods here I hadn't thought of!

Subject: Do you have a link to Itzik's newsgroup post?
Posted by: Trevor (not signed in)
Posted on: Wednesday, September 17, 2008 at 9:49 AM
Message: That you reference in the index + n section: "The efficiency of the above loop ca be improved using the following suggestion from a newsgroup posting by Itzik Ben-Gen."

Subject: Re: Do you have a link to Itzik's newsgroup post?
Posted by: Anith Sen (not signed in)
Posted on: Wednesday, September 17, 2008 at 10:16 AM
Message: Trevor,

He has posted it several times, here is one:
http://groups.google.com/group/microsoft.public.sqlserver.programming/msg/cc9de24464c167fc

Anith

Subject: Using a UDF to parse delimited text to a table
Posted by: Anonymous (not signed in)
Posted on: Wednesday, September 17, 2008 at 10:17 AM
Message: I've been using this function for quite some time now. Works great.

http://www.codeproject.com/KB/database/SQL_UDF_to_Parse_a_String.aspx

Subject: What about xml?
Posted by: AJ Tech (not signed in)
Posted on: Wednesday, September 17, 2008 at 10:39 AM
Message: I would use xml. Code is simpler and more robust in a sense that I can validate it against a schema.

Subject: My current version
Posted by: Jason Hannas (view profile)
Posted on: Wednesday, September 17, 2008 at 12:02 PM
Message: I've been seeing a lot of these types of articles/postings lately. A few months ago, I took several ideas from the best of these articles and put together a function that's working very well for me. I don't take credit--the basic concept is someone else's idea (but I just can't remember who/where I learned it). It requires a numbers table, although you could use another method of generating a temporary sequence if you wanted.

CREATE FUNCTION dbo.fnListToTable
(
@List varchar(max),
@delimiter char(1)
)
RETURNS TABLE
AS

RETURN
( SELECT
ROW_NUMBER() OVER(ORDER BY Number) AS [Element],
SUBSTRING(List,Number + 1,CHARINDEX(@delimiter,List,Number + 1) - Number - 1) AS [Value]
FROM
( SELECT
@delimiter + @List + @delimiter AS List ) AS l
CROSS JOIN admin.tbl_Numbers AS n
WHERE
Number < LEN(List)
AND SUBSTRING(List,Number,1) = @delimiter
) ;


Subject: Use With Caution
Posted by: wbw (not signed in)
Posted on: Wednesday, September 17, 2008 at 8:41 PM
Message: I've used both the XML route and the CHARINDEX, primarily for input parameters that can be multivalued. It simplifies the capability to handle flexible data input, but potentially complicates the front-end code (i.e. non database code), especially the XML route.

Additionally, these types of fancy, "slick" and very non-standard fixes to complicated problems tend to cause nothing but confusion down the road.

And a final caveat: in the wrong hands, even the best intentions can go awry. As the author states, use sparingly.

Subject: great
Posted by: Anonymous (not signed in)
Posted on: Thursday, September 18, 2008 at 5:17 AM
Message: There are several methods here I hadn't thought of
and if you can give us more ! thanks

Subject: Jason Hannas version is best, I think
Posted by: Anonymous (not signed in)
Posted on: Thursday, September 18, 2008 at 9:06 AM
Message: I think the Jason Hannas post is close to the version found in Ben-gan book (Inside Microsoft SQL Server 2005: T-SQL Programming). It makes really clever use of the clustered index on the numbers table, and the performance is a million times better than any of the looping / xml versions. I no longer have the book, but I believe most of a chapter is dedicated to just this topic. Pick it up, the book is well worth it.

Subject: The best article on the subject i've ever seen.
Posted by: Theo Ekelmans (not signed in)
Posted on: Wednesday, September 24, 2008 at 2:49 AM
Message: nuff sdaid.

Subject: I have used some in practice and tested a few others
Posted by: Anonymous (not signed in)
Posted on: Wednesday, September 24, 2008 at 3:36 AM
Message: And I found that everything based on number tables (especially dynamicly generated ones) perform really bad when the string to be parsed increases.

Best all round is contrary to what the article states, the procedural loop that returns a table (which usualy needs to be indexed for best performance)

Also, I like to point out that the version of this listed in this article is about as bad (performance wise) as you can program it. There is TOO much data being moved around, like the redefinition of the input variable/parameter.

It is better to use charindex in combination with sunstrings and insert these substrings into the result table.

If you need to access a specific position in your delimited list, make a seperate function for that, that too will perform tons better then generating a whole table only to access just 1 item in the list.

I think the article shows a lot of fancy SQL that in practice when found in production..people will have a hard time understanding of what it does, how it does it and most of all...what the intensions behind it are if not instructed beforehand.

Functions, in whatever form will always be the most natural self describing from this point of view and should be preferred above all else!


Subject: have used some in practice and tested a few others (2)
Posted by: Anonymous (not signed in)
Posted on: