(Updated 14th Jan 2012, and again 26th Jan 2012)
Every so often, the question comes up on forums of how to pass a list as a parameter to a SQL
procedure or function. There was a time
that I used to love this question because one could spread so much happiness and gratitude by showing how to parse a comma-delimited
list into a table. Jeff Moden has probably won the laurels for the ultimate list-parsing TSQL function, the amazing ‘DelimitedSplit8K’.
Erland Sommarskog has
permanently nailed the topic with a very complete coverage of the various methods for using lists in SQL Server on his website.. With Table-Valued parameters, of course, the necessity for having any lists
in SQL Server is enormously reduced, though it still crops up.
Element-based XML seems, on the surface, to provide a built-in way of handling lists as parameters. No need for all
those ancillary functions for splitting lists into tables, one might think. Yes, indeed, but be careful of the XQuery
syntax that you use, as we'll see.
Let’s just take the very simplest example of taking a list of integers and turning it into a table
of integers. One can use a simple element-based XML list based on a fragment like this…
DECLARE @XMLlist XML
SELECT @XMLList
=
'<list><i>2</i><i>4</i><i>6</i><i>8</i><i>10</i><i>15</i><i>17</i><i>21</i></list>'
…to give a table like this ..
IDs
-----------
2
4
6
8
10
15
17
21
(8 row(s) affected)
..by using this TSQL expression
SELECT x.y.value('.','int')
AS IDs
FROM @XMLList.nodes('/list/i')
AS x ( y
)
It isn’t as intuitive as a simple comma-delimited list to be sure, but it is bearable. In small
doses, this shredding of element-based XML lists works fine, but,<added 26th January> using this particular syntax for getting the value of the element </added 26th January> one soon notices the sigh from the server as the length of the list
starts to increases into four figures. Try it on a list of a hundred thousand integers and you’ll have time to eat a
sandwich.
OK. Let’s do a few measurements and do a graph.

Jeff Moden’s Split function, scales beautifully as shown by the red line, and it is difficult to
measure it, it is so quick. The XML eShred technique by contrast, using the element-based list, and this XQuery syntax, exhibits horrible scaling. This sort of
curve is enough to strike terror into the developer. Just as a comparison, I did
the process of taking a list and turning it into a table by creating a VALUES expression from the list. Even this took
longer than Jeff’s function, presumably because of the overhead of compiling the expression. Of course, the comparison
is rather unfair because this third approach , the ‘Using Values Script’ approach, does no validation, but still one
wonders what on earth the XML expression is DOING all that time. It must be gazing out the window, reading the paper and
scratching itself, one assumes. No. Actually, the CPU is warming up on the task, so it involves frantic activity.

Operations involving XML can be startlingly fast in SQL Server, but this particular critter seems startlingly slow
once one gets to a three-figure number in the list that you’re turning into a relation that can then be used in SQL
Expressions. Imagine this getting loose in a production system when someone starts passing more values in the list and
the database suddenly slows down. How would you track the problem down?
<added 26th January>The big problem here is one of the expression. As pointed out by a reader of the blog, if you modify the expression slightly to
SELECT x.y.value('.','int')
AS IDs
FROM @XMLList.nodes('/list/i/text()')
AS x ( y
)
The shredding suddenly goes like a rocket</added 26th January>.
There is a second, and more wordy version of the simple XML list, The attribute-based list. Here
DECLARE @XMLlist XML
SELECT @XMLList = '<list><y i="2" /><y i="4" /><y i="6" /><y i="8" /><y i="10" /><y i="15" /><y i="17" /><y i="21" /></list>'
…to give exactly the same table ..
IDs
-----------
2
4
6
8
10
15
17
21
(8 row(s) affected)
by using a very similar syntax like this...
SELECT x.y.value('@i','int')
FROM @XMLList.nodes('list/y') AS x( y )
'Eh? What difference could this make?', you're wondering. Well, rather a lot, it turns out. Thanks to a reader comment on the first version of this blog
by wBob,
we can reveal that this version scales as well as Jeff's amazing 'split' function. I must admit to being rather
surprised at the difference in performance of the two. By using the Attribute-based list, or the modified XQuery expression, you can cheerfully pass lists
of substantial length to procedures without worrying, it seems. It reminded me of an argument I had some years ago with
a distinguished Database Developer who was advising us all not to use XML-based lists because of their performance
problems. I'd been using attribute-based lists simply because I copied Bob Beauchamin's example code in 'The
Book', and hadn't hit any problem. We couldn't convince each other. Perhaps we'd hit this performance difference.
<added 26th January>We can even improve on this. as Sviridov Konstantin has pointed out, you can tweak it to the point that it goes
around twice the speed of Jeff's TSQL split function by using syntax like this...
SELECT x.y.value('.','int')
FROM @XMLList.nodes('list/y/@i') AS x( y )
Which will give a performance comparison with the the TSQL split function. These modified XML shreds must be the fastest non-CLR
method of transferring list-based data as a variable in SQL.

</added 26th January>XML in
SQL Server seems to remind me of the old poem by Henry Wadsworth Longfellow. (1807-1882)
There was a little girl, who had a little curl
Right in the middle of her forehead,
And
when she was good, she was very, very good,
But when she was bad she was horrid.'.
Just as a comparison, here is the horrid XML performing against the first attribute-based XML Shred, and the TSQL split function

As always, one should measure everything you can, and take nothing for granted if you have to
deliver a scalable application.
Here is the simple code I used to do the metrics. The results were simply pasted into excel and
graphed.
--in order to record the timings, we prepare a log.
DECLARE @log TABLE (
Log_Id INT IDENTITY(1, 1),TheeVent VARCHAR( 2000 ),
[Values] INT, CPU FLOAT DEFAULT @@CPU_BUSY,
DateAndTime DATETIME DEFAULT GETDATE()) ;
--define working variables
DECLARE @List VARCHAR(MAX),@XMLList XML,@AttributeBasedXMLList XML,
@buildScript VARCHAR(MAX), @XMLbuildScript NVARCHAR(MAX),
@ii INT, @Values INT;
--and table variables to receive the results/relations
DECLARE @ByteBucket TABLE (TheNumeral INT)
DECLARE @ByteBucket2 TABLE (TheNumeral INT)
DECLARE @ByteBucket3 TABLE (TheNumeral INT)
DECLARE @ByteBucket4 TABLE (TheNumeral INT)
DECLARE @ByteBucket5 TABLE (TheNumeral INT)
DECLARE @ByteBucket6 TABLE (TheNumeral INT)
SET NOCOUNT ON
SELECT @Values=1 --start with one item in the list
WHILE @Values<=3000
BEGIN
--build up the list with random integers
SELECT @List='1',@ii=@Values
WHILE @ii>1
BEGIN
SELECT @List=@List+','+CONVERT(VARCHAR(3),CONVERT(INT,RAND()*100))
SELECT @ii=@ii-1 -- .. to the required length
END
--and pre-prepare the XML List and VALUES script
SELECT @XMLList='<list><i>'+REPLACE(@List,',','</i><i>')+'</i></list>'
SELECT @BuildScript='SELECT x FROM (values ('+REPLACE(@List,',','),(')+'))d(x)'
--and pre-prepare the XML List and VALUES script
SELECT @XMLBuildScript='SELECT @AttributeBasedXMLList = ( SELECT x AS "@x" FROM (values ('
+REPLACE(@List,',','),(')+'))d(x) FOR XML PATH(''y''), ROOT(''root''), TYPE )'
EXEC sp_executesql @XMLBuildScript, N'@AttributeBasedXMLList XML OUT',
@AttributeBasedXMLList OUT
--try doing the delimited list function
INSERT INTO @log (TheEvent, [Values]) SELECT 'Using Split function',@Values
INSERT INTO @ByteBucket (TheNumeral)
SELECT item FROM dbo.DelimitedSplit8K (@list,',')
--use the XML Shred trick
INSERT INTO @log (TheEvent, [Values]) SELECT 'Element-based XML',@Values
INSERT INTO @ByteBucket2 (TheNumeral)
SELECT x.y.value('.','int') AS IDs
FROM @XMLList.nodes('/list/i') AS x ( y )
--use the XML Shred trick
INSERT INTO @log (TheEvent, [Values]) SELECT 'Element-based XML (text())',@Values
INSERT INTO @ByteBucket6 (TheNumeral)
SELECT x.y.value('.','int') AS IDs
FROM @XMLList.nodes('/list/i/text()') AS x ( y )
--try the VALUES method
INSERT INTO @log (TheEvent, [Values]) SELECT 'Using Values script',@Values
INSERT INTO @ByteBucket3 (TheNumeral)
EXECUTE (@BuildScript)
--use the XML Shred trick
INSERT INTO @log (TheEvent, [Values]) SELECT 'Attribute-based XML',@Values
INSERT INTO @ByteBucket4 (TheNumeral)
SELECT x.y.value('@x','int')
FROM @AttributeBasedXMLList.nodes('root/y') AS x(y)
INSERT INTO @log (TheEvent, [Values]) SELECT 'finished',@Values
--use the XML Shred trick
INSERT INTO @log (TheEvent, [Values]) SELECT 'Attribute-based XML (@)',@Values
INSERT INTO @ByteBucket5 (TheNumeral)
SELECT x.y.value('.','int')
FROM @AttributeBasedXMLList.nodes('root/y/@x') AS x(y)
INSERT INTO @log (TheEvent, [Values]) SELECT 'finished',@Values
SELECT @Values=@Values+100
END
--Yes, we need to check that they all agree!
SELECT COUNT(*), SUM(TheNumeral) FROM @ByteBucket
UNION ALL
SELECT COUNT(*), SUM(TheNumeral) FROM @ByteBucket2
UNION ALL
SELECT COUNT(*), SUM(TheNumeral) FROM @ByteBucket3
UNION ALL
SELECT COUNT(*), SUM(TheNumeral) FROM @ByteBucket4
UNION ALL
SELECT COUNT(*), SUM(TheNumeral) FROM @ByteBucket5
UNION ALL
SELECT COUNT(*), SUM(TheNumeral) FROM @ByteBucket6
SELECT
TheStart.[Values],
MAX(CASE WHEN TheStart.TheeVent ='Using Split function'
THEN DATEDIFF( ms, TheStart.DateAndTime, Theend.DateAndTime )
ELSE 0
END) AS [Using Split function],
MAX(CASE WHEN TheStart.TheeVent ='Element-based XML'
THEN DATEDIFF( ms, TheStart.DateAnddTime, Theend.DateAndTime )
ELSE 0
END) AS [Element-based XML],
MAX(CASE WHEN TheStart.TheeVent ='Using Values script'
THEN DATEDIFF( ms, TheStart.DateAndTime, Theend.DateAndTime )
ELSE 0
END) AS [Using Values script],
MAX(CASE WHEN TheStart.TheeVent ='Attribute-based XML'
THEN DATEDIFF( ms, TheStart.DateAndTime, Theend.DateAndTime )
ELSE 0
END) AS [XML Attributes],*/
MAX(CASE WHEN TheStart.TheeVent ='Element-based XML (text())'
THEN DATEDIFF( ms, TheStart.DateAndTime, Theend.DateAndTime )
ELSE 0
END) AS [Element-based XML (text())],
MAX(CASE WHEN TheStart.TheeVent ='Attribute-based XML (@)'
THEN DATEDIFF( ms, TheStart.DateAndTime, Theend.DateAndTime )
ELSE 0
END) AS [Attribute-based XML (@)]
FROM @log TheStart
INNER JOIN @log Theend
ON Theend.Log_Id = TheStart.Log_Id + 1
WHERE TheStart.TheEvent<>'finished'
GROUP BY TheStart.[Values];
SELECT
TheStart.[Values],
MAX(CASE WHEN TheStart.TheeVent ='Using Split function'
THEN theEnd.cpu-TheStart.cpu
ELSE 0
END) AS [Using Split function],
MAX(CASE WHEN TheStart.TheeVent ='Element-based XML'
THEN theEnd.cpu-TheStart.cpu
ELSE 0
END) AS [Element-based XML],
MAX(CASE WHEN TheStart.TheeVent ='Using Values script'
THEN theEnd.cpu-TheStart.cpu
ELSE 0
END) AS [Using Values script],
MAX(CASE WHEN TheStart.TheeVent ='Attribute-based XML'
THEN theEnd.cpu-TheStart.cpu
ELSE 0
END) AS [XML Attributes],
MAX(CASE WHEN TheStart.TheeVent ='Element-based XML (text())'
THEN theEnd.cpu-TheStart.cpu
ELSE 0
END) AS [Element-based XML (text())],
MAX(CASE WHEN TheStart.TheeVent ='Attribute-based XML (@)'
THEN theEnd.cpu-TheStart.cpu
ELSE 0
END) AS [Attribute-based XML (@)]
FROM @log TheStart
INNER JOIN @log Theend
ON Theend.Log_Id = TheStart.Log_Id + 1
WHERE TheStart.TheEvent<> 'finished'
GROUP BY TheStart.[Values];