Using XML to pass lists as parameters in SQL Server

(Updated 14th Jan 2012, and again 26th Jan 2012)

Every so often, the question comes up on forums of how to pass a list as a parameter to a SQL procedure or function. There was a time that I used to love this question because one could spread so much happiness and gratitude by showing how to parse a comma-delimited list into a table. Jeff Moden has probably won the laurels for the ultimate list-parsing TSQL function, the amazing ‘DelimitedSplit8K’. Erland Sommarskog has permanently nailed the topic with a very complete coverage of the various methods for using lists in SQL Server on his website.. With Table-Valued parameters, of course, the necessity for having any lists in SQL Server is enormously reduced, though it still crops up.

Element-based XML seems, on the surface, to provide a built-in way of handling lists as parameters. No need for all those ancillary functions for splitting lists into tables, one might think. Yes, indeed, but be careful of the XQuery syntax that you use, as we’ll see.

 Let’s just take the very simplest example of taking a list of integers and turning it into a table of integers. One can use a simple element-based XML list based on a fragment like this…

…to give a table like this ..

..by using this TSQL expression

It isn’t as intuitive as a simple comma-delimited list to be sure, but it is bearable. In small doses, this shredding of element-based XML lists works fine, but,<added 26th January> using this particular syntax for getting the value of the element </added 26th January> one soon notices the sigh from the server as the length of the list starts to increases into four figures. Try it on a list of a hundred thousand integers and you’ll have time to eat a sandwich.

OK. Let’s do a few measurements and do a graph.

 TheFirstGraph

Jeff Moden’s Split function, scales beautifully as shown by the red line, and it is difficult to measure it, it is so quick. The XML eShred technique by contrast, using the element-based list, and this XQuery syntax, exhibits horrible scaling. This sort of curve is enough to strike terror into the developer. Just as a comparison, I did the process of taking a list and turning it into a table by creating a VALUES expression from the list. Even this took longer than Jeff’s function, presumably because of the overhead of compiling the expression. Of course, the comparison is rather unfair because this third approach , the ‘Using Values Script’ approach, does no validation, but still one wonders what on earth the XML expression is DOING all that time. It must be gazing out the window, reading the paper and scratching itself, one assumes. No. Actually, the CPU is warming up on the task, so it involves frantic activity.

TheSecondGraph

Operations involving XML can be startlingly fast in SQL Server, but this particular critter seems startlingly slow once one gets to a three-figure number in the list that you’re turning into a relation that can then be used in SQL Expressions. Imagine this getting loose in a production system when someone starts passing more values in the list and the database suddenly slows down. How would you track the problem down?

<added 26th January>The big problem here is one of the expression. As pointed out by a reader of the blog, if you modify the expression slightly to

The shredding suddenly goes like a rocket</added 26th January>.

There is a second, and more wordy version of the simple XML list, The attribute-based list. Here

…to give exactly the same table ..

by using a very similar syntax like this…

‘Eh? What difference could this make?’, you’re wondering. Well, rather a lot, it turns out. Thanks to a reader comment on the first version of this blog by wBob, we can reveal that this version scales as well as Jeff’s amazing ‘split’ function. I must admit to being rather surprised at the difference in performance of the two. By using the Attribute-based list, or the modified XQuery expression,  you can cheerfully pass lists of substantial length to procedures without worrying, it seems. It reminded me of an argument I had some years ago with a distinguished Database Developer who was advising us all not to use XML-based lists because of their performance problems. I’d been using attribute-based  lists simply because I copied Bob Beauchamin’s example code in ‘The Book’, and hadn’t hit any problem. We couldn’t convince each other. Perhaps we’d hit this performance difference.

<added 26th January>We can even improve on this. as Sviridov Konstantin has pointed out, you can tweak it to the point that it goes around twice the speed of Jeff’s TSQL split function by using syntax like this…

Which will give a performance comparison with the the TSQL split function. These modified XML shreds must be the fastest non-CLR method of  transferring list-based data as a variable in SQL.

FinalShootout

</added 26th January>XML in SQL Server seems to remind me of the old poem by Henry Wadsworth Longfellow. (1807-1882)

There was a little girl, who had a little curl
Right in the middle of her forehead,
And when she was good, she was very, very good,
But when she was bad she was horrid.’.

Just as a comparison, here is the horrid XML performing against the first attribute-based XML Shred, and the TSQL split function

TheThirdGraph

As always, one should measure everything you can, and take nothing for granted if you have to deliver a scalable application.

Here is the simple code I used to do the metrics. The results were simply pasted into excel and graphed.

  • 33964 views

  • Rate
    [Total: 2    Average: 4/5]