Joe Celko

VALUES() and Long Parameter Lists - Part II

28 October 2010

The use of a comma-separated list of parameters to a SQL routine, which Phil Factor calls the 'comedy-limited list', is a device that makes seasoned SQL database developers wince. The best choice of design for passing variable numbers of parameters or tuples to SQL routines varies according to the importance to you of SQL Standards. Joe Celko discusses the pros and cons of the front-runners.

In the first article of this series, we looked at some of the common ways that beginners to SQL mistakenly apply techniques that served them well in procedural languages. About once a month, there is a request posted to a newsgroup or forum asking how to pass a CSV (comma-separated values) list in a string as a parameter to a stored procedure. The posters miss their arrays and other variable-sized data structures and want to mimic them in SQL. They also have no idea what First Normal Form (1NF) and the Information Principle are, or why they are the foundations of an RDBMS.

Every time those of us who give advice on newsgroups and forums see this type of question, we refer the poster to the definitive articles at Erland Sommarskog's website:

http://www.sommarskog.se/arrays-in-sql-2005.html#manyparameters

http://www.sommarskog.se/arrays-in-sql-2008.html

The approaches to this problem can be classified as (1) iterative algorithms in T-SQL; (2) an auxiliary series table to replace looping; (3) non-SQL tools, such as XML and CLR routines; (4) the long parameter list; and, most recently, (5) table-valued parameters, or TVPs.

The iterative, XML and CLR solutions are purely procedural and are not really SQL. Aside from the religious objections to abandoning declarative programming, the practical grounds of increased maintenance costs and a lack of optimization make them look ugly. And I am too old to learn CLR languages.

The use of a series table in a single query and the table-valued parameter are declarative SQL. Of those two approaches, the table-valued parameter is the better one, but both are extra work.

As an aside, when Erland tested the long parameter list, he got a 19 ms execution time for 2000 elements, which was better than any other method that he tested. But when he ran the procedure from PHP, the total time jumped to over 550 ms, which was horrible. I have no idea what PHP is doing; I have had no such problems with other host languages.

All but the table-valued parameter approach require that you scrub the input string, if you want data integrity. If the data is already in a table, then you know it is good to go.

Data scrubbing is hard enough in T-SQL, but when you have to worry about external language syntax differences, it is a serious problem. Quickly, which languages accept '1E2' versus '1.0E2' versus 'E2.0' for floating point notation? Which ones cast a string like '123abc' to integer 123 and which blow up?
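As a quick illustration, here is how SQL Server itself treats a few of these strings; the results are SQL Server's own, and other host languages will disagree:

SELECT CAST ('1E2' AS FLOAT); -- 100
SELECT CAST ('1.0E2' AS FLOAT); -- 100
SELECT CAST ('123abc' AS INTEGER);
-- Msg 245: Conversion failed when converting the varchar value '123abc'
-- to data type int.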

The simplest answer is to use a long parameter list to construct lists and derived tables inside the procedure body. SQL Server can handle up to 2,100 parameters, which should be more than enough for practical purposes. SQL Server is actually a wimp in this regard; DB2 can pass 32K parameters, and Oracle can have 64K parameters. In practice, I have seen lists of 64 (chessboard), 81 (Sudoku solver) and 361 (Go board) simple parameters. Most of the time, 100 parameters is a very safe upper limit.

Part I of this series listed the main advantages of the long parameter lists. These are:

  1. The code is highly portable to any Standard SQL. It can take advantage of any improvements in the basic SQL engine.
  2. The optimizer will treat the members of the long parameter list like any other parameter. They can be sniffed. You can use a RECOMPILE option and get the best performance possible each time the procedure runs.

This point gets lost on people. It means that I get the same error messages, testing and logic as for any other parameter. When I use a home-made or external parser, I do not. Frankly, I have never seen anyone who used one of the other techniques write RAISERROR() calls to return the same error messages as the compiler does.
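To make the point concrete, here is a minimal sketch; the procedure name is mine, and I assume a PriceList table like the one used later in this article. The members of the list are ordinary parameters, so the usual sniffing and recompilation machinery applies:

CREATE PROCEDURE FilterBySku
(@in_sku_1 CHAR(15) = NULL,
 @in_sku_2 CHAR(15) = NULL)
AS
BEGIN
SELECT sku, list_price
  FROM PriceList
 WHERE sku IN (@in_sku_1, @in_sku_2)
OPTION (RECOMPILE); -- re-plan with the sniffed values on each call
END;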

The example in my first article was a procedure to create a basic Order header and its details in one procedure call. A skeleton for this insertion will use the price list and join it to the SKU codes on the order form. Here is the skeleton.

CREATE PROCEDURE OrderAndDetails
(@in_order_nbr CHAR(10),
 ..
 @in_sku_1 CHAR(15), @in_qty_1 INTEGER,
 @in_sku_2 CHAR(15), @in_qty_2 INTEGER,
 ..)
AS
BEGIN
/* order header skeleton */
INSERT INTO Orders (order_nbr, ..)
VALUES (@in_order_nbr, ..);

/* order details with price extension */
INSERT INTO OrderDetails
SELECT @in_order_nbr, OrderForm.sku, OrderForm.order_qty, P.list_price
  FROM (VALUES (@in_sku_1, @in_qty_1),
               (@in_sku_2, @in_qty_2),
                ..)
       AS OrderForm(sku, order_qty),
       PriceList AS P
 WHERE OrderForm.sku = P.sku;
END;
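The call is then just an ordinary procedure call. Here is a hypothetical example, assuming the elisions were filled in and two detail slots are used:

EXEC OrderAndDetails @in_order_nbr = 'ORD0042',
     @in_sku_1 = 'SKU-1', @in_qty_1 = 3,
     @in_sku_2 = 'SKU-2', @in_qty_2 = 1;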

Do you see the idea? An entire program is being built from the long parameter list in a single declarative statement that uses no special features. I have used this same pattern to create routines for adding subordinates to a nested set tree, given the root of the new subtree. Other options are to use "COALESCE (@in_qty_x, 1)" on each row, on the assumption that the customer wanted at least one item if he typed in a SKU, or to remove the SKUs that are NULL before the join to the price list:

(SELECT sku, order_qty
   FROM (VALUES (@in_sku_1, COALESCE(@in_qty_1, 1)),
                (@in_sku_2, COALESCE(@in_qty_2, 1)),
                  ..)
        AS X(sku, order_qty)
  WHERE sku IS NOT NULL)
 AS OrderForm(sku, order_qty)

But the real difference between the long parameter list and a home-made parser is in the exception handling. Here is a table of various inputs and what happens to them.

Legend:

  1. Msg 245, Level 16, State 1, Line x: Conversion failed when converting the varchar value <<string>> to data type int
  2. Msg 102, Level 15, State 1, Line x: Incorrect syntax near <<token>>

Notes:

  1. The CSV parser gives more serious errors, but the difference between level 15 and 16 is not much.
  2. CSV fails to do some valid conversions.
  3. Date testing has its own problems in CSV. The ISDATE() and the CAST() do not work the same way, so you need extra code in the CSV parser that does not depend on that function. Microsoft expects an ISO-8601 date in the string for the CAST(), but not for their proprietary ISDATE().

    SELECT ISDATE ('2010-01-01'); -- TRUE
    SELECT CAST ('2010-01-01' AS DATE); -- no problem

    SELECT ISDATE ('010-01-01'); -- TRUE
    SELECT CAST ('010-01-01' AS DATE);
    -- Msg 241, Level 16, State 1, Line 1
    -- Conversion failed when converting date and/or time from character string.

  4. OUTPUT parameters are impossible with the CSV parser:

    CREATE PROCEDURE ShowList
    (@p1 INTEGER = NULL,
     @p2 INTEGER = NULL,
     @p3 INTEGER = NULL,
     @p4 INTEGER = NULL,
     @p5 INTEGER = NULL OUTPUT)
    AS ..;
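    With the long parameter list, the OUTPUT parameter works in the usual way. Here is a hypothetical call, assuming the elided body assigns something to @p5:

    DECLARE @result INTEGER;
    EXEC ShowList 1, 2, 3, 4, @p5 = @result OUTPUT;
    SELECT @result;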


  5. The NULL should produce an empty table only if we prune out NULLs, and it does. But the empty string should not become a zero. This is a string conversion error as far as I can tell; at least it is consistent, though.
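    You can see the conversion for yourself:

    SELECT CAST ('' AS INTEGER); -- returns 0, not an error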
  6. You can pass local variables with the long parameter list, but not with CSV:

    DECLARE @local_int_1 INTEGER;
    DECLARE @local_int_2 INTEGER;

    SET @local_int_1 = 42;
    SET @local_int_2 = POWER (@local_int_1, 2);

    EXEC ShowList @local_int_1, @local_int_2;

A handy example of the technique is to insert a set of children under a parent in a Nested Set model. Let's assume we have a basic Nested Set tree table:

CREATE TABLE Tree
(node_name VARCHAR(15) NOT NULL,
 lft INTEGER NOT NULL CHECK (lft > 0) UNIQUE,
 rgt INTEGER NOT NULL CHECK (rgt > 0) UNIQUE,
 CHECK (lft < rgt));

Now add the root node to get things started:

INSERT INTO Tree VALUES ('Global', 1, 2);

The procedure shown here only does 10 children, but it can be easily extended to 100 or 1000 if needed.

CREATE PROCEDURE InsertChildrenIntoTree
(@in_root_node VARCHAR(15) = NULL,
 @in_child_01 VARCHAR(15) = NULL,
 @in_child_02 VARCHAR(15) = NULL,
 @in_child_03 VARCHAR(15) = NULL,
 @in_child_04 VARCHAR(15) = NULL,
 @in_child_05 VARCHAR(15) = NULL,
 @in_child_06 VARCHAR(15) = NULL,
 @in_child_07 VARCHAR(15) = NULL,
 @in_child_08 VARCHAR(15) = NULL,
 @in_child_09 VARCHAR(15) = NULL,
 @in_child_10 VARCHAR(15) = NULL)
AS
BEGIN
-- Find the parent node of the new subtree
DECLARE @local_parent_rgt INTEGER;
SET @local_parent_rgt
    = (SELECT rgt
         FROM Tree
        WHERE node_name = @in_root_node);

Put the children into kindergarten; I just had to be cute.

Notice that we can build complete rows with the VALUES() clause immediately without looping or other procedural code.

SELECT node_name,
       (lft + @local_parent_rgt - 1) AS lft,
       (rgt + @local_parent_rgt - 1) AS rgt
  INTO #local_kindergarten
  FROM (VALUES (@in_child_01, 1, 2),
               (@in_child_02, 3, 4),
               (@in_child_03, 5, 6),
               (@in_child_04, 7, 8),
               (@in_child_05, 9, 10),
               (@in_child_06, 11, 12),
               (@in_child_07, 13, 14),
               (@in_child_08, 15, 16),
               (@in_child_09, 17, 18),
               (@in_child_10, 19, 20))
       AS Kids (node_name, lft, rgt)
 WHERE node_name IS NOT NULL;

Use the size of the kindergarten to make a gap in the Tree:

UPDATE Tree
   SET lft = CASE WHEN lft > @local_parent_rgt
                  THEN lft + (2 * (SELECT COUNT(*) FROM #local_kindergarten))
                  ELSE lft END,
       rgt = CASE WHEN rgt >= @local_parent_rgt
                  THEN rgt + (2 * (SELECT COUNT(*) FROM #local_kindergarten))
                  ELSE rgt END
 WHERE lft > @local_parent_rgt
    OR rgt >= @local_parent_rgt;

  

Insert the kindergarten all at once, as a set:

INSERT INTO Tree (node_name, lft, rgt)
SELECT node_name, lft, rgt
  FROM #local_kindergarten;

/* check your work if you wish
SELECT node_name, lft, rgt FROM Tree;
*/
END;

Here are some tests you can run to see that it works. The sibling order is the same as the parameter list order. You could also add text edits and other validation and verification code to the statement that builds the kindergarten.

EXEC InsertChildrenIntoTree 'Global', 'USA', 'Canada', 'Europe', 'Asia';

EXEC InsertChildrenIntoTree 'USA', 'Texas', 'Georgia', 'Utah', 'New York', 'Maine', 'Alabama';
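A quick way to inspect the result is to dump the tree in left-to-right order; the siblings come back in the order they appeared in the parameter lists:

SELECT node_name, lft, rgt
  FROM Tree
 ORDER BY lft;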

Could I have done this with a table-valued parameter? Sure. But there are downsides to that method. The code would not port, because T-SQL does not agree with the other vendors here. The Microsoft model is that this is a type and not a structure. That is an important difference.

I have to declare the table as a type outside the procedure. In other SQL products, I would declare it in the parameter list, so I can see the structure. I then have to load it outside the procedure body and pass it in, where it behaves like a local table with the same structure inside the procedure body.

The skeleton for this pattern is:

Create the data type:

CREATE TYPE dbo.My_Table_Type AS TABLE
( .. );

Declare a table variable using the type:

DECLARE @in_my_table_parm My_Table_Type;

Load the table:

INSERT INTO @in_my_table_parm (..)
VALUES (..), (..), .. (..);

Pass the table to the procedure, finally:

EXEC SomeProc @my_table_parm = @in_my_table_parm, ..;

Remember that the parameter has to be declared READONLY, to prevent changing the rows of @in_my_table_parm in the procedure. You can get a quick introduction to the technique at http://www.sqlteam.com/article/sql-server-2008-table-valued-parameters.
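Here is a minimal worked sketch of the whole pattern; the type, procedure and column names are hypothetical:

CREATE TYPE dbo.Order_Line_Type AS TABLE
(sku CHAR(15) NOT NULL PRIMARY KEY,
 order_qty INTEGER NOT NULL CHECK (order_qty > 0));
GO

CREATE PROCEDURE LoadOrderLines (@in_order_lines dbo.Order_Line_Type READONLY)
AS
SELECT sku, order_qty FROM @in_order_lines;
GO

DECLARE @my_order_lines dbo.Order_Line_Type;
INSERT INTO @my_order_lines (sku, order_qty)
VALUES ('SKU-1', 3), ('SKU-2', 1);
EXEC LoadOrderLines @in_order_lines = @my_order_lines;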

One advantage of the table-valued parameter is that it is a full table, with constraints for keys and so forth. You check the data before it goes to the procedure, in a separate step, while the long parameter list checks the data as it goes to the procedure, in the EXEC statement.
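That separate step means the constraints fire at load time, before the procedure is ever called. Continuing the hypothetical type above:

DECLARE @bad_order_lines dbo.Order_Line_Type;
INSERT INTO @bad_order_lines (sku, order_qty)
VALUES ('SKU-3', 0);
-- Msg 547: the CHECK constraint rejects the zero quantity
-- before any procedure ever sees the row.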

Joe Celko

Author profile:

Joe Celko is one of the most widely read of all writers about SQL, and was the winner of the DBMS Magazine Reader's Choice Award four consecutive years. He is an independent consultant living in Austin, TX. He has taught SQL in the US, UK, the Nordic countries, South America and Africa.
He served 10 years on ANSI/ISO SQL Standards Committee and contributed to the SQL-89 and SQL-92 Standards.
He has written over 800 columns in the computer trade and academic press, mostly dealing with data and databases. He is the author of eight books on SQL for Morgan-Kaufmann, including the best-selling SQL FOR SMARTIES.
Joe is a well-known figure on newsgroups and forums, and he is famous for his dry wit. He is also interested in Science Fiction.




Subject: Sounds like a maintenance headache.
Posted by: Seven24
Posted on: Thursday, January 06, 2011 at 10:21 AM
Message: This solution strikes me as weak on maintenance. Suppose you allow for 100 parameters and then discover you need 150? That requires a developer to expand the parameters in the middle-tier and database code, go through unit testing and so on, and push a new release of the site. Why wouldn't you simply maintain a table of selected values and populate it in combination with something specific to the user, like a SessionID or user name? That allows for expansion or contraction of the parameter list without developer involvement every time the list count changes. Is it *possible* to have a stored procedure with 1000 parameters? Sure, but that strikes me as a maintenance headache. Just because you can do something does not mean you should.
 
