Click here to monitor SSC
  • Av rating:
  • Total votes: 60
  • Total comments: 3
Robert Sheldon

TSQL User-Defined Functions: Ten Questions You Were Too Shy To Ask

19 May 2014

SQL Server User-Defined Functions are good to use in most circumstances, but there just a few questions that rarely get asked on the forums. It is a shame, because the answers to them tend to clear up some ingrained misconceptions about functions that can lead to problems, particularly with locking and performance

The Questions

  1. “I’m creating a scalar function that will be used against a column that contains multiple NULL values. Is there a way to stop the function from being executed when a NULL is passed in as a parameter value?”
  2. “I created a table-valued function that I want to apply to a column in each row returned by the query. However, when I try to join the function to the target table in the query’s FROM clause, I receive an error. Is there an effective way to invoke the function for each row in the result set?”
  3. “I’ve heard you can use a function to parameterize a view, but I don’t see how you can incorporate a function into a view definition in such a way to support parameters. Can you explain how that is done?”
  4. “I’m creating a user-defined function and want to specify a default value for an input parameter. Is that possible?”
  5. “I created a scalar function that’s used in multiple queries, some of which can return millions of rows. Ever since I incorporated the function into the queries, performance has nosedived. Are there steps we can take to fix this?”
  6. “I want to call the GETDATE system function from within a user-defined function, but I’ve read you cannot do this. Is there a workaround that lets me use the GETDATE function?”
  7. “I want to execute a stored procedure from within a user-defined function, but I keep running into errors in the syntax. How do I call a stored procedure from within a function?”
  8. “When creating user-defined functions, I like to include the SCHEMABINDING option, but in some cases, I receive an error when creating the function, saying that I cannot schema bind the function. Any idea what’s going on?”
  9. “I’ve seen user-defined functions called in different ways in code. Is there a proper way to call a function?”
  10. “I’m working on a query whose T-SQL code I want to encapsulate and parameterize. The query will aggregate a subset of values based on an input parameter and return the aggregated value as a column in the query results. Should I create a stored procedure or a user-defined function?”

“I’m creating a scalar function that will be used against a column that contains multiple NULL values. Is there a way to stop the function from being executed when a NULL is passed in as a parameter value?”

Yes. In fact, doing so is actually a very straightforward process, at least for scalar functions. You simply include the RETURNS NULL ON NULL INPUT option in your WITH clause, as shown in the following example (option highlighted):

USE AdventureWorks2012;

GO

IF OBJECT_ID(N'dbo.fnGetTotalItems', N'FN') IS NOT NULL

DROP FUNCTION dbo.fnGetTotalItems;

GO

CREATE FUNCTION dbo.fnGetTotalItems (@OrderID INT)

RETURNS INT

WITH RETURNS NULL ON NULL INPUT,

  SCHEMABINDING AS

BEGIN

  DECLARE @TotalItems INT

  SELECT @TotalItems = SUM(OrderQty)

  FROM Sales.SalesOrderDetail

  WHERE SalesOrderID = @OrderID

  GROUP BY SalesOrderID

  RETURN @TotalItems;

END;

GO

By default, when you call a scalar function, the database engine executes the function body whether or not a NULL is passed in as a parameter value. However, by including the RETURNS NULL ON NULL INPUT option, the database engine will not execute the function body when a NULL value is passed in. For example, the following SELECT statement returns a NULL value:

SELECT dbo.fnGetTotalItems(null);

When the database engine sees the NULL input value, it simply returns NULL without processing the function body. If your function supports multiple input parameters, the database engine returns NULL if NULL is passed into any one of those parameters and does not execute the function body.

However, you cannot use the RETURNS NULL ON NULL INPUT option for a table-valued function. Because a table-value function returns a full resultset (table), it is possible for the function to return data even if a parameter value is NULL. As such, the option cannot be practically applied to a table-valued function.

“I created a table-valued function that I want to apply to a column in each row returned by the query. However, when I try to join the function to the target table in the query’s FROM clause, I receive an error. Is there an effective way to invoke the function for each row in the result set?”

There is as long as you’re using SQL Server 2005 or later. Starting with SQL Server 2005, you’ve been able to use the APPLY operator to join one or more tables to a table-valued function in order to invoke that function against each row in the resultset. Prior to SQL Server 2005, you had to come up with a complex workaround to achieve this.

The best way to understand how the operator works is to look at an example. The following T-SQL creates a table-valued function that returns the total number of items sold for each sale listed in the SalesOrderDetail table of the AdventureWorks2012 database:

USE AdventureWorks2012;

GO

IF OBJECT_ID(N'dbo.ifGetTotalItems', N'IF') IS NOT NULL

DROP FUNCTION dbo.ifGetTotalItems;

GO

CREATE FUNCTION dbo.ifGetTotalItems (@OrderID INT)

RETURNS TABLE

WITH SCHEMABINDING

AS

RETURN

(

  SELECT SUM(OrderQty) AS TotalItems

  FROM Sales.SalesOrderDetail

  WHERE SalesOrderID = @OrderID

  GROUP BY SalesOrderID

);

The function takes as an argument the SalesOrderID value that identifies the sale. In this case, the function returns only a single row with one column, but you can just as easily create a function that returns multiple rows. The simplest way to verify that the function is working as you expect is to run a simple SELECT statement that calls the function and passes in a SalesOrderID value:

SELECT TotalItems FROM dbo.ifGetTotalItems(43659);

For this example, we pass in 43659 the parameter value. The function then returns 26. However, all we’ve so far is to demonstrate that the function works as we expect. Let’s create a more complex SELECT statement that applies the function to each row returned from the SalesOrderHeader table. That’s where the APPLY operator comes in, as shown in the following example:

SELECT s.SalesOrderID, s.OrderDate, s.SalesPersonID,

  f.TotalItems

FROM Sales.SalesOrderHeader s

  CROSS APPLY dbo.ifGetTotalItems(s.SalesOrderID) f

ORDER BY SalesOrderID;

Notice that after we specify the SalesOrderHeader table in the FROM clause, we then include the CROSS APPLY keywords, following by the function, with the SalesOrderID column passed in as the parameter value.

The APPLY operator takes two forms: CROSS APPLY and OUTER APPLY. The CROSS APPLY combination returns rows from the primary table (in this case, SalesOrderHeader) only if they produce a result set from the table-valued function. The OUTER APPLY combination returns all rows from the primary table. In this case, both forms of APPLY return the same number of rows. The following table shows a partial list of the results returned by this query:

SalesOrderID

OrderDate

SalesPersonID

TotalItems

43659

2005-07-01 00:00:00.000

279

26

43660

2005-07-01 00:00:00.000

279

2

43661

2005-07-01 00:00:00.000

282

38

43662

2005-07-01 00:00:00.000

282

54

43663

2005-07-01 00:00:00.000

276

1

43664

2005-07-01 00:00:00.000

280

14

43665

2005-07-01 00:00:00.000

283

20

43666

2005-07-01 00:00:00.000

276

7

43667

2005-07-01 00:00:00.000

277

6

43668

2005-07-01 00:00:00.000

282

93

43669

2005-07-01 00:00:00.000

283

1

43670

2005-07-01 00:00:00.000

275

6

43671

2005-07-01 00:00:00.000

283

17

43672

2005-07-01 00:00:00.000

282

9

43673

2005-07-01 00:00:00.000

275

20

As you can see, the results include the TotalItems column, which is the number of items associated with that sale. In other words, we were able to apply the table-valued function to each row returned from the SalesOrderHeader table. If our function had returned multiple rows, the result set would have included that number of rows for each row returned by the function. For example, if the function always returned three rows for each SalesOrderID value, our resultset would include three times the number of rows than it currently does.

“I’ve heard you can use a function to parameterize a view, but I don’t see how you can incorporate a function into a view definition in such a way to support parameters. Can you explain how that is done?”

Using a function to parameterize a view has little to do with the view definition itself. It merely means you’re creating a function that either duplicates the logic of the view or calls the view within the function. In either case, you use a parameter to qualify the function’s SELECT statement.

For example, suppose we create the following view to retrieve data about the number of employees per job title:

USE AdventureWorks2012;

GO

IF OBJECT_ID(N'dbo.JobData', N'V') IS NOT NULL

DROP VIEW dbo.JobData;

GO

CREATE VIEW dbo.JobData

WITH SCHEMABINDING

AS

  SELECT JobTitle, COUNT(*) AS TotalEmps

  FROM HumanResources.Employee

  GROUP BY JobTitle;

GO

As you can see, the view is very straightforward. The SELECT statement groups the data by the JobTitle column and retrieves a count for each group. You can, of course, create a view that is far more complex than this one, but what we’ve done here is enough to demonstrate how this all works.

Once you’ve created the view, you can test it by running a simple SELECT statement, similar to the following:

SELECT * FROM dbo.JobData;

Not surprisingly, the statement returns all rows and columns returned by the view. The following table provides a partial list of those results.

JobTitle

TotalEmps

Accountant

2

Accounts Manager

1

Accounts Payable Specialist

2

Accounts Receivable Specialist

3

Application Specialist

4

Assistant to the Chief Financial Officer

1

Benefits Specialist

1

Buyer

9

Chief Executive Officer

1

Chief Financial Officer

1

Control Specialist

2

Database Administrator

2

Design Engineer

3

Document Control Assistant

2

Document Control Manager

1

As with any SELECT statement that retrieves data from a view, we can further refine our SELECT statement by including the logic necessary to return the results we need, as shown in the following example:

SELECT TotalEmps FROM dbo.JobData

WHERE JobTitle = 'Buyer';

In this case, we’ve merely specified a column in the SELECT list and added a WHERE clause that limits the results to those rows in which the JobTitle value equals Buyer. Now the statement returns only a value of 9 because that’s how many employees have that title.

Rather than creating a view and then qualifying the SELECT statements that call the view, we can instead create a table-valued function that incorporates the view’s logic, but also provides the ability to qualify the results through a parameter, as shown in the following example:

USE AdventureWorks2012;

GO

IF OBJECT_ID(N'dbo.GetJobData', N'IF') IS NOT NULL

DROP FUNCTION dbo.GetJobData;

GO

CREATE FUNCTION dbo.GetJobData (@title NVARCHAR(25))

RETURNS TABLE

WITH SCHEMABINDING

AS

RETURN

(

  SELECT COUNT(*) AS TotalEmps

  FROM HumanResources.Employee

  WHERE JobTitle = @title

  GROUP BY JobTitle

);

GO

Notice that the function’s SELECT statement is similar to that of the view’s except that we also include a WHERE clause that compares the JobTitle column to the @title input parameter. When you call the function, you simply pass in the job title as an argument:

SELECT * FROM dbo.GetJobData('Buyer');

As to be expected, the SELECT statement returns a value of 9. Chances are, however, you’ll want to incorporate the function in a more complex query. In the following example, we use the APPLY operator to join the Person and Employee tables to the GetJobData function:

SELECT

  p.FirstName + ' ' + p.LastName AS FullName,

  e.JobTitle,

  f.TotalEmps

FROM

  HumanResources.Employee e

    INNER JOIN Person.Person p

      ON e.BusinessEntityID = p.BusinessEntityID

    CROSS APPLY dbo.GetJobData(e.JobTitle) f;

For each row returned from the joined Person and Employee tables, the GetJobData function is applied to that row, based on the value of the JobTitle column. The results will then include a TotalEmps column, which will provide the total number of employees who share the same title as the person listed in that row. The following table provides a partial list of employees, their job titles, and the number of people who share that title.

FullName

JobTitle

TotalEmps

Ken Sánchez

Chief Executive Officer

1

Roberto Tamburello

Engineering Manager

1

Rob Walters

Senior Tool Designer

2

Gail Erickson

Design Engineer

3

Jossef Goldberg

Design Engineer

3

Ovidiu Cracium

Senior Tool Designer

2

Thierry D'Hers

Tool Designer

2

Janice Galvin

Tool Designer

2

Michael Sullivan

Senior Design Engineer

1

Sharon Salavaria

Design Engineer

3

David Bradley

Marketing Manager

1

Kevin Brown

Marketing Assistant

3

John Wood

Marketing Specialist

5

Mary Dempsey

Marketing Assistant

3

Wanida Benshoof

Marketing Assistant

3

The point to all this is that we were able to encapsulate the logic of the view and do a lot more. One issue with this approach, however, is that we have two similar sets of code that would both need to be updated if the schema changed. If we want to avoid this scenario (and simplify our function in the process), we can call the view from within the function definition, as shown in the following example:

USE AdventureWorks2012;

GO

IF OBJECT_ID(N'dbo.GetJobData', N'IF') IS NOT NULL

DROP FUNCTION dbo.GetJobData;

GO

CREATE FUNCTION dbo.GetJobData (@title NVARCHAR(25))

RETURNS TABLE

WITH SCHEMABINDING

AS

RETURN

(

  SELECT TotalEmps FROM dbo.JobData

  WHERE JobTitle = @title

);

GO

This function achieves the results as the preceding example, but instead calls the view. That way, if the view changes in a way that does not affect the function’s SELECT statement, the function will not need to be updated, and we’ve simplified the function’s code in the process. In that sense, we have truly parameterized the view.

“I’m creating a user-defined function and want to specify a default value for an input parameter. Is that possible?”

Yes, it is possible and fairly easy to do. When defining the parameter, include the default value, along with the equal sign, as shown the following example (highlighted):

USE AdventureWorks2012;

GO

IF OBJECT_ID(N'dbo.GetPersonTypeCount', N'IF') IS NOT NULL

DROP FUNCTION dbo.GetPersonTypeCount;

GO

CREATE FUNCTION dbo.GetPersonTypeCount

  (@type NCHAR(2) = 'IN')

RETURNS TABLE

WITH SCHEMABINDING

AS

RETURN

(

  SELECT COUNT(*) AS PersonCount

  FROM Person.Person

  WHERE PersonType = @type

  GROUP BY PersonType

);

GO

In this case, we’ve merely specified that the default value for the @type parameter is IN. However, if we want to use the default value when calling the view, we must specify the default keyword as the parameter value, as shown in the following example:

SELECT * FROM dbo.GetPersonTypeCount(default);

The function will now use the IN default value and return a value of 18484. That said, even though we’ve defined a default value, we can still specify that value when calling the function:

SELECT * FROM dbo.GetPersonTypeCount('in');

Or we can specify a different value, as we would if no default had been specified:

SELECT * FROM dbo.GetPersonTypeCount('em');

This time around, our SELECT statement returns a value of 273.

“I created a scalar function that’s used in multiple queries, some of which can return millions of rows. Ever since I incorporated the function into the queries, performance has nosedived. Are there steps we can take to fix this?”

Yes. Get rid of the function. Scalar functions are the bane of DBAs everywhere. The database engine has a habit of executing a scalar function for each row of data returned by the query. Even if a function can return only a dozen possible values, it might still run millions of times. You would think the query optimizer would be smarter than this. It is not.

If your query returns relatively few rows, a scalar function is usually no big deal and you can enjoy the encapsulation, parameterization, and ease of implementation that the function affords. But turn those few rows into millions and you’re suddenly faced with an abundance of extra executions, which translate into unnecessary disk I/O as well as hits on memory and processing resources. Plus, you can incur extra locking, which can further impact concurrency and performance.

To complicate matters, the estimated and actual execution plans that your queries generate are far from reliable. The true cost of the execution is often buried within the plan properties. You might even have to look to profile traces to get a more accurate picture of what the function is doing.

When possible, consider turning your scalar function into an inline table valued function. Such a function returns a table, rather than a single value (as is the case with a scalar function). However, you can create a table-valued function that returns only a single column with a single row, giving you results comparable to the scalar function. Best of all, the database engine normally executes the table-value function only once, regardless of the number of rows returned by the query, resulting in far better performance.

Assuming you can turn your scalar function into a table-value one, you would then incorporate the function into your query’s FROM clause when retrieving the data, rather than in the SELECT list. For example, suppose we were to start with a basic scalar function similar to the following:

USE AdventureWorks2012;

GO

IF OBJECT_ID(N'dbo.fnGetTotalItems', N'FN') IS NOT NULL

DROP FUNCTION dbo.fnGetTotalItems;

GO

CREATE FUNCTION dbo.fnGetTotalItems (@OrderID INT)

RETURNS INT

WITH SCHEMABINDING AS

BEGIN

  DECLARE @TotalItems INT

  SELECT @TotalItems = SUM(OrderQty)

  FROM Sales.SalesOrderDetail

  WHERE SalesOrderID = @OrderID

  GROUP BY SalesOrderID

  RETURN @TotalItems;

END;

GO

The function simply returns the total number of items associated with a sale in the SalesOrderDetail table, based on the inputted SalesOrderID value. You can then use a SELECT statement to run the function, in this case, entering a SalesOrderID value of 43659:

SELECT dbo.fnGetTotalItems(43659);

Based on the data in the sample database, the function returns a value of 26. If you were to use the function in a slightly more complex query, it might look something like the following:

SELECT SalesOrderID, OrderDate, SalesPersonID,

  dbo.fnGetTotalItems(SalesOrderID) AS TotalItems

FROM Sales.SalesOrderHeader

ORDER BY SalesOrderID;

This time around, we’re incorporating the function into the SELECT list of a query retrieving data from the SalesOrderHeader table. The following table shows a partial list of results returned by that query.

SalesOrderID

OrderDate

SalesPersonID

TotalItems

43659

2005-07-01 00:00:00.000

279

26

43660

2005-07-01 00:00:00.000

279

2

43661

2005-07-01 00:00:00.000

282

38

43662

2005-07-01 00:00:00.000

282

54

43663

2005-07-01 00:00:00.000

276

1

43664

2005-07-01 00:00:00.000

280

14

43665

2005-07-01 00:00:00.000

283

20

43666

2005-07-01 00:00:00.000

276

7

43667

2005-07-01 00:00:00.000

277

6

43668

2005-07-01 00:00:00.000

282

93

43669

2005-07-01 00:00:00.000

283

1

43670

2005-07-01 00:00:00.000

275

6

43671

2005-07-01 00:00:00.000

283

17

43672

2005-07-01 00:00:00.000

282

9

43673

2005-07-01 00:00:00.000

275

20

For each row returned, a column has been added that uses the fnGetTotalItems function to calculate the total number of items sold for each order. As a result, the database engine calls the function for each row in the resultset. If we were returning millions of rows—or even billions!—all our operations could be impacted. For that reason, it’s often worth rewriting the function as a table-valued function that returns only one value:

USE AdventureWorks2012;

GO

IF OBJECT_ID(N'dbo.ifGetTotalItems', N'IF') IS NOT NULL

DROP FUNCTION dbo.ifGetTotalItems;

GO

CREATE FUNCTION dbo.ifGetTotalItems (@OrderID INT)

RETURNS TABLE

WITH SCHEMABINDING

AS

RETURN

(

  SELECT SUM(OrderQty) AS TotalItems

  FROM Sales.SalesOrderDetail

  WHERE SalesOrderID = @OrderID

  GROUP BY SalesOrderID

);

GO

The function again takes a single argument, the SalesOrderID value, but this time returns the results as a table. That means, when we call the function, we must do so in a place in the query that accepts table expressions. For that, we can use the APPLY operator to join the SalesOrderHeader table to the function in the FROM clause:

SELECT s.SalesOrderID, s.OrderDate, s.SalesPersonID,

 f.TotalItems

FROM Sales.SalesOrderHeader s

  CROSS APPLY dbo.ifGetTotalItems(s.SalesOrderID) f

ORDER BY SalesOrderID;

The statement returns the same results as the previous SELECT statement; only this time, the database engine usually calls the function only once and then applies it to each row.

You might not always be able to easily turn your scalar function into a table-value function, in which case, you’ll probably want to take another approach. You might, for example, create a stored procedure or simply create the necessary T-SQL, without encapsulating any of the logic. Whatever approach you take, you should test and compare your queries under realistic workloads and then determine which approach is best suited to your circumstances.

“I want to call the GETDATE system function from within a user-defined function, but I’ve read you cannot do this. Is there a workaround that let’s me use the GETDATE function?”

First off, despite the plethora of articles and blog posts that state you cannot use a nondeterministic function such as GETDATE within a user-defined function, it is simply not true, at least not in SQL Server 2008 R2 and SQL Server 2012, and from what I can tell, this has been the case since SQL Server 2005. For proof, check out the TechNet article “User-Defined Functions,” or try it out yourself. You’ll find that some nondeterministic functions, such as GETDATE, can indeed be called from within a user-defined function.

But first a step back to explain what we mean to deterministic and nondeterministic functions. A deterministic function is one that always returns the same results when given the same specific set of input values. For example, the SQRT system function will always return the same square root value of the inputted number if that number is always the same.

However, a nondeterministic function will not necessarily return the same results each time it runs, even if input values are the same. For example, the GETDATE, HOST_ID and NEWID system functions might return different results each time they’re called; therefore, they’re considered nondeterministic functions.

For quite a few years, the conventional wisdom has been that SQL Server does not let you call a nondeterministic function from within a user-defined function. However, the following CREATE FUNCTION statement will run with no problem:

USE AdventureWorks2012;

GO

IF OBJECT_ID(N'dbo.GetJobData', N'IF') IS NOT NULL

DROP FUNCTION dbo.GetJobData;

GO

CREATE FUNCTION dbo.GetJobData

    (@title NVARCHAR(25), @HireDate DATETIME)

RETURNS TABLE

WITH SCHEMABINDING

AS

RETURN

(

  SELECT COUNT(*) AS TotalEmps

  FROM HumanResources.Employee

  WHERE JobTitle = @title

    AND HireDate BETWEEN @HireDate AND

      DATEADD(MM, -6, GETDATE())

  GROUP BY JobTitle

);

GO

The function determines the number of employees with the specified job title who have been hired between the specified date and current date, less six months. The function uses the GETDATE function to determine that six-month window.

Despite what you might have heard, SQL Server will create the function. Even when I included the HOST_ID function within the definition (just to test things out), SQL Server created the function. However, when I tried to incorporate the NEWID, RAND, TEXTPTR or NEWSEQUENTIALID nondeterministic functions into the function definition, I received an error message.

It turns out that SQL Server supports only some nondeterministic functions, such as GETDATE and HOST_ID. However, to be sure that GETDATE works correctly in our function, we can test it by calling the function and passing in the necessary parameter values:

SELECT * FROM dbo.GetJobData('Buyer', '2004-01-01');

In this case, the function returns only seven employees with the title of Buyer who have been hired within the specified date range. We can test the function further by using the APPLY operator to join the Employee and Person tables to the function, as in the following example:

SELECT

  p.FirstName + ' ' + p.LastName AS FullName,

  e.JobTitle,

  f.TotalEmps

FROM

  HumanResources.Employee e

    INNER JOIN Person.Person p

      ON e.BusinessEntityID = p.BusinessEntityID

       CROSS APPLY dbo.GetJobData(e.JobTitle, e.HireDate) f

ORDER BY JobTitle;

For each row returned from the joined tables, the function is applied, based on the JobTitle and HireDate values. The following table shows a partial list of results returned by the query.

FullName

JobTitle

TotalEmps

Barbara Moreland

Accountant

2

Mike Seamans

Accountant

1

David Liu

Accounts Manager

1

Karen Berg

Application Specialist

1

Ramesh Meyyappan

Application Specialist

2

Dan Bacon

Application Specialist

3

Janaina Bueno

Application Specialist

4

Mindy Martin

Benefits Specialist

1

Mikael Sandberg

Buyer

9

Arvind Rao

Buyer

8

Linda Meisner

Buyer

7

Fukiko Ogisu

Buyer

6

Gordon Hee

Buyer

5

Frank Pellow

Buyer

4

Eric Kurjan

Buyer

3

Erin Hagens

Buyer

2

Ben Miller

Buyer

1

Ken Sánchez

Chief Executive Officer

1

Laura Norman

Chief Financial Officer

1

Notice how, for the Buyer job title, the number of total employees decreases with each row. This is because the date range is shrinking with each row, as the hire date is compared to the GETDATE value (minus the six months). The point is, the GETDATE function, a built-in nondeterministic function, works fine in our user-defined function. So don’t believe everything you read.

In cases when you want to use a nondeterministic function that is not permitted, such as NEWID, RAND, TEXTPTR or NEWSEQUENTIALID, you can get around this limitation by calling the system function within a view and then calling the view from within your user-defined function. Or instead consider creating a stored procedure that achieves what your after or just going with straight T-SQL and not encapsulating the logic.

“I want to execute a stored procedure from within a user-defined function, but I keep running into errors in the syntax. How do I call a stored procedure from within a function?”

The reason you’re receiving errors is because, technically, you cannot call a stored procedure from within a user-defined function, unless it is an extended stored procedure. But even there, time is running out. Extended stored procedures have been deprecated and will eventually disappear from the SQL Server landscape.

That said, you’ll find plenty of online articles and blog posts suggesting methods for bypassing SQL Server’s built-in limitation against calling stored procedures within a function. One suggested workaround is to use an xp_cmdshell command to run a batch file that executes the stored procedure. Another suggestion is to use the OPENQUERY built-in function to connect via a linked server and then call the stored procedure.

In either case, you’re essentially creating a second process to trick SQL Server into running the stored procedure. The problem with such an approach is that you can end up tricking other SQL Server components as well, and end up in a deadlock that SQL Server cannot resolve.

SQL Server functions are specifically designed to prevent them from being able to change the state of the database in any way. Because stored procedures can make such changes, you cannot run them from within a function, just like you cannot modify the schema or the stored data. Functions essentially read data and then usually, in some way, manipulate the read data. But a function should never impact the underlying schema or data. That means you cannot create a function that inadvertently corrupts your entire database. If you need a routine to run a batch that includes stored procedures, or that modifies data or metadata, you need a stored procedure.

“When creating user-defined functions, I like to include the SCHEMABINDING option, but in some cases, I receive an error when creating the function, saying that I cannot schema bind the function. Any idea what’s going on?”

My guess is that you’re referencing a view or user-defined function within your function that is itself not schema bound. You cannot schema bind a function if the referenced object isn’t bound.

Schema binding ensures that the function you’re creating is bound to the database objects that it references. When you specify the SCHEMABINDING option, no one can modify the base object in a way that would affect the function definition. The function must first be modified or dropped to remove any dependencies before the underlying objects can be changed.

However, when your function definition references a view or other user-defined function, that object must also be schema bound before you can apply the SCHEMABINDING function to your new function, otherwise, you’re not really protecting the new function from underlying schema changes.

Let’s look at an example to get a sense of how this works. Suppose we create a basic table-valued function similar to the one shown in the following example:

USE AdventureWorks2012;

GO

IF OBJECT_ID(N'dbo.GetPersonTypeCount', N'IF') IS NOT NULL

DROP FUNCTION dbo.GetPersonTypeCount;

GO

CREATE FUNCTION dbo.GetPersonTypeCount (@type NCHAR(2))

RETURNS TABLE

WITH SCHEMABINDING

AS

RETURN

(

  SELECT COUNT(*) AS PersonCount

  FROM Person.Person

  WHERE PersonType = @type

  GROUP BY PersonType

);

GO

Notice that the function’s SELECT statement references only the Person table and its columns. You can include the SCHEMABINDING option in the WITH clause with no problem because you’re directly referencing the database objects (the table and columns). If we were to call the function, as in the following example, it would return the number of people associated with a specific type, in this case, in.

SELECT * FROM dbo.GetPersonTypeCount('in');

The SELECT statement returns a value of 18484. We can achieve the same results be creating the following view and then qualifying our SELECT statement when we call the view:

USE AdventureWorks2012;

GO

IF OBJECT_ID(N'dbo.PersonTypeCount', N'V') IS NOT NULL

DROP VIEW dbo.PersonTypeCount;

GO

CREATE VIEW dbo.PersonTypeCount

AS

  SELECT PersonType, COUNT(*) AS PersonCount

  FROM Person.Person

  GROUP BY PersonType;

GO

To return the same results as the preceding SELECT statement, we must include a WHERE clause to specify the person type, as shown in the following example:

SELECT PersonCount FROM dbo.PersonTypeCount

WHERE PersonType = 'in';

Suppose we now want to update our function to reference the view, rather than include its own object. This allows us to pass in a parameter without having to define a WHERE clause each time we call the view. The function does it for us, as shown in the following definition:

USE AdventureWorks2012;

GO

IF OBJECT_ID(N'dbo.GetPersonTypeCount', N'IF') IS NOT NULL

DROP FUNCTION dbo.GetPersonTypeCount;

GO

CREATE FUNCTION dbo.GetPersonTypeCount (@type NCHAR(2))

RETURNS TABLE

AS

RETURN

(

  SELECT PersonCount FROM dbo.PersonTypeCount

  WHERE PersonType = @type

);

GO

 

Notice in this function definition we do not include the WITH SCHEMABINDING clause, as we did the first time we created the function. If we include it, we’ll receive an error message saying that we cannot schema bind the function. We would either have to remove the clause or re-create the view to include the WITH SCHEMABINDING clause, as is the following example:

USE AdventureWorks2012;

GO

IF OBJECT_ID(N'dbo.PersonTypeCount', N'V') IS NOT NULL

DROP VIEW dbo.PersonTypeCount;

GO

CREATE VIEW dbo.PersonTypeCount

WITH SCHEMABINDING

AS

  SELECT PersonType, COUNT(*) AS PersonCount

  FROM Person.Person

  GROUP BY PersonType;

GO

After we re-create the view, we can then run our function definition again, but this time with the WITH SCHEMABINDING clause:

USE AdventureWorks2012;

GO

IF OBJECT_ID(N'dbo.GetPersonTypeCount', N'IF') IS NOT NULL

DROP FUNCTION dbo.GetPersonTypeCount;

GO

CREATE FUNCTION dbo.GetPersonTypeCount (@type NCHAR(2))

RETURNS TABLE

WITH SCHEMABINDING

AS

RETURN

(

  SELECT PersonCount FROM dbo.PersonTypeCount

  WHERE PersonType = @type

);

GO

Because we made the view schema bound, we can now do the same with the function. As a result, no schema changes can be made that will affect either the view or the function.

“I’ve seen user-defined functions called in different ways in code. Is there a proper way to call a function?”

The ways in which you can call a user-defined function depends on whether it is a scalar function or a table-valued function. For example, suppose we have the following scalar function:

USE AdventureWorks2012;

GO

IF OBJECT_ID(N'dbo.fnGetTotalItems', N'FN') IS NOT NULL

DROP FUNCTION dbo.fnGetTotalItems;

GO

CREATE FUNCTION dbo.fnGetTotalItems (@OrderID INT)

RETURNS INT

WITH SCHEMABINDING AS

BEGIN

  DECLARE @TotalItems INT

  SELECT @TotalItems = SUM(OrderQty)

  FROM Sales.SalesOrderDetail

  WHERE SalesOrderID = @OrderID

  GROUP BY SalesOrderID

  RETURN @TotalItems;

END;

GO

The function takes an input parameter based on the SalesOrderID column in the SalesOrderDetail table and returns a total number of items associated with that sale. You can call the function within a column expression in your query’s SELECT, WHERE, or HAVING clause. For example, that query can be something as simple as the following:

SELECT dbo.fnGetTotalItems(43659);

In this case, the query returns a value of 26, which means order 43659 has 26 items associated with it. The column expression in this case is simply the function and its parameter value. However, we can create a query slightly more complex:

SELECT SalesOrderID, OrderDate, SalesPersonID,

  dbo.fnGetTotalItems(SalesOrderID) AS TotalItems

FROM Sales.SalesOrderHeader

ORDER BY SalesOrderID;

Once again, the function is included as part of a column expression in the SELECT list. Only this time, we’ve assigned an alias to the column and specified the SalesOrderID column as the parameter’s value. The following table provides a partial list of the values returned by this statement.

SalesOrderID

OrderDate

SalesPersonID

TotalItems

43659

2005-07-01 00:00:00.000

279

26

43660

2005-07-01 00:00:00.000

279

2

43661

2005-07-01 00:00:00.000

282

38

43662

2005-07-01 00:00:00.000

282

54

43663

2005-07-01 00:00:00.000

276

1

43664

2005-07-01 00:00:00.000

280

14

43665

2005-07-01 00:00:00.000

283

20

43666

2005-07-01 00:00:00.000

276

7

43667

2005-07-01 00:00:00.000

277

6

43668

2005-07-01 00:00:00.000

282

93

43669

2005-07-01 00:00:00.000

283

1

43670

2005-07-01 00:00:00.000

275

6

43671

2005-07-01 00:00:00.000

283

17

43672

2005-07-01 00:00:00.000

282

9

43673

2005-07-01 00:00:00.000

275

20

Despite the differences between the preceding two SELECT statements, they’re essentially doing the same thing, in terms of calling the function as part of a column expression. However, SQL Server also lets you call a scalar function within an EXECUTE statement, as shown in the following example:

DECLARE @OrderQty INT;

SET @OrderQty = NULL;

EXEC @OrderQty = dbo.fnGetTotalItems @OrderID = 43659;

SELECT @OrderQty;

Like our first SELECT statement, the T-SQL here returns a value of 26; however, the way we get there is a bit more complicated in this example. First we declare the @OrderQty variable, set its value to NULL, and then use the EXECUTE statement to call the function and assign the returned value to the variable. Finally, we use a SELECT statement to retrieve the variable value.

Not surprisingly, the ways in which we can call a table-valued function are different because the function returns a full resultset (table). That means we must treat the function similar to how we would treat a table or view in our queries. Let’s look at an example to see how this works. The following T-SQL creates a table-valued function that contains logic similar to our previous example:

USE AdventureWorks2012;

GO

IF OBJECT_ID(N'dbo.ifGetTotalItems', N'IF') IS NOT NULL

DROP FUNCTION dbo.ifGetTotalItems;

GO

CREATE FUNCTION dbo.ifGetTotalItems (@OrderID INT)

RETURNS TABLE

WITH SCHEMABINDING

AS

RETURN

(

  SELECT SUM(OrderQty) AS TotalItems

  FROM Sales.SalesOrderDetail

  WHERE SalesOrderID = @OrderID

  GROUP BY SalesOrderID

);

GO

When calling the function, we must remember that we’re dealing with a table-like resultset. The simplest way to do this is to include the function in our query’s FROM clause, where a table expression is expected:

SELECT TotalItems FROM dbo.ifGetTotalItems(43659);

Once again, our SELECT statement returns a value of 26. However, if the function had returned more rows and more columns, those would be included as well. In this case, the SELECT statement returns whatever results the function generates.

When working with table-valued functions, you can also use the APPLY operator to join a table to the function, as shown in the following example:

SELECT s.SalesOrderID, s.OrderDate, s.SalesPersonID,

 f.TotalItems

FROM Sales.SalesOrderHeader s

  CROSS APPLY dbo.ifGetTotalItems(s.SalesOrderID) f

ORDER BY SalesOrderID;

This time around, we’re calling the function in the FROM clause (by joining it to the table) and including the function’s returned value (TotalItems) in the SELECT list. The SELECT statement will then return the same results shown with the preceding scalar function example. Be aware, however; you cannot use an EXECUTE statement to call a table-valued function. That is reserved for scalar functions only.

“I’m working on a query whose T-SQL code I want to encapsulate and parameterize. The query will aggregate a subset of values based on an input parameter and return the aggregated value as a column in the query results. Should I create a stored procedure or a user-defined function?”

Based on what you’re describing, a user-defined function is likely the way to go. SQL Server lets you call a function from within a SELECT statement, which you cannot do with a stored procedure. However, both provide a mechanism for encapsulating your T-SQL and passing in parameters. With stored procedures, though, you can define both input and output parameters. Functions are limited to input parameters. In addition, a function must return a single result, either a scalar value or a table. Stored procedures can return a single result, multiple results, or no results.

Another difference is that you can include all sorts of T-SQL statements within a stored procedure, which means you can retrieve data, modify data, create tables, delete tables, or take a variety of other actions. In a user-defined function, you’re basically limited to retrieving data, along with creating and managing local variables and calling extended stored procedures (which have been deprecated and will one day disappear altogether). Basically, you cannot take any action in a function that would modify the database state.

Also worth nothing is that you can call a function from within a stored procedure, but not the other way around (at least not without some clunky and often risky workarounds). Stored Procedures also let you incorporate TRY…CATCH error handling into your code. Functions do not.

It’s not all bad news for functions, however. One of their biggest advantages, as noted above, is the ability to call them from within your SELECT statements. Not only can you use functions in the SELECT list, but also within your WHERE and HAVING clauses. In addition, you can join a table-valued function in your FROM clause to one or more tables, usually with the help of the APPLY operator. Plus, you can use functions in your computed column and CHECK constraint definitions. Try doing that with a stored procedure.

Further Reading

  1. SQL Server Functions: The Basics
  2. SQL String User Function Workbench: part 1
  3. SQL String User Function Workbench: part 2
Robert Sheldon

Author profile:

After being dropped 35 feet from a helicopter and spending the next year recovering, Robert Sheldon left the Colorado Rockies and emergency rescue work to pursue safer and less painful interests—thus his entry into the world of technology. He is now a technical consultant and the author of numerous books, articles, and training material related to Microsoft Windows, various relational database management systems, and business intelligence design and implementation. He has also written news stories, feature articles, restaurant reviews, legal summaries, and the novel 'Dancing the River Lightly'. You can find more information at http://www.rhsheldon.com.

Search for other articles by Robert Sheldon

Rate this article:   Avg rating: from a total of 60 votes.


Poor

OK

Good

Great

Must read
Have Your Say
Do you have an opinion on this article? Then add your comment below:
You must be logged in to post to this forum

Click here to log in.


Subject: bit of history
Posted by: Joe Celko (not signed in)
Posted on: Saturday, May 24, 2014 at 9:42 AM
Message: Violating ISO-11179 rules with the "fn-" prefix comes from FORTRAN I and II. The parsers had to look at the prefix of variables to determine the data type (I to N was an integer; all other letters were floats). This simple grammar was also used to spot in-line functions. Later, versions of BASIC used the same prefix! This tradition stuck with procedural programmers even after they moved to declarative languages like SQL.

Subject: Clarification for scalar vs. TVF?
Posted by: Todd (not signed in)
Posted on: Sunday, May 25, 2014 at 5:34 PM
Message: In question 5, scalar vs TVF, you give the example of doing a CROSS APPLY against the TVF "dbo.ifGetTotalItems(s.SalesOrderID)", and state that the TVF would get called only once.

However, if there are millions of rows in the "S" table, why wouldn't the TVF get called millions of times, too (same as the scalar function)?

In cases like this, I've first created a temp table (either in the TempDB or in a table variable) of the complete set of results that the function may return, and include the join column (SalesOrderID) in it, and then just do a standard JOIN between the main table and this temp table to get the desired combined resultset.


Subject: Minor correction to "execute a stored procedure from within a UDF"
Posted by: srutzky (view profile)
Posted on: Friday, May 30, 2014 at 4:13 PM
Message: Hi Robert. Nice write up. I just wanted to point out a lesser-known caveat to not being able to EXEC Stored Procedures within T-SQL functions:

You can EXEC Stored Procedures from within SQLCLR functions using the internal / in-process connection (i.e. Context Connection = true), provided that the Stored Proc being called does not itself do any side-effecting operations.

While it is true that the optimizer cannot see everything that is happening in the SQLCLR function, it can see the SQL being submitted (i.e. the EXEC Stored Proc) AND by using the Context Connection it is part of the same process / SPID as opposed to a separate connection (e.g. xp_cmdshell and OPENQUERY / OPENROWSET, as you pointed out). So, if needing this functionality, a SQLCLR function can be used as a wrapper to a read-only Stored Procedure, and as nothing more than a wrapper, the optimizer doesn't need to look inside of the function.

 
Simple-Talk Database Delivery

DLM
Patterns & Practices Library

Visit our patterns and practices library to learn more about database lifecycle management.

Find out how to automate the process of building, testing and deploying your database changes to reduce risk and make rapid releases possible.

Get started

Phil Factor
Documenting your SQL Server Database

One of the shocks that a developer can get when starting to program in T-SQL is that there is no simple way of... Read more...

 View the blog

Top Rated

A Start with Automating Database Configuration Management
 For a number of reasons, it pays to have the up-to-date source of all the databases and servers that... Read more...

Archiving Hierarchical, Deleted Transactions Using XML
 When you delete a business transaction from the database, there are times when you might want to keep a... Read more...

Rollback and Recovery Troubleshooting; Challenges and Strategies
 What happens if your database deployment goes awry? Do you restore from a backup or snapshot and lose... Read more...

MySQL Compare: The Manual That Time Forgot, Part 1
 Although SQL Compare, for SQL Server, is one of Red Gate's best-known products, there are also 'sister'... Read more...

Highway to Database Recovery
 Discover the best backup and recovery articles on Simple-Talk, all in one place. Read more...

Most Viewed

Beginning SQL Server 2005 Reporting Services Part 1
 Steve Joubert begins an in-depth tour of SQL Server 2005 Reporting Services with a step-by-step guide... Read more...

Ten Common Database Design Mistakes
 If database design is done right, then the development, deployment and subsequent performance in... Read more...

SQL Server Index Basics
 Given the fundamental importance of indexes in databases, it always comes as a surprise how often the... Read more...

Concatenating Row Values in Transact-SQL
 It is an interesting problem in Transact SQL, for which there are a number of solutions and... Read more...

Temporary Tables in SQL Server
 Temporary tables are used by every DB developer, but they're not likely to be too adventurous with... Read more...

Why Join

Over 400,000 Microsoft professionals subscribe to the Simple-Talk technical journal. Join today, it's fast, simple, free and secure.