Sometimes, when writing TSQL code in functions or procedures, it
is tempting to do iterations, or even worse, a cursor, when it isn't
really necessary. Cursors and iterations are both renowned for slowing
down Transact SQL Code SQL Server just isn't designed for it.
However, there is usually a way to do such operations in a set-based
way. If you do so, then your routines will run a lot faster, with speed
at least doubling. There are a lot of tricks to turning a problem that
seems to require an iterative approach into a set-based operation, and
we wish we could claim we'd invented one of them. Probably the most
useful technique involves that apparently useless entity, the 'helper'
table. This workshop will concentrate on this, because it is probably
the most widely used.
The most common Helper table you'll see is a table with nothing but the
numbers in a sequence from 1 upwards. These tables have a surprising
number of uses. Once you've understood the principles behind helper
tables, then you'll think of many more. We'll be providing several
examples where a helper table suddenly makes life easier. The objective
is to show the principles so that you'll try out something similar the
next time you have to tackle a tricky operation in TSQL.
As always, you're encouraged to load the example script into Query
Analyser or Management Studio, and experiment!
Our examples include:
Splitting Strings into table-rows, based on a specified delimiter
Encoding and decoding a string
Substituting values into a string
Extracting individual words from a string into a table
Extracting all the numbers in a string into a table
Removing all text between delimiters
Scrabble score
Moving averages
Getting the 'Week beginning' date in a table
Calculating the number of working days between dates.
Note. These examples use VARCHAR(8000) just so they compile on both
SQL Server 2000 and 2005. If you are using SQL Server 2005, you'll
probably want to change them to VARCHAR(MAX)
Before we start, we'll need a helper table of numbers. Our examples
aren't going to require high numbers, but we've parameterised the size
of the table that the routine creates
Creating the helper table
-------------------------
Here is a routine that checks to see if such a 'helper' table called
'numbers' exists, and, if not, creates it*/
'
'
)
SELECT * FROM dbo.uftNumbers('there are numbers like 34.56,
-56, 67.878, maybe34; possibly56, and a few others like <789023>')
/*
SeqNo number
----------- ------------
1 34.56
2 -56
3 67.878
4 34
5 56
6 789023
(6 row(s) affected)
*/
/*
Removing all text between delimiters
------------------------------------
This is a handy little routine for looking at the strings in HTML code,
but seems to earn its keep in a lot of other ways. You specify the
opening and closing delimiter. At the moment, only single-character
delimiters are allowed. Can anyone re-write it to allow multi-character
delimiters?*/
CREATE FUNCTION [dbo].[ufsRemoveDelimited]
(
@String VARCHAR(8000),
@OpeningDelimiter CHAR(1),
@ClosingDelimiter CHAR(1)
)
RETURNS
VARCHAR(8000)
AS
BEGIN
DECLARE @newString VARCHAR(8000)
IF @OpeningDelimiter = @ClosingDelimiter
BEGIN
RETURN NULL
END
IF @OpeningDelimiter+@ClosingDelimiter+@String IS NULL
BEGIN
RETURN @String
END
SELECT @NewString=''
SELECT @newString =@newString +SUBSTRING(@String,number,1)
FROM numbers
WHERE number<=LEN (REPLACE(@string,' ','|'))
AND
CHARINDEX (@OpeningDelimiter,@string+@OpeningDelimiter,number)
<
CHARINDEX (@ClosingDelimiter,@string+' '+@closingDelimiter,number)
AND number <> CHARINDEX (@OpeningDelimiter,@string,number)
RETURN @NewString
END
--so we can try it out with brackets
SELECT dbo.ufsRemoveDelimited(
'this will appear(but not this),)will this?(','(',')')
--or if you want to take out tags
SELECT dbo.ufsRemoveDelimited(
'this will appear
and this
and this','<','>')
--or this
SELECT dbo.ufsRemoveDelimited(
'<?xml version="1.0" encoding="us-ascii"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>A poem</title>
</head>
<body>
<div style="float left: width:300px;">
<h2>
Weather forecast
</h2>
<p>
The rain it raineth every day<br />
upon the just and unjust fellah<br />
but mainly on the just, because,<br />
the unjust pinched the just''s umbrella<br />
</p>
</div>
</body>
</html>
','<','>')
/*
Scrabble Score
--------------
And as a slightly silly example of the sort of chore that crops up
occasionally when analysing strings, character by character,
here is a way of scoring a string for Scrabble, assuming it is all
on ordinary squares!*/
CREATE FUNCTION ufiScrabbleScore
(
@String VARCHAR(100)
)
RETURNS INT
AS
BEGIN
DECLARE @Total INT
SELECT @total=0
SELECT @Total=@total+
CASE WHEN SUBSTRING(@String,number,1) NOT LIKE '[a-z]' THEN NULL
ELSE
CAST(SUBSTRING('00000000001122223333347799',
CHARINDEX(SUBSTRING(@String,number,1),
'EAIONRTLSUDGBCMPFHVWYKJXQZ')
,1) AS INT)+1 END
FROM numbers WHERE number <=LEN(@String)
RETURN @Total
END
-- and now we try it out!
SELECT dbo.ufiScrabbleScore('Quiz') --22
SELECT dbo.ufiScrabbleScore('Robyn') --10
/* Now we find out which are the highest scorers, assuming an
illegal quantity of tiles. We use the WordList from Phil's Blog
on the Fireside Fun of Decapitations.
Don't try running this without the WordList!*/
SELECT dbo.ufiScrabbleScore(word),word
FROM wordlist ORDER BY dbo.ufiScrabbleScore(word) DESC /*
49 razzamatazz
48 razzmatazz
45 pizzazz
43 quizzically
39 squeezeboxes
38 quizzical
38 psychoanalyzing
37 psychoanalyzed
37 squeezebox
..etc....
As well as slicing and dicing strings, once one has one's helper table,
suddenly time-interval based reporting becomes much easier.
Moving averages
---------------
How about moving averages? Here is a simple Select statement that gives
you the moving average (over a week) of the number of log entries for
every day for a year. This can be adapted to give weighted and
exponential moving averages over arbitrary time periods. You use this
technique for ironing out 'noise' from a graph in order to accentuate
the inderlying trend
To execute this, you will need a table to try it on
*/
DROP TABLE #cb
CREATE TABLE #cb (insertionDate datetime)--quantity
--for once, we need to iterate to shuffle the pack
DECLARE @ii INT
SELECT @ii=0
WHILE @ii<20000
BEGIN
INSERT INTO #cb(insertionDate)
SELECT DATEADD(Hour,RAND()*8760,'1 jan 2006')
SELECT @ii=@ii+1
END
--and put an index on it
CREATE CLUSTERED INDEX idxInsertionDate ON #cb(insertionDate)
/* now we can try out a moving average!
*/
SELECT start,[running average]=COUNT(*)/7
FROM
(
SELECT [order]=number,
[start]=DATEADD(DAY,number,'1 Jan 2006'),
[end]=DATEADD(DAY,number+7,'1 Jan 2006')
FROM numbers
WHERE DATEADD(DAY,number,'1 Jan 2006')
BETWEEN '1 Jan 2006' AND '1 Jan 2007')f
LEFT OUTER JOIN [#cb]
ON [#cb].insertionDate BETWEEN f.start AND f.[end]
GROUP BY start
ORDER BY start
/*
Getting the 'Week beginning' date in a table
---------------------------------------------
Here is a UDF that lists all the Mondays (or whatever you want)
between two dates*/
CREATE FUNCTION uftDatesOfWeekday
(
@Weekday VARCHAR(10),
@StartDate datetime,
@EndDate DateTime
)
RETURNS TABLE
AS
RETURN
(
SELECT
[start]=DATEADD(DAY,number-1,@StartDate)
FROM numbers
WHERE DATEADD(DAY,number-1,@StartDate)< @EndDate
AND DATENAME(dw,DATEADD(DAY,number-1,@StartDate))=@Weekday
)
/*
And you can try it out by finding how many mondays there are between
the first of January and 1st june this year. */
SELECT * FROM dbo.uftDatesOfWeekday('monday','1 Jan 2007','1 Jun 2007')
/*
Number of Working Days between two dates
----------------------------------------
Or, how about a UDF that tells you the number of working days between
two dates? (you can alter it if Saturday and Sunday are not your days
off!)
*/
CREATE FUNCTION ufiWorkingDays
(
@StartDate datetime,
@EndDate DateTime
)
RETURNS INT
AS
BEGIN
RETURN
(SELECT COUNT(*)
FROM numbers
WHERE DATEADD(DAY,number-1,@StartDate)< @EndDate
AND DATENAME(dw,DATEADD(DAY,number-1,@StartDate))
NOT IN ('saturday','sunday'))
END
---So how many working days until christmas?
SELECT dbo.ufiWorkingDays(
GETDATE(),'25 Dec '+DATENAME(YEAR,GETDATE()))
/* We can go on for ever with example of using a numeric Helper table
but we won't because most of you will have wandered off even before you
reach this point. We hope that you'll now take over and create some
more examples, try them out against iterative solutions that do the same
thing. We guarantee you'll be pleased with the result!*/