Click here to monitor SSC
  • Av rating:
  • Total votes: 326
  • Total comments: 49
Nigel Rivett

SQL Server 2005 Common Table Expressions

02 August 2006

Common Table Expressions (CTEs) were one of the most exciting features to be introduced with SQL Server 2005. Nigel Rivett explains what they are and how they can be used.

In my opinion, Common Table Expressions (CTEs) are one of the most exciting features to be introduced with SQL Server 2005. A CTE is a "temporary result set" that exists only within the scope of a single SQL statement. It allows access to functionality within that single SQL statement that was previously only available through use of functions, temp tables, cursors, and so on.

CTE basics

The concept behind a CTE is simplicity itself. Consider the following statement:

with MyCTE(x)
as
(select x='hello')
select x from MyCTE

This defines a CTE called MyCTE. In brackets after the as keyword is the query that defines the CTE. The subsequent query references our CTE, in this case simply returning the string "hello".

Like a derived table, a CTE lasts only for the duration of a query but, in contrast to a derived table, a CTE can be referenced multiple times in the same query. So, we now we have a way of calculating percentages and performing arithmetic using aggregates without repeating queries or using a temp table:

with MyCTE(x)
as
(
select top 10 x = id from sysobjects
)
select x, maxx = (select max(x) from MyCTE), pct =
100.0 * x / (select sum(x) from MyCTE)
from MyCTE

This returns (on my system):

x           maxx         pct
4           2137058649   2.515723270440
5           2137058649   3.144654088050
7           2137058649   4.402515723270
8           2137058649   5.031446540880
13          2137058649   8.176100628930
15          2137058649   9.433962264150
25          2137058649   15.723270440251
26          2137058649   16.352201257861
27          2137058649   16.981132075471
29          2137058649   18.238993710691

Note that although this has only referenced sysobjects once in the CTE, the query plan will confirm that sysobjects is actually scanned 3 times – each aggregate in the result set causes an additional scan. As a result, it would still be more efficient to accumulate the values in a temporary table or table variable if you are accessing large tables.

CTE and recursion

More interesting, in my opinion, is the use of recursion with CTEs. The table defined in the CTE can be referenced in the CTE itself to give a recursive expression, using union all:

with MyCTE(x)
as
(
select x = convert(varchar(8000),'hello')
union all
select x + 'a' from MyCTE where len(x) < 100
)
select x from MyCTE
order by x

The query:

select x = convert(varchar(8000),'hello')

is called the anchor member. This is executed in the first pass and will populate the CTE with the result, in this case hello. This initial CTE is repeatedly executed until the complete result set is returned. The next entry:

select x + 'a' from MyCTE where len(x) < 100

is a recursive member as it references the CTE, MyCTE. The recursive member is executed with the anchor member output to give helloa. The next pass takes helloa as input and returns helloaa, and so on so that we arrive at a CTE populated with rows as follows:

hello
helloa
helloaa
helloaaa
helloaaaa

The recursion will terminate when the recursive member produces no rows – in this case recursion stops when the length of x equals 99 – or when the recursion limit is reached (more about that later). The CTE is then output by the following statement:

select x from MyCTE
order by
x

There are a few interesting issues associated with even this simple CTE usage. Note that the anchor member populates the CTE with a varchar(8000). Had the convert not been included then you would expect the datatype of x to be defined by that of the anchor member (varchar(5)) and to give an error when trying to insert the longer recursive entries. This does indeed happen – but you would expect varchar(1000) to be fine. Not so. This still produces the error that the datatypes don't match. In fact, you should always cast the recursive member output to be the same as the anchor member:

with MyCTE(x)
as
(
select x = convert(varchar(1000),'hello')
union all
select convert(varchar(1000),x + 'a') from MyCTE
where len(x) < 100
)
select x from MyCTE
order by x

But why, then, did the code work with varchar(8000)? This looks like a bug. As far as I can tell varchar(8000) and nvarchar(4000) seem to be the only values that work. Varchar(max) and varchar(7999) give an error as do all the other values that I have tried.

Multiple anchor members

Any entry that does not reference the CTE will be considered an anchor member so we can also include multiple anchor members using a union all:

with MyCTE(x)
as
(
select x = convert(varchar(1000),'hello')
union all
select x = convert(varchar(1000),'goodbye')
union all
select convert(varchar(1000),x + 'a') from MyCTE
where len(x) < 10
)
select x from MyCTE
order by len(x), x

This adds two rows from the anchor members and hence the recursive member acts on two rows for each pass to produce the output:

hello
helloa
goodbye
helloaa
goodbyea
helloaaa
goodbyeaa
helloaaaa
goodbyeaaa
helloaaaaa

The first recursive pass produces the output "helloa" and "goodbyea". The second produces "helloaa" and "goodbyeaa". The third produces "helloaaa" and "goodbyeaaa". For subsequent passes the string containing "goodbye" is too long for the check len(x)<10 but recursion continues as long as some output is produced – so we get more entries from hello than goodbye.

Multiple recursive members

We can also include multiple recursive members to produce extra rows at each pass:

with MyCTE(x)
as
(
select x = convert(varchar(1000),'hello')
union all
select convert(varchar(1000),x + 'a') from MyCTE where len(x) < 10
union all
select convert(varchar(1000),x + 'b') from MyCTE where len(x) < 10
)
select x from MyCTE
order by len(x), x

This produces the result:

hello
helloa
hellob
helloaa
helloab
helloba
hellobb
helloaaa
helloaab
helloaba
helloabb
hellobaa
hellobab
hellobba
hellobbb
helloaaaa
….
63 rows of output.

In order to understand this output, you need to remember that the input to each pass is the output from the previous pass. On the first recursive pass the input is the row from the anchor member. This produces two rows – one from each recursive member. On the next pass the input is the two rows output from the previous pass. Each recursive member outputs two rows producing four in total. In other words, the number of rows output doubles with each pass.

We can modify the output by limiting the rows on which the recursive members act. For example:

with MyCTE(x)
as
(
select x = convert(varchar(1000),'hello')
union all
select convert(varchar(1000),x + 'a') from MyCTE
where len(x) < 10 and (len(x) = 5 or x like '%a')
union all
select convert(varchar(1000),x + 'b') from MyCTE
where len(x) < 10 and (len(x) = 5 or x like '%b')
)
select x from MyCTE
order by len(x), x

giving:

hello
helloa
hellob
helloaa
hellobb
helloaaa
hellobbb
helloaaaa
hellobbbb
helloaaaaa
hellobbbbb

Another method is to flag the recursive members:

with MyCTE(i, x)
as
(
select i = 0, x = convert(varchar(1000),'hello')
union all
select i = 1, convert(varchar(1000),x + 'a') from MyCTE
where len(x) < 10 and i in (0,1)
union all
select i = 2, convert(varchar(1000),x + 'b') from MyCTE
where len(x) < 10 and i in (0,2)
)
select x from MyCTE
order by len(x), x

This gives the same result as above. It forces each recursive member to act only on the anchor member output and on any output that it itself has produced. This is at the expense of including the redundant data column "i" in the CTE but it does make the code a lot more readable. Using a similar technique we can output a value to indicate which pass produces each row:

with MyCTE(r1, r2, i, x)
as
(
select r1 = 1, r2 = 1, i = 0, x = convert(varchar(1000),'hello')
union all
select r1 = r1 + 1, r2 = r2, i = 1, convert(varchar(1000),x + 'a') from MyCTE
where len(x) < 10 and i in (0,1)
union all
select r1 = r1, r2 = r2 + 1, i = 2, convert(varchar(1000),x + 'b') from MyCTE
where len(x) < 10 and i in (0,2)
)
select r1, r2, x from MyCTE
order by len(x), x

This returns the following, r1 giving the pass number for the first recursive member and r2 for the second:

r1          r2          x
1           1           hello
2           1           helloa
1           2           hellob
3           1           helloaa
1           3           hellobb
4           1           helloaaa
1           4           hellobbb
5           1           helloaaaa
1           5           hellobbbb
6           1           helloaaaaa
1           6           hellobbbbb

This is very useful for debugging and also for detecting how a CTE is processed and for controlling the number of passes for each recursive member. For instance:

with MyCTE(r1, r2, i, x)
as
(
select r1 = 1, r2 = 1, i = 0, x = convert(varchar(1000),'hello')
union all
select r1 = r1 + 1, r2 = r2, i = 1, convert(varchar(1000),x + 'a') from MyCTE
where i in (0,1) and R1 < 3
union all
select r1 = r1, r2 = r2 + 1, i = 2, convert(varchar(1000),x + 'b') from MyCTE
where i in (0,2) and R2 < 6
)
select r1, r2, x from MyCTE
order by len(x), x

Here I have terminated the first recursive member after two iterations and the second after five, giving:

1           1           hello
2           1           helloa
1           2           hellob
3           1           helloaa
1           3           hellobb
1           4           hellobbb
1           5           hellobbbb
1           6           hellobbbbb

Note that recursion continues until a pass produces no output – although the first recursive member terminates early recursion continues until both recursive members produce no output.

Recursion limit

In order to demonstrate the recursion limit, we start by producing a series of numbers in a CTE:

with MyCTE(i)
as
(
select i = 1
union all
select i = i + 1 from MyCTE where i < 100
)
select i
from MyCTE
order by i

This happily produces the numbers 1 to 100 (a tally table). However, try increasing the recursion limit as follows:

with MyCTE(i)
as
(
select i = 1
union all
select i = i + 1 from MyCTE where i < 1000
)
select i
from MyCTE
order by i

You will receive the following error:

"The statement terminated. The maximum recursion 100 has been exhausted before statement completion"

It sounds like there is a limit of 100 for recursion but luckily this is just the default limit. For development this is very useful; to save time and space consider setting it lower for first attempts.

The maximum number of recursions can be set via the maxrecursion option, up to a maximum of 32767 (wouldn't it be nice, though, if the error message suggested that the recursion limit should be increased?):

with MyCTE(i)
as
(
select i = 1
union all
select i = i + 1 from MyCTE where i < 1000
)
select i
from MyCTE
order by i
option (maxrecursion 1000)

Uses for Common Table Expressions

Finally, let's review some of the more interesting applications of CTEs.

Traversing a hierarchy

This is covered well in BOL and now gives SQL Server something to compete with the Oracle connect by operator.

Date Ranges

A common requirement is to aggregate entries per day (or month or year). This is easy using a group by, if there are entries for every day:

declare @Sales table (TrDate datetime, Amount money)
insert @Sales select '20060501', 200
insert @Sales select '20060501', 400
insert @Sales select '20060502', 1200

select [day] = TrDate, sum(amount) total_sales
from @sales
group by TrDate
order by TrDate

giving:

day                       total_sales
2006-05-01 00:00:00.000   600.00
2006-05-02 00:00:00.000   1200.00

However, if there are some days with no transactions the result set, rather than report zero, does not give an entry for that day. To get round this we need to left join to a tally table which includes the days on which we wish to report. With a CTE this becomes simple. For example, for a year:

declare @Sales table (TrDate datetime, Amount money)
insert @Sales select '20060501', 200
insert @Sales select '20060501', 400
insert @Sales select '20060502', 1200

;with MyCTE(d)
as
(
select d = convert(datetime,'20060101')
union all
select d = d + 1 from MyCTE where d < '20061231'
)
select [day] = d.d, sum(coalesce(s.amount,0))
from MyCTE d
left join @sales s
on s.TrDate = d.d
group by d.d
order by d.d
option (maxrecursion 1000)

Note the ";" before the CTE definition – that's just a syntax requirement if the CTE declaration is not the first statement in a batch.

Parsing CSV values

It is common to pass a CSV string in to a stored procedure as a means of passing in an array of values. Often this is turned into a table via a function. Using a CTE we can now contain this code within the stored procedure:

declare @s varchar(1000)
select @s = 'a,b,cd,ef,zzz,hello'

;with csvtbl(i,j)
as
(
select i=1, j=charindex(',',@s+',')
union all
select i=j+1, j=charindex(',',@s+',',j+1) from csvtbl
where charindex(',',@s+',',j+1) <> 0
)
select substring(@s,i,j-i)
from csvtbl

The output is as follows:

a
b
cd
ef
zzz
hello

How does this work? The anchor member, select i=1, j=charindex(',',@s+','), returns 1 and the location of the first comma. The recursive member gives the location of the first character after the comma and the location of the next comma (we append a comma to the string to get the last entry). The result set is then obtained by using these values in a substring.

In the previous example the CTE output was the start and end locations of each of the strings. We can instead produce the strings themselves; the CTE code becomes a little more complicated but the following query is simplified:

declare @s varchar(1000)
select @s = 'a,b,cd,ef,zzz,hello'

;with csvtbl(i,j, s)
as
(
select i=1, s=charindex(',',@s+','),
substring(@s, 1, charindex(',',@s+',')-1)
union all
select i=j+1, j=charindex(',',@s+',',j+1),
substring(@s, j+1, charindex(',',@s+',',j+1)-(j+1))
from csvtbl where charindex(',',@s+',',j+1) <> 0
)
select s from csvtbl

Beyond 32767

What happens if you want a list of numbers that extends beyond 32767? Although the recursion limit is 32767 it is possible to create extra entries. For example, the following CTE returns all the numbers from 0 to 64000. The result set produced is used to check the results. The maximum and count verify that all of the numbers are present, assuming that the result is a sequence:

with n(rc,i)
as
(
select rc = 1, i = 0
union all
select rc = 1, i = i + 1 from n where rc = 1 and i < 32000
union all
select rc = 2, i = i + 32001 from n where rc = 1 and i < 32000
)
select count_i = count(*), max_i = max(i)
from n
option (maxrecursion 32000)

To take this a step further we can accumulate values by manipulating the CTE:

with n (j)
as
(
select j = 0
union all
select j = j + 1 from n where j < 32000
)
select max_i = max (na.i), count_i = count(*)
from ( select i = j + k
   
from n
   
cross join
   
( select k = j * 32001
      
from n
      
where j < 32
   
) n2
) na
option (maxrecursion 32000)

giving:

max_i       count_i
----------- -------
1024031     1024032

In other words, this returns all the numbers from 0 to 1024031. How does this work? The derived table, n2, consists of all the numbers from 0 to 32 multiplied by 32001, i.e.

0
32001
64002
96003
….

This is cross-joined with the result of the CTE, n, which consists of all the numbers from 0 to 32000, and the result is the sum of the values from n and n2. Hence we get:

from n 0- 32000 with 0 from n2 to give 0 – 32000
from n 0- 32000 with 32001 from n2 to give 32001 – 64001
from n 0- 32000 with 64002 from n2 to give 64002 – 96002
from n 0- 32000 with 96003 from n2 to give 64003 – 96003
…..

to give the values 0 – 1,024,031. If more values are required then just increase the size of n2 by increasing the maximum value from 32.

Write SQL faster. As Nigel demonstrates, Common Table Expressions are one of the most powerful features to be introduced with SQL Server 2005. SQL Prompt, Red Gate's code completion tool, offers full support for CTEs, including recursive CTEs and multiple CTEs within a single WITH statement, and will increase the speed and accuracy with which you create this and other SQL code.

Nigel Rivett

Author profile:

Nigel spent his formative years working in assembler on IBM Series/1 but retrained in VB when that went out of fashion. He soon realised how little relational database expertise existed in most companies and so started to spend most of his time working on that. He now sticks to architecture, release control / IT processes, SQL Server, DTS, SSIS, and access methods in VB/ASP/.NET/Crystal Reports/reporting services. He has been involved with SQL Server for about 10 years from v4.2 to v2005, and was awarded Microsoft MVP status in 2003. He tries to stay away from anything presentation oriented (see www.mindsdoor.net). Theoretically he is semi-retired but seems to keep being offered potentially interesting work.

Search for other articles by Nigel Rivett

Rate this article:   Avg rating: from a total of 326 votes.


Poor

OK

Good

Great

Must read
Have Your Say
Do you have an opinion on this article? Then add your comment below:
You must be logged in to post to this forum

Click here to log in.


Subject: Excellent Stuff!
Posted by: Andrew Clarke (view profile)
Posted on: Wednesday, August 02, 2006 at 8:49 AM
Message: How nice to see the famous Nigel Rivett 'in harness'. It is just the sort of article I want to read. It is about time he was featured as 'Geek of the Week'.

Subject: What else can CTE do?
Posted by: Anonymous (not signed in)
Posted on: Friday, August 18, 2006 at 11:55 PM
Message: Interesting.

Is there something else beyond recursion that a CTE can do?
Does recursion with CTE any better performance wise that a controlled loop?

Subject: re What else can CTE do?
Posted by: Anonymous (not signed in)
Posted on: Wednesday, August 23, 2006 at 11:16 AM
Message: In general a CTE should be faster than a loop but it probably depends on what you are doing and how you code it.

As to other things - a CTE is just a table that can be used in the following query. That is quite useful on it's own as it does away with the need to repeat code.
The recursion bit is an extra and there are a lot of things that can be done with it by varying what is returned by the various union'd clauses.

Subject: great article
Posted by: Anonymous (not signed in)
Posted on: Saturday, August 26, 2006 at 3:49 PM
Message: I have been poking a little into CTE's. From the limited examples I have seen elsewhere, I didn't really see much that I didn't already have a reasonable approach to handle. After reading this, I'll take the time to play some more.

Subject: Great Explanations and Examples
Posted by: Anonymous (not signed in)
Posted on: Thursday, September 07, 2006 at 11:51 AM
Message: I used to use derived tables all the time. Now that we have upgraded to SQL Server 2005, I have started using CTEs and LOVE them even more than derived tables. CTEs can really help clean up a complicated query and make it easier to understand.

I've read several articles on CTEs and while I understood the theory behind a recursive CTE, I was never able to get a good understanding of what was really going on. The explanations and real-life examples in this article were just perfect for helping me grasp how to use recursive CTEs. Many thanks!

One of your examples will directly help one of my co-workers who just the other day told me about his web app which allows users to enter a comma list of values. My co-worker wanted to know if SQL Server 2005 had a way besides loops and ouside functions for converting such a list to a record set. I now have a good answer to that question.

Subject: Very good article
Posted by: Anonymous (not signed in)
Posted on: Friday, September 08, 2006 at 1:21 PM
Message: I have written som blogposts about CTE's myself. Just would like to point out that
option(maxrecursion 0) gives an unlimited number of recursions
My own samples can be found at:
http://codeinet.blogspot.com

Subject: re: recursion limit
Posted by: nigelrivett (view profile)
Posted on: Wednesday, September 13, 2006 at 10:27 AM
Message: Thanks.
You're right I should have mentioned that 0 eliminates the limit.
I don't like using it as it can get you into trouble with unexpectedly long running CTEs.

I place the blame squarely with the site editor (who is responsible for most of the quality of that article) for not pointing out the omission :).

Subject: bug?
Posted by: Anonymous (not signed in)
Posted on: Wednesday, September 20, 2006 at 5:34 AM
Message: In second example,

with MyCTE(x)as(select top 10 x = id from sysobjects)select x, maxx = (select max(x) from MyCTE), pct = 100.0 * x / (select sum(x) from MyCTE)from MyCTE

value of x is between 4 to 29. Then how come max of x is returning 2137058649.

I tried min of x and it is returning correct value as 4

Subject: bug?
Posted by: Anonymous (not signed in)
Posted on: Wednesday, September 20, 2006 at 6:24 AM
Message: In second example,

with MyCTE(x)as(select top 10 x = id from sysobjects)select x, maxx = (select max(x) from MyCTE), pct = 100.0 * x / (select sum(x) from MyCTE)from MyCTE

value of x is between 4 to 29. Then how come max of x is returning 2137058649.

I tried min of x and it is returning correct value as 4

Subject: re: bug?
Posted by: nigelrivett (view profile)
Posted on: Friday, September 22, 2006 at 5:00 PM
Message: Good spot - it is a little odd isn't it.
The clue is in my comment that it is actually accessing the underlying table multiple times.

If you look at the query plan you will see that it is returning an ordered resultset for the max (giving the max id from sysobjects) but unordered for the other accesses. It doesn't do this in master - the query plan is different there due to the way the sysobjects view is presented.

If you order the set in the CTE then the problem goes away
with MyCTE(x)as(select top 10 x = id from sysobjects order by id)
select x,
maxx = (select max(x) from MyCTE) ,
pct = 100.0 * x / (select sum(x) from MyCTE)
from MyCTE

Of course if the CTE query isn't ordered then you can't expect the result to be the same for each run - but you might expect it to be consistent for the query.

I'll mention it to the Powers That Be and see what they say.

Subject: bug - response from Microsoft
Posted by: nigelrivett (view profile)
Posted on: Saturday, September 23, 2006 at 6:48 AM
Message: Had a response from Umachandar Jayachandran.

"In SQL Server, CTEs are similar to views in that the definition or query
expression is replaced inline in the query in all references. So you don't
really get any results caching etc."

"In the standards, the WITH clause was mainly introduced to support recursive
queries. And the results of query in the WITH clause results in a virtual
table that can be used multiple times. This is different from syntax level
substitution that happens now in SQL Server."

"I guess there is surely room for improvement if you compare with the
standards. And depending on how you use CTE you can actually get confusing
results if you think of it as a temporary result set that doesn't change for
the duration of execution of the query.


So sounds like it's just something you have to remember. Except with a recursive CTE the query is included in the following statement so can be calculated multiple times - not necessarily giving the same reult every time.
The example given would be unlikely to turn up in the real world as it would probably have an order by and the problem goes away.

It is worth remembering though as there are cases which could catch you out
I can see
with a(i)
as
(select i = rand()
)
select (select i from a), (select i from a)

being used - this will produce two different rand values.
Although
with a(i)
as
(select i = rand()
)
select i, i from a

produces one. I can imagine that this depends on the optimiser though and probably isn't guaranteed. Would be nice to find a query that gave an example of that.

Subject: Re: bug - response from Microsoft
Posted by: Anonymous (not signed in)
Posted on: Wednesday, September 27, 2006 at 6:15 AM
Message:
Thanks for the reply. It seems to be clear now. :)

Subject: Excellent article
Posted by: Anonymous (not signed in)
Posted on: Thursday, October 05, 2006 at 10:15 AM
Message: I love this article and I have a question, it is possible to use a CTE do convert a table into a CSV?

Subject: Excellent article
Posted by: Anonymous (not signed in)
Posted on: Thursday, October 05, 2006 at 12:16 PM
Message: I love this article and I have a question, it is possible to use a CTE do convert a table into a CSV?

Subject: Excellent article
Posted by: Anonymous (not signed in)
Posted on: Thursday, October 05, 2006 at 4:01 PM
Message: I love this article and I have a question, it is possible to use a CTE do convert a table into a CSV?

Subject: re: convert a table into a CSV
Posted by: nigelrivett (view profile)
Posted on: Wednesday, October 11, 2006 at 6:52 AM
Message: You wouldn't use a CTE for this. It is simple to do from a query.
I'll write a brief article about how to do this shortly.

Subject: How to put CTE in store procedure
Posted by: Anonymous (not signed in)
Posted on: Monday, December 04, 2006 at 3:59 AM
Message: Hi, Thanks for the tips, but i try to put the MYCTE in store procedure but i get an error

Subject: re: How to put CTE in store procedure
Posted by: nigelrivett (view profile)
Posted on: Tuesday, December 12, 2006 at 10:14 AM
Message: Not a lot to go on here but lets guess:

You're using sql server v2000?
Your database is in v2000 compatability mode?
The preceding statement has to be terminated with a semi-colon?

Subject: Please explain
Posted by: Anonymous (not signed in)
Posted on: Friday, January 12, 2007 at 1:02 PM
Message: Hello, sir,
This is a good article.

When I try the query:
with MyCTE(x)
as
(
select x = convert(varchar(8000),'hello')
union all
select x + 'a' from MyCTE where len(x) < 100
)
select x from MyCTE

I have to give varchar(8000). But I got error when I change to varchar(7999). What is reason for it?




Subject: Sorry, please ingore my previous post
Posted by: Anonymous (not signed in)
Posted on: Friday, January 12, 2007 at 1:04 PM
Message: Sorry, please ingore my previous post

Subject: Dynamic SQL Queries
Posted by: Anonymous (not signed in)
Posted on: Wednesday, February 07, 2007 at 5:54 AM
Message: Is there any way to execute dynamic sql queries inside With Statement?

<a href="http://www.netaffiliatebase.com">http://www.netaffiliatebase.com</a>

Subject: re: Dynamic SQL Queries
Posted by: nigelrivett (view profile)
Posted on: Friday, February 09, 2007 at 5:50 PM
Message: I doubt it.
You could probably populate a temp table from dynamic sql and access that or maybe try an openquery on a self linked database.

Subject: Collapsing to CSV column
Posted by: Anonymous (not signed in)
Posted on: Friday, March 02, 2007 at 12:46 PM
Message: >You wouldn't use a CTE for this. It is simple
>to do from a query.
>I'll write a brief article about how to do this
>shortly.

I saw your article on exporting to CSV, and I think that maybe the person was asking about collapsing a set of rows (grouped by something) into a CSV list in one column.

Given a table "x" where "a" is the group by column there are mutliple rows per a with value column of "c":

declare @x table(a varchar(10), c varchar(10))
insert into @x (a,c) values ('a','m');
insert into @x (a,c) values ('a','p');
insert into @x (a,c) values ('a','o');
insert into @x (a,c) values ('b','p');
insert into @x (a,c) values ('b','q');
insert into @x (a,c) values ('c',null);
insert into @x (a,c) values ('d','x');
insert into @x (a,c) values ('e','o');
insert into @x (a,c) values ('e','q');

with ranked (a, rank, c)
as (select a, rank() over(partition by a order by c), c
from @x
where c is not null),
concat (a, rank, c)
as ( -->base set
select a, rank, cast(c as varchar(1000))
from ranked
where rank = 1
union all
-->recursive set
select ranked.a, ranked.rank, cast(concat.c + ', ' + ranked.c as varchar(1000))
from ranked
inner join concat on ranked.a = concat.a and ranked.rank -1 = concat.rank),
full_list_only (a, c)
as (select a, c
from concat
join (select a, max(rank) max_rank from ranked group by a) max_rank
on concat.a = max_rank.a and concat.rank = max_rank.max_rank)

"select * from concat" would return:
a rank c
----------------
a 1 m
b 1 p
d 1 x
e 1 o
e 2 o, q
b 2 p, q
a 2 m, o
a 3 m, o, p

"select * from full_list_only" would return:
a c
-------
d x
e o, q
b p, q
a m, o, p

I won't bother explaining it - is just follwos the same logic as the other recursive CTE examples

Subject: C# to Recursive CTE translation help
Posted by: Anonymous (not signed in)
Posted on: Sunday, March 18, 2007 at 11:50 PM
Message: I wrote an algorithm to get all the relatives of a person in a family tree. I'm basically getting all the users from the DB and am doing the recursive logic in code, so that there is only 1 call made to the DB. However, I am trying to do the same thing within a stored procedure in SQL using recursive CTEs (I think the performance might be better) but I'm finding it really tough to craft the CTE. I would really appreciate if someone could translate the following code to a recursive CTE in SQL. I tried to write the SP myself but I am not getting the right results. Here's the (incorrect) SP, and just below that is the C# code that works correctly and an explanation of my schema:

ALTER PROCEDURE [dbo].[spGetFamilyMembersForUser]
@UserName varchar(50)
AS
BEGIN

declare @familyid int
set @familyid = (select AsChildFamilyID from Users where UserName = @UserName);

WITH Ancestors (UserName, Gender, AsChildFamilyID, AsSpouseFamilyID, AsParentFamilyID) AS
(
-- Base case (get father of user)
SELECT u.UserName, u.Gender, u.AsChildFamilyID, u.AsSpouseFamilyID, u.AsParentFamilyID
FROM Users u
WHERE u.AsParentFamilyID = @familyid AND u.Gender = 'Male'

UNION

-- Base case (get mother of user)
SELECT u.UserName, u.Gender, u.AsChildFamilyID, u.AsSpouseFamilyID, u.AsParentFamilyID
FROM Users u
WHERE u.AsParentFamilyID = @familyid AND u.Gender = 'Female'


UNION ALL

-- Recursive step
SELECT u.UserName, u.Gender, u.AsChildFamilyID, u.AsSpouseFamilyID, u.AsParentFamilyID
FROM Users u
INNER JOIN Ancestors a ON a.AsChildFamilyID = u.AsParentFamilyID
WHERE u.AsChildFamilyID <> -1
)

SELECT * FROM Ancestors
WHERE Ancestors.UserName <> @UserName

END


//FYI

1. _dtAllUsers contains all users
2. _dtAllRelatives is initially empty, and is a clone of _dtAllUsers
3. The Users table has the following columns:

UserName, Gender, AsChildFamilyID, AsSpouseFamilyID, AsParentFamilyID

AsChildFamilyID means the family(ID) in which the user is a child. Similarly for AsSpouseFamilyID and AsParentFamilyID.
A value of -1 means that the user does not have that role in the family. For instance, if AsChildFamilyID is -1, then user is not a child in that family.

Example:
UserName, Gender, AsChildFamilyID, AsSpouseFamilyID, AsParentFamilyID

Joe, Male, 1, -1, -1
Jack, Male, 2, 1, 1
Jill, Female, -1, 1, 1
Jim, Male, -1, -1, 2

In this example, Jack and Jill are Joe's parents. Jim is Jack's father.

// POST-CONDITION

1. _dtAllRelatives contains all relatives for user

public void ComputeAllRelativesForUser(string userName)
{
DataTable dtUser = UserManager.GetUser(userName);

int asChildFamilyID = (int)dtUser.Rows[0]["AsChildFamilyID"];
int asSpouseFamilyID = (int)dtUser.Rows[0]["AsSpouseFamilyID"];
int asParentFamilyID = (int)dtUser.Rows[0]["AsParentFamilyID"];

string motherUserName = string.Empty; //the username of the user's mother
string fatherUserName = string.Empty;
string spouseUserName = string.Empty;
string siblingUserName = string.Empty;
string childUserName = string.Empty;

int fathersAsChildFamilyID = -1; //the AsChildFamilyID of user's father
int mothersAsChildFamilyID = -1;
int spousesAsChildFamilyID = -1;
int siblingsAsChildFamilyID = -1;
int childsAsChildFamilyID = -1;

foreach (DataRow row in _dtAllUsers.Rows)
{
//get Father
if ((int)row["AsParentFamilyID"] == asChildFamilyID && asChildFamilyID != -1 && (string)row["Gender"] == "Male")
{
_dtAllRelatives.Rows.Add(row);
fatherUserName = (string)row["UserName"];
fathersAsChildFamilyID = (int)row["AsChildFamilyID"];
}

//get Mother
if ((int)row["AsParentFamilyID"] == asChildFamilyID && asChildFamilyID != -1 && (string)row["Gender"] == "Female")
{
_dtAllRelatives.Rows.Add(row);
motherUserName = (string)row["UserName"];
mothersAsChildFamilyID = (int)row["AsChildFamilyID"];
}

//get Spouse
if ((int)row["AsSpouseFamilyID"] == asSpouseFamilyID && asSpouseFamilyID != -1)
{
_dtAllRelatives.Rows.Add(row);
spouseUserName = (string)row["UserName"];
spousesAsChildFamilyID = (int)row["AsChildFamilyID"];
}

//get Sibling
if ((int)row["AsChildFamilyID"] == asChildFamilyID && asChildFamilyID != -1)
{
_dtAllRelatives.Rows.Add(row);
siblingUserName = (string)row["UserName"];
siblingsAsChildFamilyID = (int)row["AsChildFamilyID"];
}

//get Child
if ((int)row["AsChildFamilyID"] == asParentFamilyID && asParentFamilyID != -1)
{
_dtAllRelatives.Rows.Add(row);
childUserName = (string)row["UserName"];
childsAsChildFamilyID = (int)row["AsChildFamilyID"];
}

//recursive for father
if (fathersAsChildFamilyID != -1)
ComputeAllRelativesForUser(fatherUserName);

//recursive for mother
if (mothersAsChildFamilyID != -1)
ComputeAllRelativesForUser(motherUserName);

//recursive for spouse
if (spousesAsChildFamilyID != -1)
ComputeAllRelativesForUser(spouseUserName);

//recursive for sibling
if (siblingsAsChildFamilyID != -1)
ComputeAllRelativesForUser(siblingUserName);

//recursive for child
if (childsAsChildFamilyID != -1)
ComputeAllRelativesForUser(childUserName);
}
}

Subject: CTE is really Fantastic feature in Sql-Server 2005
Posted by: Anonymous (not signed in)
Posted on: Tuesday, May 15, 2007 at 12:16 AM
Message: I have used CTE in my Project. It is very useful for linked Items.

Subject: CTE is really Fantastic feature in Sql-Server 2005
Posted by: Anonymous (not signed in)
Posted on: Tuesday, May 15, 2007 at 12:18 AM
Message: I have used CTE in my Project. It is very useful for linked Items at n level.

Subject: thank u...
Posted by: Anonymous (not signed in)
Posted on: Tuesday, May 22, 2007 at 1:53 AM
Message: nice article..it helped me a lot.
thank you very much Nigel Rivett sir..

Subject: Thanks
Posted by: Michael (view profile)
Posted on: Friday, June 08, 2007 at 3:22 PM
Message: Cool stuff!

Subject: to count number of nodes
Posted by: Anonymous (not signed in)
Posted on: Saturday, June 09, 2007 at 3:01 AM
Message: I have a table stucture like this :-

CREATE TABLE [dbo].[blogs_friend](
[id] [bigint] IDENTITY(1,1) NOT NULL,
[user_id] [uniqueidentifier] NOT NULL,
[friend_id] [uniqueidentifier] NOT NULL,
[friend_accepted] [char](1) COLLATE Latin1_General_CI_AS NOT NULL CONSTRAINT [DF_blogs_friend_friend_accepted] DEFAULT ('N')
) ON [PRIMARY]

GO
SET ANSI_PADDING OFF


I want to count the number of people one person is connected to?


i use this query to get a list of friends attached to a user

friend_id = case when user_id = @id then friend_id when friend_id = @id then user_id end
from dbo.blogs_friend
where user_id =@id or friend_id = @id

Subject: to count number of nodes
Posted by: Anonymous (not signed in)
Posted on: Saturday, June 09, 2007 at 3:11 AM
Message: I have a table stucture like this :-

CREATE TABLE [dbo].[blogs_friend](
[id] [bigint] IDENTITY(1,1) NOT NULL,
[user_id] [uniqueidentifier] NOT NULL,
[friend_id] [uniqueidentifier] NOT NULL,
[friend_accepted] [char](1) COLLATE Latin1_General_CI_AS NOT NULL CONSTRAINT [DF_blogs_friend_friend_accepted] DEFAULT ('N')
) ON [PRIMARY]

GO
SET ANSI_PADDING OFF


I want to count the number of people one person is connected to?


i use this query to get a list of friends attached to a user

friend_id = case when user_id = @id then friend_id when friend_id = @id then user_id end
from dbo.blogs_friend
where user_id =@id or friend_id = @id

Subject: re: to count number of nodes
Posted by: nigelrivett (view profile)
Posted on: Tuesday, June 12, 2007 at 10:42 AM
Message: You have a hierarchy and need to traverse it at then count the number of entries.
Traversing a hierarchy using a CTE is covered in bol (see the link in the above article) then just do a count(*) (or count distnict) on the result.

Subject: Incorrect syntax near the keyword 'with' CTE expression or xmlnamespaces
Posted by: Hari (view profile)
Posted on: Monday, June 18, 2007 at 3:52 AM
Message: Hello Nigel,
If i use an IF Exists statement before the CTE i get the abov error for some database and the same query works fine for some db. Is there some configuration change i need to do?


Subject: re: Incorrect syntax near the keyword 'with' CTE expression or xmlnamespaces
Posted by: nigelrivett (view profile)
Posted on: Thursday, June 21, 2007 at 9:46 AM
Message: You will need a ; before the cte definition.

I don't think it's possible to use a CTE in a subquery.

Are you sure it works for some databases?



Subject: Nice Article
Posted by: Badri (not signed in)
Posted on: Thursday, July 26, 2007 at 6:43 AM
Message: Its a Nice article and very understandable

Subject: thanks
Posted by: alan (view profile)
Posted on: Tuesday, August 28, 2007 at 2:47 AM
Message: A really wonderful article, thanks very much.

Subject: MAXRECURSION issue
Posted by: Anonymous (not signed in)
Posted on: Tuesday, September 25, 2007 at 7:42 AM
Message: I am not able to compile following function when i use MAXRECURSION option

it gives error message
"Incorrect syntax near the keyword 'OPTION'."
Can u please as what is the issue here.

ALTER FUNCTION [dbo].[CreateVarcharList](@sData varchar(8000), @sDelim char(1))
RETURNS TABLE
AS
RETURN
(WITH csvtbl(i,j)
AS
(
SELECT i = 1, j = CHARINDEX(@sDelim, @sData + @sDelim)
UNION ALL
SELECT i = j + 1, j = CHARINDEX(@sDelim, @sData + @sDelim, j + 1)
FROM csvtbl
WHERE CHARINDEX(@sDelim, @sData + @sDelim, j + 1) <> 0
)
SELECT CAST(SUBSTRING(@sData, i, j - i) as Varchar) AS ListValue
FROM csvtbl OPTION (MAXRECURSION 1000))

Subject: MAXRECURSION issue
Posted by: Anonymous (not signed in)
Posted on: Tuesday, September 25, 2007 at 10:54 PM
Message: I am not able to compile following function when i use MAXRECURSION option

it gives error message
"Incorrect syntax near the keyword 'OPTION'."
Can u please as what is the issue here.

ALTER FUNCTION [dbo].[CreateVarcharList](@sData varchar(8000), @sDelim char(1))
RETURNS TABLE
AS
RETURN
(WITH csvtbl(i,j)
AS
(
SELECT i = 1, j = CHARINDEX(@sDelim, @sData + @sDelim)
UNION ALL
SELECT i = j + 1, j = CHARINDEX(@sDelim, @sData + @sDelim, j + 1)
FROM csvtbl
WHERE CHARINDEX(@sDelim, @sData + @sDelim, j + 1) <> 0
)
SELECT CAST(SUBSTRING(@sData, i, j - i) as Varchar) AS ListValue
FROM csvtbl OPTION (MAXRECURSION 1000))

Subject: Can we use more than one CTEs?
Posted by: Anoop (not signed in)
Posted on: Wednesday, September 26, 2007 at 1:40 AM
Message: Can we use more than one CTEs?

for eg:

with MyCTE1(x1)as(select x1='hello')
with MyCTE2(x2)as(select x2='hello')
select x1 from MyCTE1
union
select x2 from MyCTE2




Subject: Can we use more than one CTEs?
Posted by: Anoop (not signed in)
Posted on: Wednesday, September 26, 2007 at 8:16 AM
Message: Can we use more than one CTEs?

for eg:

with MyCTE1(x1)as(select x1='hello')
with MyCTE2(x2)as(select x2='hello')
select x1 from MyCTE1
union
select x2 from MyCTE2




Subject: See you...Thanks, its great to receive such positive comments.
Posted by: Elias (not signed in)
Posted on: Friday, October 05, 2007 at 8:40 AM
Message: See you...Thanks, its great to receive such positive comments.
<a href=http://big-boobs.yuoi.biz > big boobs </a>
[url=http://buffie-the-body.yuoi.biz] buffie the body [/url] [url=http://beasiality.yuoi.biz] beastiality [/url]

Subject: CTE can be referenced within views... How to reference in Crystal Reports?
Posted by: Cores (not signed in)
Posted on: Wednesday, October 17, 2007 at 9:36 AM
Message: Thanks for excellent article and postings!!

I manually recreate views that I used for Oracle for using now in Sql Server. Then I will reference these views in Crystal Reports.
One of the views in Oracle contains CONNECT BY feature for hierarchy. I used CTE instead of CONNECT BY to solve this hierarhy problem for SQL Server. It works well, but the definition of this CTE is not stored in SQL Server metadata. How can I reference this CTE in Crystal Reports as a source?
It is said, CTE can be referenced within views, how?? Do you have examples?
Thank you!

Subject: Anonymous Comments Disabled
Posted by: Sarah Grady (view profile)
Posted on: Tuesday, October 23, 2007 at 8:32 AM
Message: Due to high volume of spamming, anonymous comments have been disabled.

Subject: create view on Recursive CTE FAILS with option
Posted by: Gopal (view profile)
Posted on: Thursday, January 31, 2008 at 9:38 AM
Message: Following is my code -

create view vw_cte as
with cte(m) as
(
select datediff(m, '1900-01-01', getdate())
union all
select m + 1 from cte where m < 2000
)
select m from cte
option (maxrecursion 0)


throws an error -
Msg 156, Level 15, State 1, Procedure vw_cte, Line 9
Incorrect syntax near the keyword 'option'.


What am I missing here?

Subject: Regarding the Types don't match between the anchor and the recursive part
Posted by: hariaspind (view profile)
Posted on: Thursday, June 18, 2009 at 2:35 PM
Message: I tried with Varchar(max) as below ,i didn't throw an error.It works fine

with MyCTE(x)
as(
select x = convert(Varchar(max),'hello')
union all
select x + 'a' from MyCTE where len(x) < 100
)
select x from MyCTE order by x

Subject: Regarding the Types don't match between the anchor and the recursive part
Posted by: hariaspind (view profile)
Posted on: Friday, June 19, 2009 at 12:49 PM
Message: I tried with Varchar(max) as below ,i didn't throw an error.It works fine

with MyCTE(x)
as(
select x = convert(Varchar(max),'hello')
union all
select x + 'a' from MyCTE where len(x) < 100
)
select x from MyCTE order by x

Subject: Doesn't seem traditional recursion
Posted by: Michael Mournier (view profile)
Posted on: Sunday, November 29, 2009 at 11:34 AM
Message: First of all congratulations to Nigel for such a detailed yet easy to grasp article on CTEs.
However as far as concept of recursion in CTEs is concerned, this does not seems 'traditional' recursion to me.
A traditional recursive function when executed keeps branching out into successive recursive calls until the leaf node (base condition) is reached. From the leaf node onwards the result is collected in reverse hierarchy until the root node is reached. This root node is the main function call. Thus 'actual work' starts only when the leaf node is reached ( or when the last recursive call is reached)

An example of this traditional recursion would a a function to calculate factorial of a number. If the function were used to calculate 5 factorial, it would execute as below

Return 5 * 4 Factorial
Return 4 * 3 Factorial
Return 3 * 2 Factorial
Return 2 * 1 Factorial
Return 1 [Leaf node reached]



However in case of CTE recursion, the base condition is executed at the root itself. Also the recursion structure look more like a spiral 'While' loop rather than a tree structure.

I hope what I say makes sense. In short though I am able to understand everything in the article, I am not able to decipher the recursion in CTE when I look at it from what I think 'traditional' recursion is.





Subject: Doesn't seem traditional recursion
Posted by: Michael Mournier (view profile)
Posted on: Sunday, November 29, 2009 at 11:36 AM
Message: First of all congratulations to Nigel for such a detailed yet easy to grasp article on CTEs.
However as far as concept of recursion in CTEs is concerned, this does not seems 'traditional' recursion to me.
A traditional recursive function when executed keeps branching out into successive recursive calls until the leaf node (base condition) is reached. From the leaf node onwards the result is collected in reverse hierarchy until the root node is reached. This root node is the main function call. Thus 'actual work' starts only when the leaf node is reached ( or when the last recursive call is reached)

An example of this traditional recursion would a a function to calculate factorial of a number. If the function were used to calculate 5 factorial, it would execute as below

Return 5 * 4 Factorial
Return 4 * 3 Factorial
Return 3 * 2 Factorial
Return 2 * 1 Factorial
Return 1 [Leaf node reached]



However in case of CTE recursion, the base condition is executed at the root itself. Also the recursion structure look more like a spiral 'While' loop rather than a tree structure.

I hope what I say makes sense. In short though I am able to understand everything in the article, I am not able to decipher the recursion in CTE when I look at it from what I think 'traditional' recursion is.





Subject: Excellent..
Posted by: psingla (view profile)
Posted on: Friday, July 23, 2010 at 3:31 AM
Message: simply superb,,
Thanks a lot,,,

Subject: limitation of CTE
Posted by: Dhaneenja (view profile)
Posted on: Sunday, October 17, 2010 at 4:48 AM
Message: hi Nigel Rivett,
thanks for sharing valuable information about CTE.i have notice some issue working with CTE. is there has any limitation working with XML. can you pls post what are the limitation of CTE.
here is the error code when you apply CTE,

----
DECLARE @XML_OLD xml

SELECT @XML = N'<root><r>' + replace('aaa.bbb.ccc.dddd.ee', '.', '</r><r>') + '</r></root>'

WITH MyCTE(col)
AS
(
SELECT r.value('.','VARCHAR(25)') AS col
FROM @XML.nodes('//root/r') AS RECORDS(r)
)

SELECT * FROM MyCTE
go
----
after execute above code it will return error.

Thanks
Tharindu Dhaneenja

 

Phil Factor
Searching for Strings in SQL Server Databases

Sometimes, you just want to do a search in a SQL Server database as if you were using a search engine like Google.... Read more...

 View the blog

Top Rated

SQL Server XML Questions You Were Too Shy To Ask
 Sometimes, XML seems a bewildering convention that offers solutions to problems that the average... Read more...

Continuous Delivery and the Database
 Continuous Delivery is fairly generally understood to be an effective way of tackling the problems of... Read more...

The SQL Server Sqlio Utility
 If, before deployment, you need to push the limits of your disk subsystem in order to determine whether... Read more...

The PoSh DBA - Reading and Filtering Errors
 DBAs regularly need to keep an eye on the error logs of all their SQL Servers, and the event logs of... Read more...

MySQL Compare: The Manual That Time Forgot, Part 1
 Although SQL Compare, for SQL Server, is one of Red Gate's best-known products, there are also 'sister'... Read more...

Most Viewed

Beginning SQL Server 2005 Reporting Services Part 1
 Steve Joubert begins an in-depth tour of SQL Server 2005 Reporting Services with a step-by-step guide... Read more...

Ten Common Database Design Mistakes
 If database design is done right, then the development, deployment and subsequent performance in... Read more...

SQL Server Index Basics
 Given the fundamental importance of indexes in databases, it always comes as a surprise how often the... Read more...

Reading and Writing Files in SQL Server using T-SQL
 SQL Server provides several "standard" techniques by which to read and write to files but, just... Read more...

Concatenating Row Values in Transact-SQL
 It is an interesting problem in Transact SQL, for which there are a number of solutions and... Read more...

Why Join

Over 400,000 Microsoft professionals subscribe to the Simple-Talk technical journal. Join today, it's fast, simple, free and secure.