Gail Shaw

Finding the causes of poor performance in SQL Server, Part 2

03 April 2009

In the first part of this series of articles, Gail showed how to find the procedures that contributed most to performance problems. In this final part, she shows how to use query statistics to pinpoint the queries that are causing the biggest problems, and then use the query execution plans to find out which operations are the root cause. With this information, one can design and modify the indexes that optimize the plans, and therefore the performance of the queries.

In Part 1 of this 2-part series, I took a fictional problem, in which I was the DBA responsible for maintaining an internet forum. The forum was performing poorly due to a lack of indexing. I showed how to use server-side Profiler tracing to capture the queries and stored procedures that represented a typical server workload. By aggregating, at the stored procedure level, the execution statistics from the Profiler trace, I was able to identify the three worst-performing stored procedures.

Here, in Part 2, I'm going to take apart these procedures, and identify the queries within them that are having the biggest impact on performance, and why. Having defined the main problems, I'm going to investigate how to make the queries run faster, by implementing appropriate indexes. Finally, I'll review and report on the performance improvements observed when the changes are deployed to production.

If you'd like to work through the example, you'll need to download the WebForums database, which you can find a link to in the speech-bubble at the head of the article.

NOTE:   You may not, for a number of reasons, see the same performance values, or execution plans, as I have reported in the article. Firstly, my tests were performed on SQL 2008, and you may see different plans on other versions. Secondly, data volume may well affect the plans; my original copy of the database has 180 000 records in the Threads table and 1.2 million in the Posts table, whereas the copy provided with this article has 2500 and 75000, respectively.

The Execution Plan and Query Stats

In Part 1, the aggregated execution statistics identified the following three stored procedures, as the main contributors to the performance problems:

  1. ViewThread. This procedure is called any time a user views a forum thread on the website. It returns the title of the thread and all of the posts, as well as the details of who made each post
  2. ViewForum. This procedure is called any time a user views the list of threads within a forum. It returns a subset of the threads within the forum, based on the PageNo parameter
  3. AddPost. This procedure is called whenever a user adds a new post to an existing thread. The procedure adds the row to the Posts table, updates the user's post count and updates the Threads table with the last modified date

The statistics from Part 1 suggest that the performance impact of the first two procedures, ViewThread and ViewForum, is far greater than AddPost, both in terms of query duration and IO load.

Therefore, I'll focus my tuning efforts on ViewThread and ViewForum. I'll investigate each one in turn, examining the query statistics and execution plans for the queries contained within. In order to do this, I'm going to load up Management Studio, connect to a development system and run the procedure in question, with the following three options switched on:

  1. Statistics IO – this will show me all the tables that the procedure is reading, how many times each table was scanned, and how many reads were done
  2. Statistics Time – this will show me the exact duration and CPU time of each statement within the procedure
  3. Execution plan – this will show me the steps taken to execute the query, and the estimated cost of each operation that comprises the plan

NOTE: It is worth noting that, regardless of whether you view the estimated or actual execution plans, the costs are estimates. See Grant Fritchey's blog post for further details. For more general information on Execution Plans, see Grant Fritchey's e-book, my blog series on Execution plans, and the book "Inside SQL Server 2005: Query Tuning and Optimisation" by Kalen Delaney, et al.

I'll start the investigation with the worst offender, ViewThread, which had the longest duration and the highest IO.

The ViewThread Procedure

The ViewThread stored procedure is called whenever a user views a thread on the front-end web site. The procedure takes the ID of the thread that they've selected and their UserID. It updates the viewing stats for this thread, updates the user's status and then retrieves a list of the posts within the thread, along with the users who are currently viewing the thread. The full code of the procedure is shown in Listing 1:

CREATE PROCEDURE [dbo].[ViewThread] (@ThreadID INT, @UserID INT)
AS

  UPDATE Threads
    SET TotalViews = TotalViews + 1
    WHERE Threads.ThreadID = @ThreadID

  UPDATE Users
    SET CurrentStatus = 'Viewing Thread', ActiveEntity = @ThreadID
    WHERE UserID = @UserID

  SELECT *
    FROM Threads
      INNER JOIN Posts on Threads.ThreadID = Posts.ThreadID
      INNER JOIN Users on Posts.Poster = Users.UserID
      INNER JOIN UserLevel on Users.UserLevel = UserLevel.LevelID
    WHERE Threads.ThreadID = @ThreadID

  SELECT UserName, UserID
    FROM Users
    WHERE CurrentStatus NOT IN ('Viewing Forum', 'Viewing Search')
      AND ActiveEntity = @ThreadID

GO

Listing 1: The ViewThread stored procedure

I need to find out which of these queries are causing the biggest problems. In order to do this, I need to execute the procedure in a development environment, using a representative set of sample parameter values, in this case for ThreadID and UserID. These parameter values can be obtained from the Profiler trace. You'll occasionally find that a stored procedure exhibits different performance characteristics, depending on the parameter set that is supplied. In such cases, you'd need to test the stored procedure with each set of parameters that gives rise to distinct performance characteristics. In this case, however, the procedure appears to perform much the same regardless of the parameters supplied. Specifically, there is little difference in the number of reads for different parameter values, so I can pick any entry from the Profiler trace.

Figure 1 shows how I executed the stored procedure in Management Studio. First, I activate query statistics, either by typing the options, as shown, or from the Query Options dialog which is accessed from the Query menu in Management Studio. I then run the procedure within a transaction, issuing a rollback at the end. This is to ensure that I don't change any data that could cause later tests to have different performance characteristics, or even to fail altogether.
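The pattern just described can be sketched as follows; the parameter values here are placeholders standing in for values taken from the Profiler trace:

```sql
-- Enable per-statement IO and timing output for this session
SET STATISTICS IO ON
SET STATISTICS TIME ON
GO

-- Run the procedure inside a transaction and roll back, so that
-- repeated test runs see the same data (placeholder parameter values)
BEGIN TRANSACTION

EXEC ViewThread @ThreadID = 1234, @UserID = 56

ROLLBACK TRANSACTION
```

The graphical execution plan itself is switched on separately, via the "Include Actual Execution Plan" button in Management Studio.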


Figure 1: Executing ViewThread

Having generated the detailed execution characteristics for the procedure, I can map each set of characteristics to the individual queries within the procedure.

NOTE:
With only 4 queries, breaking down which of the IO and time statistics belongs to each query is relatively easy. With larger or more complex procedures, especially ones with control flow statements, it may not be as simple. One trick is to modify the stored procedure (in the dev environment) and add print statements before and after each query within the procedure so that it's very clear which IO and time stats are associated with which query.
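The trick described in the note can be sketched like this, using the first query of ViewThread as the example; the marker text is arbitrary:

```sql
-- Inside a dev copy of the procedure: bracket each query with PRINT
-- statements so its IO and time stats are easy to pick out of the output
PRINT '*** Query 1: update thread view count - start ***'

UPDATE Threads
  SET TotalViews = TotalViews + 1
  WHERE Threads.ThreadID = @ThreadID

PRINT '*** Query 1: update thread view count - end ***'
```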

Once I've done the breakdowns, this is the situation that emerges:

Query 1 (Update Thread's view count)

UPDATE Threads
  SET TotalViews = TotalViews + 1
  WHERE Threads.ThreadID = @ThreadID

IO and Time Statistics:

Table 'Threads'. Scan count 0, logical reads 3, physical reads 0.

SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 19 ms.

Execution Plan


Figure 2: Execution plan for "Update Thread's view count" query

Query 2 (Update User's status)

UPDATE Users
  SET CurrentStatus = 'Viewing Thread', ActiveEntity = @ThreadID
  WHERE UserID = @UserID

IO and Time Statistics:

Table 'Users'. Scan count 0, logical reads 2, physical reads 0.

SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 12 ms.

 

Execution Plan


Figure 3: Execution plan for "Update User's status" query

Query 3 (Retrieve thread details)

SELECT *
  FROM Threads
    INNER JOIN Posts on Threads.ThreadID = Posts.ThreadID
    INNER JOIN Users on Posts.Poster = Users.UserID
    INNER JOIN UserLevel on Users.UserLevel = UserLevel.LevelID
  WHERE Threads.ThreadID = @ThreadID

IO and Time Statistics:

Table 'UserLevel'. Scan count 0, logical reads 6, physical reads 0.
Table 'Users'. Scan count 0, logical reads 6, physical reads 0.
Table 'Posts'. Scan count 1, logical reads 184012, physical reads 0.
Table 'Threads'. Scan count 0, logical reads 3, physical reads 0.

SQL Server Execution Times:
   CPU time = 2563 ms,  elapsed time = 2615 ms.

Execution Plan


Figure 4: Execution plan for "Retrieve thread details" query

Query 4 (Retrieve Users viewing this thread)

SELECT UserName, UserID
  FROM Users
  WHERE CurrentStatus NOT IN ('Viewing Forum', 'Viewing Search')
    AND ActiveEntity = @ThreadID

IO and Time Statistics:

Table 'Users'. Scan count 1, logical reads 6, physical reads 0.

SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.

Execution Plan


Figure 5: Execution plan for "Retrieve Users viewing this thread" query

It's readily apparent that the main problem with this stored procedure is the third query (Retrieve thread details). It performs over 180,000 reads of the Posts table, and has the longest duration and highest CPU time. Furthermore, the estimated costs shown in the execution plan indicate that it contributes 100% of the cost of executing the entire procedure.

The immediate issue I have is that this is a "SELECT *" query, so it is going to be very difficult for me to create good indexes. I pay a visit to the developers to find out exactly which columns they're using from this procedure. It transpires that while they are returning all the columns, they only actually need five of them. I modify the procedure to return only those columns and, when I run it again, the execution characteristics are unchanged, but the smaller number of columns means that I'm more likely to be able to create useful indexes. I can't see any other changes that I could make to the code to make it more efficient, so I move on to take a closer look at the execution plan for this query.

My first goal is to identify and examine the highest-cost step in the plan. In this case, I don't have to look far. If I read the execution plan from right to left, the first operator that I see is a clustered index scan of the Posts table. In other words, every row of the clustered index (in this case the primary key) is being read in order to identify the rows that the query needs to return. This operation represents 100% of the cost associated with executing the query, and also explains the very high number of reads of the Posts table that we saw in the IO statistics.

So, I now know what the problem is. Later in the article, I'm going to analyse further to see if an index would be appropriate for this query and, if so, which index. However, first, I want to take a look at statistics and execution plans for the next procedure, ViewForum.

The ViewForum Procedure

The ViewForum procedure is called whenever a user views the list of threads within a forum. The procedure takes as parameters the forum ID of the forum to display, the ID of the user who made the request and the page number. The page number is used to paginate the list of threads and return just 20, as there may be a large number of threads in a forum. The first query in the procedure sets the user's status and the second returns the details of the thread for the requested page of the selected forum. The code for the procedure is shown in Listing 2:

ALTER PROCEDURE [dbo].[ViewForum] (@ForumID INT, @PageNo SMALLINT, @UserID INT)
AS

UPDATE Users
  SET CurrentStatus = 'Viewing Forum',
    ActiveEntity = @ForumID
  WHERE UserID = @UserID

SELECT * FROM (
  SELECT ForumName,
    Forums.Descr,
    Threads.ThreadID,
    Threads.Title,
    Threads.CreatedOn,
    Threads.LastModified,
    Threads.Locked,
    Threads.TotalReplies,
    Threads.TotalViews,
    Creator.UserName AS ThreadCreator,
    LastPoster.UserName AS ThreadLastPoster,
    ROW_NUMBER() OVER (ORDER BY Threads.LastModified DESC) AS ThreadNumber
  FROM Forums
    INNER JOIN Threads on Forums.ForumID = Threads.ForumID
    INNER JOIN Users Creator on Threads.CreatedBy = Creator.UserID
    INNER JOIN Users LastPoster ON Threads.LastPoster = LastPoster.UserID
  WHERE Forums.ForumID = @ForumID) sub
WHERE ThreadNumber BETWEEN @PageNo*20+1 AND (@PageNo+1)*20

Listing 2: The ViewForum Procedure
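To make the paging arithmetic concrete, here is how the BETWEEN filter in Listing 2 maps a page number to a window of 20 thread rows; a small illustrative sketch:

```sql
-- @PageNo = 0  =>  ThreadNumber BETWEEN  1 AND 20  (threads 1-20)
-- @PageNo = 1  =>  ThreadNumber BETWEEN 21 AND 40  (threads 21-40)
-- @PageNo = 2  =>  ThreadNumber BETWEEN 41 AND 60  (threads 41-60)
DECLARE @PageNo SMALLINT = 2

SELECT @PageNo*20+1 AS FirstRow, (@PageNo+1)*20 AS LastRow
```

Because ThreadNumber is a ROW_NUMBER ordered by LastModified descending, each page is simply the next 20 most recently modified threads.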

Once again, I execute the procedure in a development environment, using a representative set of sample parameter values for ForumID, PageNo and UserID, as shown in Figure 6:


Figure 6: Executing "ViewForum"

Query 1 (UPDATE User's status)

UPDATE Users
  SET CurrentStatus = 'Viewing Forum',
    ActiveEntity = @ForumID
  WHERE UserID = @UserID

IO and Time Statistics:

Table 'Users'. Scan count 0, logical reads 2, physical reads 0.

SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 32 ms.

Execution Plan


Figure 7: Execution plan for "UPDATE User's status" query

Query 2 (Retrieve list of threads)

SELECT * FROM (
  SELECT <Lots of fields, omitted for brevity>,
    ROW_NUMBER() OVER (ORDER BY Threads.LastModified DESC) AS ThreadNumber
  FROM Forums
    INNER JOIN Threads on Forums.ForumID = Threads.ForumID
    INNER JOIN Users Creator on Threads.CreatedBy = Creator.UserID
    INNER JOIN Users LastPoster ON Threads.LastPoster = LastPoster.UserID
  WHERE Forums.ForumID = @ForumID) sub
WHERE ThreadNumber BETWEEN @PageNo*20+1 AND (@PageNo+1)*20

IO and Time Statistics:

Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0.
Table 'Threads'. Scan count 1, logical reads 3263, physical reads 0.
Table 'Users'. Scan count 2, logical reads 12, physical reads 0.
Table 'Forums'. Scan count 0, logical reads 2, physical reads 0.

SQL Server Execution Times:
   CPU time = 484 ms,  elapsed time = 749 ms.

Execution plan:

Figure 8: Execution plan for "Retrieve list of threads" query

I can see immediately that the second query in the procedure is the cause of the problem, as the optimizer estimates that it contributes 100% of the cost of executing the procedure, and because of its IO impact and duration. The query appears to be well-written and, as such, there are no changes that I need to make to the code before considering how indexes might help optimize the execution plan.

The execution plan for this query is a little more complex than the previous one, so I'm going to go over it in detail. Refer to figure 8 for the annotations:

  1. A clustered index scan on the Users table. This reads all of the rows in the Users table and outputs the columns UserID and UserName. There is no predicate on this scan, so all the rows in the table are returned
  2. A clustered index scan on the Threads table. This reads all of the rows in the Threads table and outputs the columns ThreadID, Title, CreatedOn, LastModified, CreatedBy, LastPoster, TotalReplies, TotalViews and Locked. There is a predicate on this scan, on the ForumID column, meaning that the rows read by the scan are tested against the predicate and only the ones that match are returned. The estimated cost of this operator is 70% of that of the entire query, hence this clustered index scan is going to be the focus of my efforts to optimise this query
  3. Another clustered index scan on the Users table, again returning all the rows in the table and returning the same columns as the scan detailed in step 1 did.
  4. A hash match join that joins the result sets returned by the scans of the Users (step 1) and Threads (step 2) tables. Hash match joins are quite expensive, because of the memory they use to create the hash tables. If possible, I'd like to see this replaced by a less expensive join but, to do that, I need to reduce the number of rows being joined; hash joins are the most efficient joins for large result sets, so if I just added a join hint here to force a nested loop, I'd very likely make the query's performance far, far worse
  5. Another hash match join. This time it's joining the result of the second scan on the Users table (step 3) with the result of the hash join from step 4. For the same reasons that I listed above, I'd like to see this replaced by a less costly join, if possible
  6. A clustered index seek on the Forums table. The seek predicate is based on the ForumID column and returns just one row. I'm not going to worry about this. A clustered index seek is quite efficient and, since it only returns 1 row I'm not going to be able to make it any more efficient than it already is.
  7. A sort on the LastModified column. The sort is there so that SQL Server can evaluate the ROW_NUMBER function listed in the SELECT clause (ROW_NUMBER() OVER (ORDER BY LastModified DESC)). The estimated cost of this operator is 19% of the entire query. That makes it the second-highest cost operator in the entire query. Hence a secondary objective for me will be to see if there's any way I can remove the need for this sort
  8. A nested loop join. This joins the result of the clustered index seek on the Forums table (step 6) with the output of the sort (step 7). I'm not concerned about this. The estimated operator cost is 0% and a nested loop is an efficient join when one or both of the inputs have a small number of rows, which is the case here.
  9. The Segment operator is one of the operators that calculates the row_number function.
  10. The Sequence Project is also involved in the calculation of the row_number function.
  11. The TOP operator partially implements the filter on the ThreadNumber (the name given to the calculated column based on the ROW_NUMBER function) that's specified in the outer query (WHERE ThreadNumber BETWEEN @PageNo*20+1 AND (@PageNo+1)*20). The Sort operation, from step 7, left the resultset ordered by the LastModified column, which is the same column that's used to derive the row number used in the BETWEEN predicate. This means that the BETWEEN predicate can be partially implemented using the TOP operator. Specifically, it can eliminate rows beyond the upper range of the BETWEEN clause.
  12. The Filter implements the rest of the filter on the derived column ThreadNumber.

So, in summary, my primary goal for this query is to eliminate the clustered index scan on Threads (step 2), and my secondary objectives are to remove the sort (step 7) and to convert the hash joins (steps 4 and 5) to less expensive joins.

Identifying Indexes

At the end of Part 1 of this article series, I had a list of stored procedures that I needed to "fix". At this stage, I have now drilled down to a much more detailed level, and I now know the actual queries that are causing the issues, along with the specific operations being performed during the execution of these queries that are generating the heaviest workload on the server.

I'm now ready to identify the indexes that could help optimize the query execution plans, and so improve the performance of the queries. To identify potential indexes, I need to re-examine the query execution plans, focussing specifically on the operations within the execution plans that I identified as problematic, namely:

  • The clustered index scan on the Posts table – by implementing an index, we will hopefully see this replaced with a more efficient seek operation.
  • The clustered index scan on the Threads table – again, I would like to see this replaced with a more efficient seek operation.

Indexing the Posts Table

I'm going to start with the "Retrieve thread details" query, from the first procedure, ViewThread. This was originally a "SELECT *" query, but I modified it based on the columns the developers said they really needed, as shown in Listing 3:

SELECT Title, PostDate, PostContent, UserName, LevelName
  FROM Threads
    INNER JOIN Posts on Threads.ThreadID = Posts.ThreadID
    INNER JOIN Users on Posts.Poster = Users.UserID
    INNER JOIN UserLevel on Users.UserLevel = UserLevel.LevelID
  WHERE Threads.ThreadID = @ThreadID

Listing 3: The modified "Retrieve thread details" query

Figure 9 shows the execution plan for this query, with the ToolTip window displayed for the problematic clustered index scan on the Posts table:


Figure 9: The ToolTips window, displaying properties of the clustered index scan

The predicate that is applied to the rows that the scan returns is on the ThreadID column. Since that's the only filter that's listed and the number of rows returned (3) is much lower than the total number of rows in the table (1.2 million), it seems like a very good candidate for a first try at an index.

CREATE NONCLUSTERED INDEX idx_Posts_ThreadID
      ON Posts (ThreadID)

With the index created, I'm going to run the query again to see the effect that it has had, both on the execution plan and on the IO statistics. Figure 10 shows the new execution plan:

Figure 10: The new execution plan for the "Retrieve thread details" query

The new set of query stats are as follows:

Table 'UserLevel'. Scan count 0, logical reads 6, physical reads 0.
Table 'Users'. Scan count 0, logical reads 6, physical reads 0.
Table 'Posts'. Scan count 1, logical reads 15, physical reads 0.
Table 'Threads'. Scan count 0, logical reads 3, physical reads 0.

SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 27 ms.

From 184012 logical reads on the Posts table and a CPU time of 2563 ms, down to 15 logical reads and 0 ms. Not bad.

Looking at the execution plan, I can see that in place of the original clustered index scan, I now have an index seek operation on my newly-created index. The seek operation is coupled with a bookmark lookup to retrieve the data for columns that need to be returned (see the Output List in Figure 9) but are not covered by the index. I could probably improve performance still further by widening the index to make it a covering index. In this case, I'm not going to do that right now, for three reasons:

  1. 27 ms duration and 15 logical reads is well within the realms of acceptable performance for this query.
  2. I know that, in most cases, this query will return fewer than 20 rows, making the index very selective, and so the cost of the bookmark lookups negligible
  3. One of the columns that the query uses, specifically PostContent, is a VARCHAR(MAX) column. While a VARCHAR(MAX) column can be used as one of the INCLUDE columns of an index, doing so will cause the entire content of the VARCHAR(MAX) column to be stored twice; once in the table and once in the index. Since this is the largest table in the database, that's going to increase the size of my database significantly.

I'm going to make a note that, should performance problems arise again with this query, the widening of the index can be considered. For now though, I'm going to call this query done and move onto the next one.
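For reference, should that widening ever become necessary, the covering index would look something like the following. This is a hypothetical sketch, not an index I'm creating: the index name is invented, and the INCLUDE list is inferred from the columns the modified query in Listing 3 uses from the Posts table:

```sql
-- Hypothetical covering index; NOT created, for the three reasons above.
-- Note that including PostContent (VARCHAR(MAX)) would duplicate the
-- entire column's data in the index.
CREATE NONCLUSTERED INDEX idx_Posts_ThreadID_Covering
      ON Posts (ThreadID)
      INCLUDE (PostDate, PostContent, Poster)
```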

Indexing the Threads table

Listing 4 shows the problematic query from the ViewForum stored procedure, which causes the expensive clustered index scan on the Threads table:

SELECT * FROM (
  SELECT ForumName,
    Forums.Descr,
    Threads.ThreadID,
    Threads.Title,
    Threads.CreatedOn,
    Threads.LastModified,
    Threads.Locked,
    Threads.TotalReplies,
    Threads.TotalViews,
    Creator.UserName AS ThreadCreator,
    LastPoster.UserName AS ThreadLastPoster,
    ROW_NUMBER() OVER (ORDER BY Threads.LastModified DESC) AS ThreadNumber
  FROM Forums
    INNER JOIN Threads on Forums.ForumID = Threads.ForumID
    INNER JOIN Users Creator on Threads.CreatedBy = Creator.UserID
    INNER JOIN Users LastPoster ON Threads.LastPoster = LastPoster.UserID
  WHERE Forums.ForumID = @ForumID) sub
WHERE ThreadNumber BETWEEN @PageNo*20+1 AND (@PageNo+1)*20

Listing 4: The "Retrieve list of threads" query

Again, I'm going to start by taking a look at the properties of this clustered index, shown in Figure 11, in order to identify candidate columns for an index:


Figure 11: Properties of the clustered index scan operation on the Threads table

As a first step, I'm going to try the same trick as before, and create a non-clustered index on the ForumID column, as shown in Listing 5, since that is the column in the predicate that is applied to the rows the scan returns:

CREATE NONCLUSTERED INDEX idx_Threads_ForumID
      ON Threads (ForumID)

Listing 5: A simple, non-clustered index on the Threads table

When I re-run the query, however, I see that the execution plan is unchanged, and the index has not been used, as shown in Figure 12:


Figure 12: The index on ForumID does not affect the execution plan.

Once again, our index is not a covering index, as it does not contain all the columns used in the query. However, whereas before the query was highly selective and the index got used regardless, in this case the predicate matches 19113 rows out of a total of almost 190000, and so the index is not selective enough to be used. I could simply make the index covering by adding, as INCLUDE columns, all of the columns listed in the Output List of the clustered index scan. That would at least ensure that the index is used, but it may not be the best solution. Given that this query uses all but one of the columns in the Threads table, if I include them all, I'm going to be making the index very large; almost the same size as the table itself. It remains an option if I can't find any other way to make the query fast. However, I'm going to try a few other tricks first, to see if adding any of the columns to the index key will reduce the number of rows processed by the index seek.
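The selectivity figures quoted can be checked directly against the data. A quick sketch, assuming a representative @ForumID value taken from the trace (the value here is a placeholder):

```sql
-- Compare rows matching the predicate against total rows in the table;
-- a predicate matching ~10% of the table is rarely selective enough
-- for a non-covering index to be chosen
DECLARE @ForumID INT = 1

SELECT COUNT(*) AS MatchingRows FROM Threads WHERE ForumID = @ForumID
SELECT COUNT(*) AS TotalRows    FROM Threads
```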

The execution plan shows that there's a join to the Users table on the CreatedBy column, and a join to Users again on the LastPoster column, as shown in figure 13:


Figure 13: The two joins to the Users table.

Maybe if I add the CreatedBy and LastPoster columns to the index key, it will persuade the optimizer to use it.

DROP INDEX idx_Threads_ForumID ON Threads
GO

CREATE NONCLUSTERED INDEX idx_Threads_ForumIDCreatedByLastPoster
      ON Threads (ForumID, CreatedBy, LastPoster)

Listing 6: Adding the CreatedBy and LastPoster columns to the non-clustered index on the Threads table

Unfortunately, when I rerun the query, I see that, once again, the execution plan is unchanged. I've still got a clustered index scan on the Threads table. Ok, so that didn't work; one more attempt before I add all the columns as INCLUDE columns.

There's another predicate on this query, namely the BETWEEN predicate on the ThreadNumber column (see Listing 4). I can't add that column to the index, because it's a computed column and, in fact, it's a row number ordered by the LastModified column. However, I might be able to get SQL to eliminate some rows earlier than it is doing currently if the LastModified column is part of the index key.

I have a second reason for considering the LastModified column for the index. There's a sort in the middle of the execution plan that sorts by LastModified (as part of the evaluation of the Row_number), as shown in Figure 14:


Figure 14: Properties of the Sort operation.

If I add LastModified to the non-clustered index on ForumID and SQL uses that index then, after the filter on ForumID is done, the rows will already be sorted by LastModified and an explicit sort won't be required.

Let's see if theory meets reality in this case. Listing 7 recreates the index using ForumID and LastModified as key columns:

DROP INDEX idx_Threads_ForumIDCreatedByLastPoster ON Threads

CREATE NONCLUSTERED INDEX idx_Threads_ForumIDLastModified
      ON Threads (ForumID, LastModified)

Listing 7: Recreating the index using ForumID and LastModified as key columns

When I rerun the query, I see that the execution plan has changed radically. The clustered index scan is gone, replaced by a non-clustered index seek and a rather expensive-looking key lookup; the sort is gone, as it's no longer necessary; and, in addition, the hash joins have been replaced by nested loop joins, as shown in Figure 15:

Figure 15: The execution plan now shows an index seek and a key lookup.

The change of join type is due to the reduced number of rows being processed. Hash joins are efficient for joining very large numbers of rows, which is why they appeared in the previous execution plan. With an efficient index, the number of rows returned is lower, and hence the number of rows that the joins have to process is much reduced, so the nested loop join can be used instead. The nested loop join is only efficient for small numbers of rows, and hence the Query Optimiser will never select it when the number of rows being joined is very high.

The new query stats are as follows:

Table 'Users'. Scan count 0, logical reads 720, physical reads 0.
Table 'Threads'. Scan count 1, logical reads 564, physical reads 0.
Table 'Forums'. Scan count 0, logical reads 2, physical reads 0.

SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 1 ms.

That's better. I've reduced the logical reads on the Threads table from 3263 to 564, although I have also increased the logical reads on the Users table from 12 to 720. However, I have also reduced the CPU and elapsed times to virtually zero.

Since the logical reads are still rather high, and this is the most-frequently run query in the system, in this case I am going to add the rest of the columns used by the query as INCLUDE columns to this index, as shown in Listing 8:

DROP INDEX idx_Threads_ForumIDLastModified ON Threads

CREATE NONCLUSTERED INDEX idx_Threads_ForumIDLastModified
      ON Threads (ForumID, LastModified)
      INCLUDE (Title, CreatedOn, TotalReplies, TotalViews, Locked, CreatedBy, LastPoster)

Listing 8: Adding the INCLUDE columns to the index

Figure 16 shows the final execution plan for this query:

Figure 16: The final execution plan.

And here is the final set of query stats:

Table 'Users'. Scan count 0, logical reads 720, physical reads 0.
Table 'Threads'. Scan count 1, logical reads 8, physical reads 0.
Table 'Forums'. Scan count 0, logical reads 2, physical reads 0.

SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.

That's much better. I've reduced the IOs on the Threads table to 8. The IOs on the Users table are still at 720, but I don't think that will be a major problem. There's not much I can do about it, because the two operators that read from the Users table are both clustered index seeks. The high number of reads is because, in both cases, Users is the inner table of a nested loop join, and hence the seeks on that table must be executed once for each row returned from the Threads table. I'm going to leave this as it is for now; if there's a problem in future, I'll probably have to rewrite the query, as there's no more I can do on the indexing side to reduce the reads further.

I've optimised everything I set out to optimise and the queries are looking pretty good when run by themselves in dev, so now I need to take them through testing and get them deployed to the production environment.

Post-implementation Performance Review

Once I've implemented those index and procedure changes in production, I need to capture a new Profiler trace, to see what kind of effect, if any, they've had on the overall server performance.

In order to do this, I'm going to rerun the trace that I initially used to capture the server's workload (see Part 1), import it into a database table, and re-analyse the data. To ensure that the comparison is meaningful, I'm going to run the trace at the same time of day as I did before, and for the same duration.
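The import-and-aggregate step can be sketched as follows. The trace file path, target table name, and event selection are placeholders; they depend on how the server-side trace was defined in Part 1:

```sql
-- Load the server-side trace file into a table for analysis.
-- fn_trace_gettable reads a .trc file (DEFAULT includes any rollover files).
SELECT *
INTO   TraceResults
FROM   fn_trace_gettable('C:\Traces\WebForums.trc', DEFAULT);

-- Aggregate execution statistics per procedure, mirroring Figure 17.
-- In trace files from SQL 2005 onwards, Duration is in microseconds.
SELECT ObjectName,
       SUM(Duration) / 1000 AS TimeImpact,
       SUM(Reads)           AS IOImpact,
       SUM(CPU)             AS CPUImpact,
       COUNT(*)             AS ExecutionCount
FROM   TraceResults
WHERE  EventClass = 10      -- RPC:Completed
GROUP  BY ObjectName
ORDER  BY SUM(Duration) DESC;
```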

The new aggregated execution statistics are shown in Figure 17:

Figure 17: Aggregating execution statistics by stored procedure

Not only have the time impact and IO impact dropped significantly compared to the initial measurements, but the execution count has more than tripled, indicating that the application has far greater throughput after the optimisation than it did before. As a bonus, the procedure AddPost, which I decided not to look at, is running significantly faster as well. This is probably due to a decrease in blocking: since the queries that I optimised read from the Posts table and are now much more efficient, the locks they take are held for a shorter time. That means the AddPost procedure, which writes to that table, waits less time to obtain a lock and hence runs faster.
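On SQL 2005 and later, the blocking hypothesis can be sanity-checked directly while the workload is running; a sketch using the requests DMV:

```sql
-- Snapshot of sessions that are currently blocked, who is blocking them,
-- and what they are waiting on. Run this while the workload is active.
SELECT r.session_id,
       r.blocking_session_id,
       r.wait_type,
       r.wait_time        AS wait_time_ms,
       st.text            AS statement_text
FROM   sys.dm_exec_requests AS r
CROSS  APPLY sys.dm_exec_sql_text(r.sql_handle) AS st
WHERE  r.blocking_session_id <> 0;
```

Fewer and shorter-lived rows here after the index changes would support the explanation that AddPost was previously waiting on locks held by the read queries.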

The following table summarizes the effect of the indexing neatly:

 

Before:

| Procedure  | Time Impact | IO Impact | CPU Impact | Execution Count | Avg. Duration (ms) | Avg. IOs | Avg. CPU |
|------------|------------:|----------:|-----------:|----------------:|-------------------:|---------:|---------:|
| ViewThread | 1815846     | 442711386 | 1064111    | 2377            | 763                | 186247   | 447      |
| ViewForum  | 1536715     | 5193415   | 223289     | 2433            | 631                | 2134     | 91       |
| AddPost    | 229295      | 206833    | 876        | 244             | 939                | 847      | 3        |

After:

| Procedure  | Time Impact | IO Impact | CPU Impact | Execution Count | Avg. Duration (ms) | Avg. IOs | Avg. CPU |
|------------|------------:|----------:|-----------:|----------------:|-------------------:|---------:|---------:|
| ViewThread | 116150      | 7897615   | 15376      | 9765            | 11                 | 808      | 1        |
| ViewForum  | 76486       | 716541    | 6650       | 9718            | 7                  | 73       | 0        |
| AddPost    | 21708       | 383221    | 1527       | 448             | 48                 | 855      | 3        |

 Now that looks like a nice spreadsheet to hand to management come review time!

Summary

Over the course of this 2-part article, I've demonstrated how to tackle SQL Server performance problems using the following steps:

  1. Find the queries that constitute a typical workload, using SQL Profiler
  2. Aggregate the execution characteristics provided by the trace at the stored procedure level, in order to find the stored procedures that are having the most impact on the system.
  3. Use query statistics to pinpoint the queries that are causing the biggest problems, and the query execution plans to find out which operations are the root cause.
  4. Identify the indexes that can optimize the plans and therefore the overall query execution.
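As an aside, on SQL 2005 and later the plan cache offers a quicker (though less complete) way to approximate steps 1–3 without running a trace; a sketch using the query-stats DMV:

```sql
-- Top cached statements by total logical reads since they were cached.
-- Caveat: only plans still in the cache are visible, and the counters
-- reset whenever a plan is recompiled or evicted.
SELECT TOP (10)
       qs.execution_count,
       qs.total_logical_reads,
       qs.total_worker_time,          -- CPU time, in microseconds
       qs.total_elapsed_time,
       SUBSTRING(st.text,
                 qs.statement_start_offset / 2 + 1,
                 (CASE qs.statement_end_offset
                       WHEN -1 THEN DATALENGTH(st.text)
                       ELSE qs.statement_end_offset
                  END - qs.statement_start_offset) / 2 + 1) AS statement_text
FROM   sys.dm_exec_query_stats AS qs
CROSS  APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER  BY qs.total_logical_reads DESC;
```

Unlike a Profiler trace, this shows individual statements rather than whole procedures, which can make the problem queries within a procedure stand out immediately.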

It's probably apparent at this point that finding good indexes for a query is partly art, partly science, and a lot of trial and error. Identifying good indexes requires knowledge of how indexes work and of how SQL Server uses them, knowledge of the application and the data, and a fair amount of patience. For further information on how SQL Server uses indexes, see this short blog series, anything index-related on Kimberly Tripp's blog, and the recently released book “SQL Server 2008 Query Performance Tuning Distilled”, which devotes an entire chapter to index selection.

The final point to note is that, in these examples, the queries were all quite well written and, as such, did not need extensive modifications. More often, however, performance tuning is a mixture of fixing the code and tuning the indexes. Fixing the code could entail anything from converting cursors into set-based code, to tweaking the WHERE clause so that it's SARGable, to removing redundant queries. But that's a topic for another article!
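To illustrate the SARGability point above: a predicate that wraps the indexed column in a function prevents an index seek, while an algebraically equivalent predicate that leaves the column alone allows one. A hypothetical example against the Threads table:

```sql
-- Non-SARGable: the function applied to the column forces SQL Server
-- to evaluate DATEADD for every row, so the index cannot be seeked.
SELECT ThreadID, Title
FROM   Threads
WHERE  DATEADD(dd, 7, LastModified) > GETDATE();

-- SARGable: the column stands alone on one side of the comparison,
-- so an index with LastModified as a key column can be seeked
-- for the same logical result.
SELECT ThreadID, Title
FROM   Threads
WHERE  LastModified > DATEADD(dd, -7, GETDATE());
```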

Gail Shaw

Author profile:

Gail Shaw, famous for her forum contributions under the pen name 'gilamonster', is from Johannesburg, Gauteng, and works as a database consultant specialising in performance tuning and database optimisation. Before moving to consulting she worked at a large South African investment bank, where she was responsible for the performance of the major systems there.


Subject: Wonderful
Posted by: sagreene (view profile)
Posted on: Monday, April 06, 2009 at 7:13 PM
Message: Thanks for the well written peek inside your optimazing mind!

Subject: Sam
Posted by: Anonymous (not signed in)
Posted on: Tuesday, April 07, 2009 at 12:59 AM
Message: ExCellent gail m really a big fan of ur suggestions.... thankz gail

Subject: just a side question
Posted by: Anonymous (not signed in)
Posted on: Monday, April 13, 2009 at 3:08 AM
Message: I dont get it... Where are the ads on this site?

Subject: re: just a side question
Posted by: Andrew Clarke (view profile)
Posted on: Tuesday, April 14, 2009 at 2:48 AM
Message: There aren't any ads. We think that ads are distracting. The funding for Simple-Talk comes from Red Gate software.

Subject: You really should qualify your object names...
Posted by: Crispin Proctor (not signed in)
Posted on: Thursday, April 16, 2009 at 4:54 AM
Message: It is a surprising overhead when names are not owner qualified. SQL 2000 was really bad, some improvements in 2005 onwards.

See: http://sqlblog.com/blogs/linchi_shea/archive/2007/07/05/performance-impact-of-procedure-calls-without-owner-qualification-sql-server-2000.aspx for more details.


Crispin

Subject: Awesome practical tips
Posted by: Jack the DBA (view profile)
Posted on: Thursday, April 16, 2009 at 9:48 AM
Message: Love the practical tips Gail. Keep the articles coming.

Subject: Incredible
Posted by: Jeff Moden (view profile)
Posted on: Thursday, April 16, 2009 at 10:42 AM
Message: These 2 articles teach in just a couple of hours what others take 2 or more days to teach. Absolutely outstanding. Thanks for taking the time, Gail.

Subject: well written
Posted by: Jerry Hung (not signed in)
Posted on: Thursday, April 16, 2009 at 1:30 PM
Message: Well written, Gail, I like this approach of explaining the steps for each Execution Plan

We all pretty much do the same, except I usually jump to the largest % icon without looking at the whole plan

My question is
Did you run the stored procedures in SSMS 2008, and did SSMS recommend ANY "missing index"?
If so, any index same as your indexes?


Subject: Useful Information
Posted by: Harinath (not signed in)
Posted on: Thursday, April 16, 2009 at 1:46 PM
Message: Hi Gail Shaw,
Really most usefull information i read, keep it up.

Like the paractical tips of Gail.Do post.


Subject: Missing indexes
Posted by: GilaMonster (view profile)
Posted on: Thursday, April 16, 2009 at 2:47 PM
Message: Jerry, I ran all of these in 2008's management studio. I just edited out the missing index line when I put the screen shots together because I didn't want it to confuse the issue.

There were missing indexes on both of the initial queries. The index recommended in the first case had the same predicate as the one I suggested, but was covering, complete with the VARCHAR(MAX) column. For the second query SQL recommended a covering index with just the forumID in the key.

The missing index info's there in SQL 2005 as well. It's buried in the XML of the showplan.

The danger in jumping straight to the operator with the largest % is that the percentages are estimates and can be completely and totally wrong. I didn't really touch on that here, I think I need to write something on spotting bad estimates....

Subject: Well done
Posted by: Bruce W Cassidy (view profile)
Posted on: Thursday, April 16, 2009 at 3:18 PM
Message: Hi Gail,

An excellent followup to the original answer. I particularly like the fact that although you are using a synthetic problem as the basis of an article, it still seems quite realistic. Particularly the conversations with the developers!

I also like the fact that the examples you have given of poorly performing code are also things I have seen in the real world. I feel that you did an excellent job of selecting realistic examples.

There were a couple of areas where I think the text could have used better editing and proof reading (missing or incorrect words), but I think even the best of professional authors face that issue! So that's not really a critism.

I like the pacing and the way you methodically work through what needs to be done.

All in all, an excellent introduction to an incredibly complex subject. I gave you five stars. :-)

Subject: Nice
Posted by: Anonymous (not signed in)
Posted on: Thursday, April 16, 2009 at 9:35 PM
Message: It's really good. Very well explained. Especially for a beginner like me. Looking forward for another article about sql server. :D

Subject: Great Article
Posted by: Dustin Jones (not signed in)
Posted on: Friday, April 17, 2009 at 1:14 PM
Message: Great article! Well written, easy to follow examples, great explanations. Thank you!

Subject: Appreciate ur work!
Posted by: chandra (view profile)
Posted on: Wednesday, April 22, 2009 at 6:57 PM
Message: Hey...good on u mate!
Quite helpful

Subject: Great Article
Posted by: Anonymous (not signed in)
Posted on: Thursday, April 23, 2009 at 3:35 AM
Message: Appreciate your great work.

Subject: Great but you can do better
Posted by: alexlpi (view profile)
Posted on: Monday, May 04, 2009 at 10:43 AM
Message: Here is how you CAN do about ViewForum's "The IOs on the Users table" - push joining to Users toward the end of the query plan. This way you execute nested loops only for the 20 output records:
alter PROCEDURE [dbo].[ViewForum] (@ForumID INT, @PageNo SMALLINT, @UserID INT)
AS
begin
UPDATE Users set CurrentStatus = 'Viewing Forum', ActiveEntity = @ForumID
WHERE UserID = @UserID

;with Forum as
(
select
ForumName
, Forums.Descr
, Threads.ThreadID
, Threads.Title
, Threads.CreatedOn
, Threads.LastModified
, Threads.Locked
, Threads.TotalReplies
, Threads.TotalViews
, Threads.CreatedBy
, Threads.LastPoster
, ROW_NUMBER() OVER (ORDER BY Threads.LastModified DESC) AS ThreadNumber
from Forums
join Threads on dbo.Forums.ForumID = Threads.ForumID
WHERE Threads.ForumID = @ForumID
)
,ForumPage as
(
select * from Forum
where ThreadNumber BETWEEN @PageNo*20 AND (@PageNo+1)*20 -1
)
SELECT ForumPage.*
, Creator.UserName AS ThreadCreator
, LastPoster.UserName AS ThreadLastPoster
from ForumPage
join Users Creator on ForumPage.CreatedBy = Creator.UserID
join Users LastPoster ON ForumPage.LastPoster = LastPoster.UserID

end -- ViewForum

Subject: Re: Great but you can do better
Posted by: GilaMonster (view profile)
Posted on: Monday, May 04, 2009 at 11:14 AM
Message: Alex, if you note, I said in the article:
"if there's a problem in future I'll probably have to rewrite the query as there's no more I can do on the indexing side to reduce the reads further"

Your rewrite is probably what I would do if there's a future problem, but short of rewriting the query, there's nothing to be done, and the main focus of this article was fixing indexes, not rewriting queries.

Subject: Re: Great but you can do better
Posted by: alexlpi (view profile)
Posted on: Monday, May 04, 2009 at 11:38 AM
Message: You are right.
My bad.

Subject: Re: Great but you can do better
Posted by: alexlpi (view profile)
Posted on: Monday, May 04, 2009 at 11:45 AM
Message: But again, IF you rewrite the query the way I am suggesting - you can remove CreatedBy and LastPoster from the index (and that is better on the indexing side, right? :-)).

Subject: Re: Great but you can do better
Posted by: alexlpi (view profile)
Posted on: Monday, May 04, 2009 at 12:30 PM
Message: Again, on the indexing side:
I can’t believe you really like a nonclustered index covering 9 columns out of total 11 in a table.
But IF you rewrite the main (see below) query then you can remove the whole INCLUDE clause and leave only
CREATE NONCLUSTERED INDEX idx_Threads_ForumIDLastModified
ON Threads (ForumID, LastModified DESC)
As a result you would have much smaller index and still get better/less reads.

declare @ForumID INT, @PageNo SMALLINT
set @ForumID = 1
set @PageNo = 1
;with Forum as
(
select
ForumName
, Threads.LastModified
, Threads.ThreadID
, Forums.Descr
, ROW_NUMBER() OVER (ORDER BY Threads.LastModified DESC) AS ThreadNumber
from Forums
join Threads on dbo.Forums.ForumID = Threads.ForumID
WHERE Threads.ForumID = @ForumID
)
,ForumPage as
(
select * from Forum
where ThreadNumber BETWEEN @PageNo*20 AND (@PageNo+1)*20 -1
)
SELECT ForumPage.*
, Creator.UserName AS ThreadCreator
, LastPoster.UserName AS ThreadLastPoster
, Threads.Title
, Threads.CreatedOn
, Threads.Locked
, Threads.TotalReplies
, Threads.TotalViews
, Threads.CreatedBy
, Threads.LastPoster
from ForumPage
join Threads on Threads.ThreadID = ForumPage.ThreadID
join Users Creator on Threads.CreatedBy = Creator.UserID
join Users LastPoster ON Threads.LastPoster = LastPoster.UserID

Subject: Covering
Posted by: GilaMonster (view profile)
Posted on: Monday, May 04, 2009 at 2:31 PM
Message: I don't like a covering index on 9 of 11 columns. I said as much in the article.


However without rewriting the query, which as I mentioned was beyond the scope of this article, it was the only option for getting the reads to an acceptable level.

If you remove CreatedBy and LastPoster from the index (and you can remove the rest of the included columns if you do that as the index is no longer covering), then the original query's IO stats are:

Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0.
Table 'Threads'. Scan count 1, logical reads 3557, physical reads 0.
Table 'Users'. Scan count 2, logical reads 12, physical reads 0.
Table 'Forums'. Scan count 0, logical reads 2, physical reads 0.

and your rewritten query's IO stats are:
Table 'Users'. Scan count 0, logical reads 80, physical reads 0.
Table 'Threads'. Scan count 1, logical reads 3617, physical reads 0.
Table 'Forums'. Scan count 0, logical reads 2, physical reads 0

An increase of 60 on the Threads table and 68 on the users table.

The exec time of your query is very slightly better, but not significantly so (around 50ms)

Subject: Re: Covering
Posted by: GilaMonster (view profile)
Posted on: Monday, May 04, 2009 at 2:42 PM
Message: That's with just the CreatedBy and LastPoster columns removed from the include.

With the rest of the include columns dropped as well, the exec plan goes back to the one that I showed in the article (maybe the wider but noncovering index was evaluated as less efficient than a clustered index scan), and the exec stats go to this.

Original:
Table 'Users'. Scan count 0, logical reads 236
Table 'Threads'. Scan count 1, logical reads 193

Revised:
Table 'Users'. Scan count 0, logical reads 80
Table 'Threads'. Scan count 1, logical reads 63

The execution times for the two are identical.

Subject: Re: Covering
Posted by: GilaMonster (view profile)
Posted on: Monday, May 04, 2009 at 2:57 PM
Message: Oh, and I should mention that I've deleted data since I wrote this, that's why the reads are lower than the ones in the article.

Subject: Re: Covering
Posted by: alexlpi (view profile)
Posted on: Monday, May 04, 2009 at 3:35 PM
Message: Gail,
Again - a great-great-great article! Many thanks!

Agreed - this discussion is beyond the scope of the article.

But if we do go into statistics (I used SQL 2005 and the database you've provided with the article, @ForumID=1) I can see three things:

1) The index without the INCLUDE is much smaller (the covering index actually covers not 9 but 10 columns)

2) Number of reads with the covering index, and the original query is higher AND increases with @PageNo:

@PageNo = 1
Table 'Users'. Scan count 0, logical reads 160
Table 'Threads'. Scan count 1, logical reads 3
Table 'Forums'. Scan count 0, logical reads 2

@PageNo = 5
Table 'Users'. Scan count 0, logical reads 480
Table 'Threads'. Scan count 1, logical reads 4
Table 'Forums'. Scan count 0, logical reads 2

@PageNo = 10
Table 'Users'. Scan count 0, logical reads 880
Table 'Threads'. Scan count 1, logical reads 6
Table 'Forums'. Scan count 0, logical reads 2

3) Number of reads for the modified query with the much smaller index without INCLUDE is lower AND does not increase with @PageNo:

@PageNo = 1
Table 'Users'. Scan count 0, logical reads 80
Table 'Threads'. Scan count 1, logical reads 42
Table 'Forums'. Scan count 0, logical reads 2

@PageNo = 5
Table 'Users'. Scan count 0, logical reads 80
Table 'Threads'. Scan count 1, logical reads 42
Table 'Forums'. Scan count 0, logical reads 2

@PageNo = 10
Table 'Users'. Scan count 0, logical reads 80
Table 'Threads'. Scan count 1, logical reads 42
Table 'Forums'. Scan count 0, logical reads 2

Subject: Re: Covering
Posted by: GilaMonster (view profile)
Posted on: Monday, May 04, 2009 at 4:26 PM
Message: 1) Yes it is. Whether that's significant or not, whether it's a problem or not is not clear. There's no hard and fast rule as to when an index is too wide. It has to be evaluated in context of the app and in the context of the queries using the index.

2) Yes it does, because SQL's using the Modified date to push a kinda TOP predicate down into the index seek and the higher the PageNo, the larger the number of rows allowed by that TOP.

3) Not in my tests. I have the number of reads on Threads slowly increasing as the page no increases. That may be a virtue of different data. The version you downloaded is much smaller, even after I cut mine down a bit.
What I have for reads on Threads (using ForumId 8) is
PageNo = 1 Reads = 63
PageNo = 5 Reads = 64
PageNo = 10 Reads = 67
It's increasing slowly because the index is much 'denser', but it is still increasing.
The reads on Users won't change, because it's joined in later after all the filters on the threads table are done.

Subject: Re: Covering
Posted by: alexlpi (view profile)
Posted on: Monday, May 04, 2009 at 10:31 PM
Message: I rest my case :-)

Subject: Best I've seen on the subject
Posted by: Tom Brown (view profile)
Posted on: Friday, June 19, 2009 at 6:41 AM
Message: I've struggled through books on interpreting plans, and this article beats them all.
I really like the clear way you step through the plans, explaining how they relate to the query.

I did get a bit lost towards the end of the "Indexing the Post Table" section - I'm not sure where the Bookmark lookup came in, or how you could tell that from the plan.

Thanks for a Great article.

Subject: Bookmark lookup
Posted by: GilaMonster (view profile)
Posted on: Friday, June 19, 2009 at 10:06 AM
Message: If you look at listing 15, on the right-hand side of the exec plan there's an index seek and a key lookup. That key lookup is the bookmark lookup (by any other name)

Still not clear?

Subject: Great Dissection of Execution Plan!
Posted by: sqlCrazy (view profile)
Posted on: Sunday, September 19, 2010 at 5:03 PM
Message: Thank you Gail for a great article.
I have really liked the way you have dissected each operator from the execution plan, analyzed and suggested solution by trials with good reasoning.The article cleared up lots of ambiguities that I had working with big execution plans like these.

Subject: How to obtain the store procedure
Posted by: chungboon (view profile)
Posted on: Thursday, July 04, 2013 at 1:34 AM
Message: Hi,

How to retrieve the stored procedure for your procedureName, so that i can proceed to view its statistic and execution plan.

Can you explain in more details for me. Thanks

Regards,
Chung boon

 
