The Seven Sins against TSQL Performance

Grant Fritchey

31 July 2012

There are seven common antipatterns in TSQL coding that make code perform badly, and three good habits which will generally ensure that your code runs fast. If you learn nothing else from this list of great advice from Grant, just keep in mind that you should 'write for the optimizer'.

It’s not enough that your code is readable: it must perform well too.

There are three very basic rules for writing T-SQL that performs. They are cumulative. Doing all of these will have a positive impact. Skipping or changing any of these is likely to have a negative impact on your code’s performance.

  • Write to your data structures: If you are storing a datetime data type, use that in your code, not a varchar or something else. Also take advantage of the foreign key constraints and other structures that you have in place when you're writing your T-SQL code (a quick sketch follows this list).
  • Write for your indexes: If there are indexes on your tables, and there should be, write your code so that it can take advantage of them. Make sure that your clustered index (you only get one) is working well to drive your queries to your data in an efficient manner.
  • Write for the optimizer: The query optimizer is an amazing piece of software, but you can overwhelm it by writing code that isn't structured to support it, such as nesting views within views within views. Take the time to understand how the optimizer works and write your code in such a way that you help it, not hurt it.
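As a quick sketch of the first rule (the tables and names here are hypothetical, invented only to illustrate): store a date as a date, and compare it as one.

-- Hypothetical tables: the same date stored two different ways.
CREATE TABLE dbo.OrdersVarchar (OrderID INT PRIMARY KEY, OrderDate VARCHAR(30));
CREATE TABLE dbo.OrdersDatetime (OrderID INT PRIMARY KEY, OrderDate DATETIME);

-- The varchar column must be converted before every comparison,
-- which is a function on the column and blocks an index seek:
SELECT  o.OrderID
FROM    dbo.OrdersVarchar AS o
WHERE   CONVERT(DATETIME, o.OrderDate) >= '20120101';

-- The datetime column compares directly, so an index on it can be used:
SELECT  o.OrderID
FROM    dbo.OrdersDatetime AS o
WHERE   o.OrderDate >= '20120101';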

Then, there are some very specific and common mistakes that people make in their T-SQL code. Just don’t do these and you’ll be much better off.

Using the wrong data types

This is surprisingly simple in concept, but seems to be incredibly difficult in practice. Here you go: use the data type that is in your database. Use it in your parameters and in your variables. I know that SQL Server can implicitly convert from one type to another, but when you get implicit conversions, or you have to put in explicit conversions, you're performing a function on your columns. When you perform a function on a column in any of the filtering scenarios (that's the WHERE clause or JOIN criteria), you're looking at generating scans. You may have a perfectly good index, but because you're doing a CAST on the column in order to compare it to a character type that you passed in instead of a date, that index won't get used.

Don’t believe me? Check this query:

SELECT  e.BusinessEntityID,
        e.NationalIDNumber
FROM    HumanResources.Employee AS e
WHERE   e.NationalIDNumber = 112457891;

Nice and simple. It should be covered by an index on the table. But here's the execution plan:

[Execution plan: an index scan, with a warning on the SELECT operator]

This query ran fast enough, and the table is small, so there were only 4 reads to scan the entire thing. Notice that little exclamation point on the SELECT operator, though. What do the properties tell us?

[SELECT operator properties, showing a type-conversion warning]

That's right. It's a warning (new in SQL Server 2012) that you have a "PlanAffectingConvert." In short, because I didn't pass in the correct data type, as this version of the query does:

SELECT  e.BusinessEntityID,
        e.NationalIDNumber
FROM    HumanResources.Employee AS e
WHERE   e.NationalIDNumber = '112457891';

I didn't get the plan that this corrected query produces:

[Execution plan: an index seek]

And there were only 2 reads this time instead of 4. And yes, I realize I only cut performance from a little to a very little, but what if that table had millions of rows instead of only a few? Ah, now I’m suddenly a hero.

Use the correct data types.

Using Functions in Comparisons within the ON or WHERE Clause

Speaking of functions: many, if not most, of the functions that you run against your columns in WHERE and ON clauses will prevent the proper use of indexes. You will see slower performance, since SQL Server has to scan the data in order to evaluate the function. For example, this:

SELECT  a.AddressLine1,
        a.AddressLine2,
        a.City,
        a.StateProvinceID
FROM    Person.Address AS a
WHERE   '4444' = LEFT(a.AddressLine1, 4) ;

That function, the LEFT operation on the column, results in this execution plan:

[Execution plan: a scan]

This took 316 reads to find the correct data over about 9ms (I have very fast disks), all because it must process each and every row to determine where the information matches the function. It can't simply seek into the existing data. But not all functions are created equal. You can do something like this:

SELECT  a.AddressLine1,
        a.AddressLine2,
        a.City,
        a.StateProvinceID
FROM    Person.Address AS a
WHERE   a.AddressLine1 LIKE '4444%' ;

Which results in a very different execution plan:

[Execution plan: an index seek]

This resulted in 3 reads and 0ms. Let's call it 1ms, just to be fair. That's a huge performance increase, all because I used a function that could work with search arguments. The old term for this, not used much any more, is "sargeable." Either don't use functions in comparisons, or use ones that are sargeable.
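Dates are the classic trap here. As a sketch (whether an index actually gets used depends on the indexes you have in place), wrapping the column in a function hides it from an index, while the equivalent range predicate leaves it exposed:

-- Non-sargeable: the function on the column forces a scan.
SELECT  soh.SalesOrderID
FROM    Sales.SalesOrderHeader AS soh
WHERE   YEAR(soh.OrderDate) = 2006;

-- Sargeable: the same logic as a range on the bare column.
SELECT  soh.SalesOrderID
FROM    Sales.SalesOrderHeader AS soh
WHERE   soh.OrderDate >= '20060101'
        AND soh.OrderDate < '20070101';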

Employing Multi-Statement User Defined Functions (UDFs)

These things are a trap. They appear, at first blush, to be a marvelous mechanism that's going to allow you to use T-SQL like a real programming language, not just a scripting language. You can build these functions and have them call each other, and your code will be very reusable, unlike those nasty old stored procedures. It'll be a glory… until you try to run your code with more than a few rows.

The problem here is that the multi-statement user defined function (UDF) is designed around the table variable. Table variables are very cool things for what they’re meant to do. They have one distinct difference from temporary tables, table variables have no statistics. This can be a time saving, even a life-saving difference… or… it can kill you. If you don’t have statistics, the optimizer assumes that any query run against the table variable, or UDF, will result in a single row. One (1) row. That’s great if you’re only returning a few rows. But once you start returning hundreds, or thousands, and you’ve decided to join one UDF to another… performance drops very, very quickly and very, very far.
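You can watch that one-row assumption happen. A minimal sketch (compare the estimated and actual row counts in the execution plan):

DECLARE @t TABLE (SalesOrderID INT PRIMARY KEY);

INSERT  INTO @t (SalesOrderID)
SELECT  soh.SalesOrderID
FROM    Sales.SalesOrderHeader AS soh;

-- With no statistics on @t, the optimizer estimates one row from it,
-- however many rows it actually holds, and picks its join strategy accordingly.
SELECT  t.SalesOrderID,
        sod.OrderQty
FROM    @t AS t
        JOIN Sales.SalesOrderDetail AS sod
        ON t.SalesOrderID = sod.SalesOrderID;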

The examples here are somewhat long. Here are a series of UDFs:

CREATE FUNCTION dbo.SalesInfo ()
RETURNS @return_variable TABLE
    (
     SalesOrderID INT,
     OrderDate DATETIME,
     SalesPersonID INT,
     PurchaseOrderNumber dbo.OrderNumber,
     AccountNumber dbo.AccountNumber,
     ShippingCity NVARCHAR(30)
    )
AS
    BEGIN;
        INSERT  INTO @return_variable
                (SalesOrderID,
                 OrderDate,
                 SalesPersonID,
                 PurchaseOrderNumber,
                 AccountNumber,
                 ShippingCity
                )
                SELECT  soh.SalesOrderID,
                        soh.OrderDate,
                        soh.SalesPersonID,
                        soh.PurchaseOrderNumber,
                        soh.AccountNumber,
                        a.City
                FROM    Sales.SalesOrderHeader AS soh
                        JOIN Person.Address AS a
                        ON soh.ShipToAddressID = a.AddressID ;
        RETURN ;
    END ;
GO

 

CREATE FUNCTION dbo.SalesDetails ()
RETURNS @return_variable TABLE
    (
     SalesOrderID INT,
     SalesOrderDetailID INT,
     OrderQty SMALLINT,
     UnitPrice MONEY
    )
AS
    BEGIN;
        INSERT  INTO @return_variable
                (SalesOrderID,
                 SalesOrderDetailID,
                 OrderQty,
                 UnitPrice
                )
                SELECT  sod.SalesOrderID,
                        sod.SalesOrderDetailID,
                        sod.OrderQty,
                        sod.UnitPrice
                FROM    Sales.SalesOrderDetail AS sod ;
        RETURN ;
    END ;
GO

 

 

CREATE FUNCTION dbo.CombinedSalesInfo ()
RETURNS @return_variable TABLE
    (
     SalesPersonID INT,
     ShippingCity NVARCHAR(30),
     OrderDate DATETIME,
     PurchaseOrderNumber dbo.OrderNumber,
     AccountNumber dbo.AccountNumber,
     OrderQty SMALLINT,
     UnitPrice MONEY
    )
AS
    BEGIN;
        INSERT  INTO @return_variable
                (SalesPersonID,
                 ShippingCity,
                 OrderDate,
                 PurchaseOrderNumber,
                 AccountNumber,
                 OrderQty,
                 UnitPrice
                )
                SELECT  si.SalesPersonID,
                        si.ShippingCity,
                        si.OrderDate,
                        si.PurchaseOrderNumber,
                        si.AccountNumber,
                        sd.OrderQty,
                        sd.UnitPrice
                FROM    dbo.SalesInfo() AS si
                        JOIN dbo.SalesDetails() AS sd
                        ON si.SalesOrderID = sd.SalesOrderID ;
        RETURN ;
    END ;
GO

This is a great structure. It makes putting together queries very simple. Here’s an example:

SELECT  csi.OrderDate,
        csi.PurchaseOrderNumber,
        csi.AccountNumber,
        csi.OrderQty,
        csi.UnitPrice
FROM    dbo.CombinedSalesInfo() AS csi
WHERE   csi.SalesPersonID = 277
        AND csi.ShippingCity = 'Odessa' ;

That is one very simple query. Here's the execution plan, also very simple:

[Execution plan: a scan of the table variable, with the UDF operator showing zero cost]

It only took 2170ms to run, returning 148 rows with 1456 reads. Note that the function is listed as having zero cost, and that it's just the scan of the table (a table variable) that is paid for in the query. Or is it? Here's a little query that will let me get a look at what's going on behind that zero-cost function operator. This pulls the plan for the function from the cache:

SELECT  deqp.query_plan,
        dest.text,
        SUBSTRING(dest.text, (deqs.statement_start_offset / 2) + 1,
            (deqs.statement_end_offset - deqs.statement_start_offset) / 2 + 1)
            AS actualstatement
FROM    sys.dm_exec_query_stats AS deqs
        CROSS APPLY sys.dm_exec_query_plan(deqs.plan_handle) AS deqp
        CROSS APPLY sys.dm_exec_sql_text(deqs.sql_handle) AS dest
WHERE   deqp.objectid = OBJECT_ID('dbo.CombinedSalesInfo');

And here is what is actually going on:

[Execution plan for the UDF, pulled from the cache]

Oops: it looks like even more of those little functions and scans that are almost, but not quite, free, plus a Hash Match operation, which writes out to tempdb and has real costs for the SQL engine. Let's look at one more of those UDF execution plans:

[Execution plan: a clustered index scan]

Whoops! Now we're getting a Clustered Index Scan, and quite a large number of rows being scanned. This is not pretty. And there is more than one of these operations occurring. Suddenly, the UDF looks a lot less attractive. What if we were to, oh, I don't know, just query the tables directly, like this:

SELECT  soh.OrderDate,
        soh.PurchaseOrderNumber,
        soh.AccountNumber,
        sod.OrderQty,
        sod.UnitPrice
FROM    Sales.SalesOrderHeader AS soh
        JOIN Sales.SalesOrderDetail AS sod
        ON soh.SalesOrderID = sod.SalesOrderID
        JOIN Person.Address AS ba
        ON soh.BillToAddressID = ba.AddressID
        JOIN Person.Address AS sa
        ON soh.ShipToAddressID = sa.AddressID
WHERE   soh.SalesPersonID = 277
        AND sa.City = 'Odessa' ;

Now, running this query returns exactly the same rows, but it does so in 310ms instead of 2170ms. We also end up with only about 911 reads instead of 1456. Clearly, it's not very hard to get into trouble with multi-statement table-valued user-defined functions.
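If you want the reuse without the one-row estimates, the usual escape hatch is an inline table-valued function, which the optimizer expands like a view, so real statistics remain available. Here's a sketch of dbo.SalesInfo rewritten inline (the function name is hypothetical):

CREATE FUNCTION dbo.SalesInfoInline ()
RETURNS TABLE
AS
RETURN
    (SELECT soh.SalesOrderID,
            soh.OrderDate,
            soh.SalesPersonID,
            soh.PurchaseOrderNumber,
            soh.AccountNumber,
            a.City AS ShippingCity
     FROM   Sales.SalesOrderHeader AS soh
            JOIN Person.Address AS a
            ON soh.ShipToAddressID = a.AddressID
    );
GO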

The “Run Faster” Switch: Allowing ‘Dirty Reads’

Back in the old days of computing, for some reason, our 286 machines came with a button on the front called the Turbo button. If you accidentally hit it, suddenly your machine was running extremely slowly, so you made sure that silly thing was always switched on so that you got maximum throughput. Well, a lot of people look on the READ_UNCOMMITTED isolation level and the NOLOCK query hint as the turbo button for SQL Server: make sure they're on every query and your system will run faster. This is because these commands ensure that reads take out no shared locks, and fewer locks mean faster processing. Easy. But…

When you run queries this way, you will get dirty reads. Everyone assumes this just means you might read 'cat' for a value that is really 'dog' while an update is in flight, which is true. But you'll also get extra rows, missing rows, and duplicated rows as the pages are rearranged underneath your query, with no locks in place to prevent it. I don't know about you, but most of the businesses I've worked for expect most of the queries on most of their systems to return consistent data: the same set of inputs against the same set of values gets the same results. Not with NOLOCK. To test this, I strongly recommend you read the blog post from Jason Strate on the subject. He illustrates the issue with perfect clarity.
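For the record, this is what the "turbo button" looks like, alongside the database-level alternative; a sketch, not a blanket recommendation (READ_COMMITTED_SNAPSHOT relies on row versioning in tempdb, so test before enabling it):

-- The 'turbo button': no shared locks taken, dirty reads allowed.
SELECT  p.[Name],
        p.Color
FROM    Production.Product AS p WITH (NOLOCK);

-- A safer alternative: readers get a consistent, versioned snapshot
-- without blocking writers.
ALTER DATABASE AdventureWorks2012 SET READ_COMMITTED_SNAPSHOT ON;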

Applying Query Hints indiscriminately

People are just a little too quick to pull the trigger on query hints. The most common situation I've seen is when a hint is used to fix one very distinct problem on one query. But when the database professionals involved see the improvement in speed, they immediately apply that hint… everywhere.

For example, a lot of people are under the impression that the LOOP JOIN is the best join operator, because it's the one most frequently seen in small, fast queries. So people will try to force it. It's not hard:

SELECT  s.[Name] AS StoreName,
        p.LastName + ', ' + p.FirstName
FROM    Sales.Store AS s
        JOIN Sales.SalesPerson AS sp
        ON s.SalesPersonID = sp.BusinessEntityID
        JOIN HumanResources.Employee AS e
        ON sp.BusinessEntityID = e.BusinessEntityID
        JOIN Person.Person AS p
        ON e.BusinessEntityID = p.BusinessEntityID
OPTION  (LOOP JOIN);

This query will run in about 101ms, with 4115 reads. Not bad performance, but if we drop the hint, the same query runs in about 90ms with only 2370 reads. As the system comes under more and more load, the unhinted version will show itself to be the much more efficient query.

Here’s another example. This one is slightly contrived, but it gets the point across. People often put an index on their table, expecting it to solve the problem. Here we have a query:

SELECT  *
FROM    Purchasing.PurchaseOrderHeader AS poh
WHERE   poh.PurchaseOrderID * 2 = 3400;

The problem here is that you're running a function against the column (multiplying it by two), not that any index that has been created is inadequate. Performance stinks because the query is doing a full scan of the clustered index. But when people see that the index they just created is not getting used, they do this:

SELECT  *
FROM    Purchasing.PurchaseOrderHeader AS poh WITH (INDEX (PK_PurchaseOrderHeader_PurchaseOrderID))
WHERE   poh.PurchaseOrderID * 2 = 3400;

Now they're getting an index scan instead of a clustered index scan, so the index is being "used" now, right? But performance got worse, going from 11 reads to 44 (both ran in close to 0ms; I have fast disks). "Used" is the right term, but not in the way that was meant. The solution, of course, is to restructure the query:

SELECT  *
FROM    Purchasing.PurchaseOrderHeader AS poh
WHERE   poh.PurchaseOrderID = 3400 / 2;

Now the number of reads drops to 2 because the index is being used in a seek, as it was meant to be.

Query hints need to be considered a last resort after all other choices have been eliminated.

Allowing ‘Row By Agonizing Row’ processing

The phrase "Row By Agonizing Row" is shortened to RBAR (pronounced "ree-bar"). It describes using cursors or WHILE loops instead of set-based operations, and it leads to extremely slow performance. The use of cursors comes from two sources: developers who, quite understandably, are used to row-by-row processing in their application code, and Oracle people, who think that cursors are a good thing (guys, SQL Server's efficient cursors are hidden inside the execution plan, in the join operators). Either way, cursors are quick performance death.

The example is pretty straightforward. We're going to update the color of products that match a certain criterion. Is this contrived? No, actually; it's based on real code that I had to tune:

BEGIN TRANSACTION

DECLARE @Name NVARCHAR(50) ,
    @Color NVARCHAR(15) ,
    @Weight DECIMAL(8, 2)

DECLARE BigUpdate CURSOR
FOR SELECT  p.[Name],
            p.Color,
            p.[Weight]
    FROM    Production.Product AS p ;

OPEN BigUpdate ;

FETCH NEXT FROM BigUpdate INTO @Name, @Color, @Weight ;

WHILE @@FETCH_STATUS = 0
    BEGIN
        IF @Weight < 3
            BEGIN
                UPDATE  Production.Product
                SET     Color = 'Blue'
                WHERE CURRENT OF BigUpdate ;
            END

        FETCH NEXT FROM BigUpdate INTO @Name, @Color, @Weight ;
    END

CLOSE BigUpdate ;
DEALLOCATE BigUpdate ;

SELECT  *
FROM    Production.Product AS p
WHERE   p.Color = 'Blue' ;

ROLLBACK TRANSACTION

Not only is there no real possibility of code reuse here, but we're doing 2 reads on every single iteration of the loop, across all the rows in the table, for a total in the hundreds. Execution time on my machine, with no other load in place, is well over a second. It's insane. The set-based operation is pretty simple:

BEGIN TRANSACTION

UPDATE  Production.Product
SET     Color = 'BLUE'
WHERE   [Weight] < 3 ;

ROLLBACK TRANSACTION

Here you get 15 reads and 1ms execution time. Don't laugh. People really do stuff like this, and worse. Cursors are something to avoid outside of the situations where you have no choice, such as maintenance routines that have to walk tables or databases.

Indulging in Nested Views

Views which call views that join to views which are calling other views… A view is nothing but a query, but because views appear to act like tables, people can come to think of them as tables. They're not. What happens when you combine a view with a view, and then nest views inside each other, is that you create an incredibly complex execution plan. The optimizer will attempt to simplify things: it will try to come up with plans that don't use every table referenced. But it will only attempt to clean up your plans so many times, and the more complex they get, the less likely it is that you'll get a cleaned-up plan. Then performance becomes extremely problematic and inconsistent.

Here are a series of fairly simple queries defined as views:

 

CREATE VIEW dbo.SalesInfoView
AS
    SELECT  soh.SalesOrderID,
            soh.OrderDate,
            soh.SalesPersonID,
            soh.PurchaseOrderNumber,
            soh.AccountNumber,
            a.City AS ShippingCity
    FROM    Sales.SalesOrderHeader AS soh
            JOIN Person.Address AS a
            ON soh.ShipToAddressID = a.AddressID ;
GO

CREATE VIEW dbo.SalesDetailsView
AS
    SELECT  sod.SalesOrderID,
            sod.SalesOrderDetailID,
            sod.OrderQty,
            sod.UnitPrice
    FROM    Sales.SalesOrderDetail AS sod ;
GO

CREATE VIEW dbo.CombinedSalesInfoView
AS
    SELECT  si.SalesPersonID,
            si.ShippingCity,
            si.OrderDate,
            si.PurchaseOrderNumber,
            si.AccountNumber,
            sd.OrderQty,
            sd.UnitPrice
    FROM    dbo.SalesInfoView AS si
            JOIN dbo.SalesDetailsView AS sd
            ON si.SalesOrderID = sd.SalesOrderID ;
GO
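Now run a simple query against the outermost view:

SELECT  soh.OrderDate
FROM    dbo.CombinedSalesInfoView AS soh
WHERE   soh.SalesPersonID = 277 ;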

This query runs in about 155ms and does 965 reads against two tables. Here's the execution plan:

[Execution plan]

Looks pretty benign really. You’re returning 7000 rows, so this is not completely out to lunch. But what if we just ran this query instead:

SELECT  soh.OrderDate
FROM    Sales.SalesOrderHeader AS soh
WHERE   soh.SalesPersonID = 277 ;

Now we're looking at 3ms and 685 reads. That's radically different. Here's the execution plan:

[Execution plan]

What you're seeing is the fact that, try though it might, the optimizer couldn't eliminate every unnecessary table from the view-based query as part of its simplification process. So that query carries two extra operations, the Index Scan and the Hash Match join that puts it all together. You'd be much better off avoiding that work by writing your queries directly against the tables rather than through nested views. And remember, this example is very simple; most examples I see of this are much more complex, resulting in much worse performance.


Author profile:

Grant Fritchey, SQL Server MVP, works for Red Gate Software as Product Evangelist. In his time as a DBA and developer, he has worked at three failed dot-coms, a major consulting company, a global bank and an international insurance and engineering company. Grant volunteers for the Professional Association of SQL Server Users (PASS). He is the author of the books SQL Server Execution Plans (Simple-Talk) and SQL Server 2008 Query Performance Tuning Distilled (Apress). He is one of the founding officers of the Southern New England SQL Server Users Group (SNESSUG) and its current president. He earned the nickname "The Scary DBA." He even has an official name plate, and displays it proudly.



Subject: For the App Devs
Posted by: SqlNightOwl (view profile)
Posted on: Tuesday, July 31, 2012 at 6:10 AM
Message: Posting this on our company wiki. Our devs complain that the database is slow -- from what I see in the plan cache, it's their code and not the database. I see this all over the place in the form of single-use plans. ORMs aren't evil; they're just used that way.

Subject: ORMs
Posted by: Grant Fritchey (view profile)
Posted on: Tuesday, July 31, 2012 at 10:59 AM
Message: ORM tools are absolutely fantastic. I think it's great what they can do. But it does seem to be a very common case for people to use them in horrifically inappropriate ways. Clearly more education is needed.

Subject: Cursors
Posted by: Alex Fekken (not signed in)
Posted on: Monday, August 06, 2012 at 1:25 AM
Message: Thanks Grant, useful tips that are often ignored.

But I have always wondered about why "Oracle people[, who] think that cursors are a good thing". Isn't it so that there is nothing inherently bad about cursors (except perhaps from an educational perspective when they are easily avoided) other than that the SQL Server implementation for them is so incredibly inefficient?

Subject: You call that a nested view?
Posted by: callcopse (view profile)
Posted on: Monday, August 06, 2012 at 2:23 AM
Message: I personally love ORMs. If used to eliminate the grind of programming, they are an effective tool for speeding up development. If used in "clever" ways, they tend to degrade performance.

The Nested Views bit made me laugh mirthlessly though. I've got some embedded 7 deep - that use random UNIONs with magic numbers and distinct clauses. I curse life and mankind every time I need to look at them (often after 3 days without even a tentative effort at editing). Performance is - as you would expect. Unfortunately with no clues as to the overall desired effect in terms of comments etc I have not yet dared rewrite as mildly sensible code.

Subject: Oracle Cursors
Posted by: Grant Fritchey (view profile)
Posted on: Monday, August 06, 2012 at 4:22 AM
Message: The reason there is so much confusion is that Oracle has a couple of different kinds of cursors. They do have the same one we have in SQL Server, but I don't think they like it any more than we like ours. They also have cursors that we have and use every day; we just don't call them cursors. Look up the definitions of the Loop Join, Hash Join and Merge Join. Those are cursors too. They're the ones that Oracle people are constantly looking for within SQL Server.

Subject: Nested Views
Posted by: Grant Fritchey (view profile)
Posted on: Monday, August 06, 2012 at 4:24 AM
Message: Yeah, that's not going to be fun. Nothing for it but to do the hard work & replace them.

Subject: clustered index,composite primary key, Common table expression
Posted by: waqas (not signed in)
Posted on: Monday, August 06, 2012 at 5:52 AM
Message: Very useful tips, Grant. Thanks.

What would be your opinion, in terms of performance, of a unique clustered index used on more than two columns to create a composite primary key?

And what would you say about common table expression based queries?


Subject: missing nested view "bad query"
Posted by: Anonymous (not signed in)
Posted on: Monday, August 06, 2012 at 9:29 AM
Message: Looks like you're missing the "bad query" for nested views. You define the views and have the execution plan, but don't state the query.

Great article nonetheless. Thanx!

Gary

Subject: Missing query
Posted by: Grant Fritchey (view profile)
Posted on: Tuesday, August 07, 2012 at 5:53 AM
Message: Sorry about that. It's just the same query as the one below it but against the view:
SELECT  soh.OrderDate
FROM    dbo.CombinedSalesInfoView AS soh
WHERE   soh.SalesPersonID = 277 ;

Subject: "fairly simple queries defined as views"
Posted by: G (not signed in)
Posted on: Tuesday, August 07, 2012 at 6:06 PM
Message: Re: nested views

Apart from missing the query, you are missing the data model. Table definitions, constraints, data types (as you seem to suggest in your first example) are all relevant, especially when re-writing queries.

For example, the fact that the Address table doesn't appear in your first execution plan suggests to me that SalesOrderHeader.ShipToAddressID is mandatory and there is a FK constraint on that table-column to Address.AddressID. Take any of these conditions away and these 2 queries are no longer equivalent:

SELECT soh.OrderDate
FROM Sales.SalesOrderHeader AS soh
WHERE soh.SalesPersonID = 277 ;

SELECT soh.OrderDate
FROM dbo.SalesInfoView AS soh
WHERE soh.SalesPersonID = 277 ;

As for the query you posted above:

SELECT soh.OrderDate
FROM dbo.CombinedSalesInfoView AS soh
WHERE soh.SalesPersonID = 277 ;

I'm not sure why you consider it equivalent to

SELECT soh.OrderDate
FROM Sales.SalesOrderHeader AS soh
WHERE soh.SalesPersonID = 277 ;

Not unless you know, and don't tell us, that for every SalesOrderHeader record there always is one and only one SalesOrderDetail record. I’m not sure most of your readers will find that 1-to-1 relationship "obvious".

Until one shows proof that 2 queries are equivalent, any talk of "better way" is a moot point.

PS: Not sure what you were trying to achieve with the "Oracle people, who think that cursors are a good thing" comment. It just doesn't make you look very knowledgeable.

Subject: using correct datatypes...
Posted by: Ian Stirk (not signed in)
Posted on: Monday, August 13, 2012 at 4:46 AM
Message: Hi,

The following article tackles the problem at its source: http://www.sqlservercentral.com/articles/Admin/65138/

The utility described in the article will allow you to quickly identify which columns have mismatched datatypes across tables in a database, correcting these will improve performance, integrity and maintainability.

Additionally, you can use the following utility to search queries/plans that have implicit conversions:
http://www.sqlservercentral.com/articles/Performance+Tuning/66729/

Thanks
Ian



Subject: to waqas
Posted by: Grant Fritchey (view profile)
Posted on: Monday, August 20, 2012 at 11:02 AM
Message: Common table expressions (CTEs) are an excellent tool for your queries. They're not magic (except with recursion); they're simply a different way of defining a derived table. Use them where appropriate and they're fine, though I have seen them overused. Like anything: using the hammer to drive nails = good; smashing my toes = bad.

Compound indexes are another thing that is just fine but can be abused. Want to index on, oh, I don't know, zip code and eye color? Great if it helps the queries, but I probably wouldn't make eye color the first column, since the leading column drives the index's statistics and determines how useful the index looks to the query optimizer. See what I mean?

Subject: G
Posted by: Grant Fritchey (view profile)
Posted on: Monday, August 20, 2012 at 11:04 AM
Message: The data model is Adventureworks2012. I don't know how much real-world business functionality I'd derive from that.

Subject: Address Table & the View Example
Posted by: Grant Fritchey (view profile)
Posted on: Monday, August 20, 2012 at 11:11 AM
Message: Hey G,

The reason that you don't see the Address table in the first example is that the optimizer goes through a process it calls simplification. That process can, and in this case did, eliminate tables that are not needed by the query. But, as was also shown, you can't rely on the simplification process as a perfect engine. It eliminated many, but not all, of the unnecessary tables. To satisfy the simple query run against the view, all that was needed was a single table, but the optimizer didn't pick that out.

Subject: Seven Sins
Posted by: Bryant (view profile)
Posted on: Wednesday, August 29, 2012 at 6:14 AM
Message: Thanks for the distillation. I've learned most of these and tried to teach many. It is amazing how much performance you can gain in eliminating function calls and explicit conversions from WHERE clauses.

I might suggest you also take up OR in WHERE clauses as number 8. I have seriously improved many performance issues centered on an OR-induced table scan by UNIONing the individual conditions together. It doesn't work every time but it works more often than not.

As for the nested views, I have inherited some code that has that issue, specifically because of the hierarchical relationship of the underlying tables and the use of views to control security, filter soft-deleted rows and establish row counts to be passed upward. I looked into an indexed view but the underlying views are not deterministic.

The upshot here is to plan ahead and not fall into the trap of committing yourself to using views for everything because they are also used for security purposes on data exposed to users for manipulation. You will end up with unwieldy code that is poorly performing or a long project to extract the essence into better code.

Subject: Oracle People and RBAR
Posted by: chuckh1958 (view profile)
Posted on: Wednesday, August 29, 2012 at 7:51 AM
Message: I'm a SQL/Oracle DBA. That code would be bad in Oracle too! It would run much better with "WHERE Weight < 3" in the cursor definition, but a single properly written UPDATE will perform much better on any RDBMS I've ever worked on (SQL Server, MySQL, and Oracle).

Regarding RBAR: you can make this example orders of magnitude worse by putting the row processing on the other end of a network connection. I've seen apps where millions of rows are processed this way, introducing millions of network round trips, with network latency accumulating on every single one. It's a performance nightmare, and usually the same work can be done in a single round trip with properly written DML.

Subject: RE: Or
Posted by: Grant Fritchey (view profile)
Posted on: Wednesday, September 05, 2012 at 9:40 AM
Message: Hi Bryant,

Excellent point. It's pretty cool how well you can improve performance by converting an OR into a UNION. Something like this (a sketch; the specific filter values are hypothetical):
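-- An OR across two different columns often forces a scan:
SELECT  a.AddressID
FROM    Person.Address AS a
WHERE   a.City = 'Odessa'
        OR a.PostalCode = '79765';

-- The same logic as a UNION lets each branch seek on its own index
-- (UNION, not UNION ALL, so the row set stays identical):
SELECT  a.AddressID
FROM    Person.Address AS a
WHERE   a.City = 'Odessa'
UNION
SELECT  a.AddressID
FROM    Person.Address AS a
WHERE   a.PostalCode = '79765';

I wish I'd added it in now. Thanks for the feedback.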

Subject: RE: Network REBAR
Posted by: Grant Fritchey (view profile)
Posted on: Wednesday, September 05, 2012 at 9:41 AM
Message: Oh yeah. I've seen that too. Scary stuff. Thanks for the feedback.

Subject: CONTEXT_INFO
Posted by: b83thrapp (view profile)
Posted on: Tuesday, February 12, 2013 at 3:30 PM
Message: To simplify my maintenance when code changes are required I have reverted to using views in concert with work tables (not temporary) and CONTEXT_INFO. Standard edition of SQL Server (2008 from 2005).

I populate the work tables with parameter values supplied to the stored procedure, then join the work tables with CONTEXT_INFO and finally to the desired tables. The views are then able to work with a minimal number of rows without slowing down too much.

This has simplified my coding significantly enough that I'm not concerned with the decrease in performance. Some of these views are nested 4 or more deep.

I have not been able to find much about CONTEXT_INFO and performance, so I'm not sure exactly how much of a hit I have introduced. So far it has worked well from my perspective, but I'm always looking out for an alternative.

Subject: Great article
Posted by: Anilkumar P (view profile)
Posted on: Tuesday, October 29, 2013 at 7:20 AM
Message: Until now, whatever work I did, I never thought about performance, but from now on I will.

 
