Click here to monitor SSC

Simple-Talk columnist

Listing common SQL Code Smells.

Published 22 November 2010 12:16 am

Once you’ve done a number of SQL Code-reviews, you’ll know those signs in the code that all might not be well. These ’Code Smells’ are coding styles that don’t directly cause a bug, but are indicators that all is not well with the code. . Kent Beck and Massimo Arnoldi seem to have coined the phrase in the “OnceAndOnlyOnce” page of www.C2.com, where Kent also said that code “wants to be simple”. Bad Smells in Code was an essay by Kent Beck and Martin Fowler, published as Chapter 3 of the book ‘Refactoring: Improving the Design of Existing Code’ (ISBN 978-0201485677) Although there are generic code-smells, SQL has its own particular coding habits that will alert the programmer to the need to re-factor what has been written.

See Exploring Smelly Code   and Code Deodorants for Code Smells by Nick Harrison for a grounding in Code Smells in C#

I’ve always been tempted by the idea of automating a preliminary code-review for SQL. It would be so useful to trawl through code and pick up the various problems, much like the classic ‘Lint’ did for C, and how the Code Metrics plug-in for .NET Reflector by Jonathan ‘Peli’ de Halleux is used for finding Code Smells in .NET code. The problem is that few of the standard procedural code smells are relevant to SQL, and we need an agreed list of code smells. Merrilll Aldrich made a grand start last year in his blog Top 10 T-SQL Code Smells.However, I’d like to make a start by discovering if there is a general opinion amongst Database developers what the most important SQL Smells are.

One can be a bit defensive about code smells. I will cheerfully write very long stored procedures, even though they are frowned on. I’ll use dynamic SQL occasionally. You can only use them as an aid for your own judgment and it is fine to ‘sign them off’ as being appropriate in particular circumstances. Also, whole classes of ‘code smells’ may be irrelevant for a particular database. The use of proprietary SQL, for example, is only a ‘code smell’ if there is a chance that the database will have to be ported to another RDBMS. The use of dynamic SQL is a risk only with certain security models. As the saying goes,  a CodeSmell is a hint of possible bad practice to a pragmatist, but a sure sign of bad practice to a purist.

Plamen Ratchev’s wonderful article Ten Common SQL Programming Mistakes lists some of these ‘code smells’ along with out-and-out mistakes, but there are more. The use of nested transactions, for example, isn’t entirely incorrect, even though the database engine ignores all but the outermost: but it does flag up the possibility that the programmer thinks that nested transactions are supported.

If anything requires some sort of general agreement, the definition of code smells is one. I’m therefore going to make this Blog ‘dynamic, in that, if anyone twitters a suggestion with a #SQLCodeSmells tag (or sends me a twitter) I’ll update the list here. If you add a comment to the blog with a suggestion of what should be added or removed, I’ll do my best to oblige. In other words, I’ll try to keep this blog up to date. The name against each ‘smell’ is the name of the person who Twittered me, commented about or who has written about the ‘smell’. it does not imply that they were the first ever to think of the smell!

  • Use of deprecated syntax such as *= – see SR0010: Avoid using deprecated syntax when you join tables or views  (Dave Howard)
  • Denormalisation that requires the shredding of the contents of columns. (Merrill Aldrich)
  • Contrived interfaces
  • Using VARCHAR and VARBINARY for datatypes that will be very small (size 1 or 2) and consistent – see SR0009: Avoid using types of variable length that are size 1 or 2
  • Use of deprecated datatypes such as TEXT/NTEXT (Dave Howard)
  • Datatype mis-matches in predicates that rely on implicit conversion. see SR0014: Data loss might occur when casting from {Type1} to {Type2} (Plamen Ratchev)
  • Using Correlated subqueries instead of a join   (Dave_Levy/ Plamen Ratchev)
  • The use of Hints in queries, especially NOLOCK (Dave Howard /Mike Reigler)
  • Few or No comments.
  • Use of functions in a WHERE clause. (Anil Das)
  • Overuse of scalar UDFs (Dave Howard, Plamen Ratchev)
  • Excessive ‘overloading’ of routines.
  • The use of Exec xp_cmdShell (Merrill Aldrich)
  • Excessive use of brackets. (Dave Levy)
  • Lack of the use of a semicolon to terminate statements
  • Use of non-SARGable functions on indexed columns in predicates (Plamen Ratchev)
  • Duplicated code, or strikingly similar code.
  • Misuse of SELECT *  -see SR0001: Avoid SELECT * in stored procedures, views and table-valued functions  (Plamen Ratchev)
  • Overuse of Cursors (Everyone. Special mention to Dave Levy & Adrian Hills)
  • Overuse of CLR routines when not necessary (Sam Stange)
  • Same column name in different tables with different datatypes. (Ian Stirk)
  • Use of ‘broken’ functions such as ‘ISNUMERIC’ without additional checks.
  • Excessive use of the WHILE loop (Merrill Aldrich)
  • INSERT … EXEC (Merrill Aldrich)
  • The use of stored procedures where a view is sufficient (Merrill Aldrich)
  • Not using two-part object names (Merrill Aldrich)
  • Using INSERT INTO without specifying the columns and their order (Merrill Aldrich)
  • Full outer joins even when they are not needed. (Plamen Ratchev)
  • Huge stored procedures (hundreds/thousands of lines).
  • Stored procedures that can produce different columns, or order of columns in their results, depending on the inputs.
  • Code that is never used.
  • Complex and nested conditionals
  • WHILE (not done) loops without an error exit.
  • Variable name same as the Datatype
  • Vague identifiers.
  • Storing complex data  or list in a character map, bitmap or XML field
  • User procedures with sp_ prefix (Aaron Bertrand)
  • Views that reference views that reference views that reference views (Aaron Bertrand)
  • Inappropriate use of sql_variant (Neil Hambly)
  • Errors with identity scope using SCOPE_IDENTITY @@IDENTITY or IDENT_CURRENT- see SR0008: Consider using SCOPE_IDENTITY instead of @@IDENTITY  (Neil Hambly, Aaron Bertrand)
  • Schemas that involve multiple dated copies of the same table instead of partitions (Matt Whitfield-Atlantis UK)
  • Scalar UDFs that do data lookups (poor man’s join) (Matt Whitfield-Atlantis UK)
  • Code that allows SQL Injection (Mladen Prajdic)
  • Tables without clustered indexes (Matt Whitfield-Atlantis UK)
  • Use of “SELECT DISTINCT” to mask a join problem (Nick Harrison)
  • Multiple stored procedures with nearly identical implementation. (Nick Harrison)
  • Excessive column aliasing may point to a problem or it could be a mapping implementation. (Nick Harrison)
  • Joining “too many” tables in a query. (Nick Harrison)
  • Stored procedure returning more than one record set. (Nick Harrison)
  • A NOT LIKE condition (Nick Harrison)
  • Not setting an output parameter for all code paths through a stored procedure, see SR0013: Output parameter (parameter) is not populated in all code paths
  • excessive “OR” conditions. (Nick Harrison)
  • User procedures with sp_ prefix (Aaron Bertrand)
  • Views that reference views that reference views that reference views (Aaron Bertrand)
  • sp_OACreate or anything related to it (Bill Fellows)
  • Prefixing names with tbl_, vw_, fn_, and usp_ (‘tibbling’) (Jeremiah Peschka)
  • Aliases that go a,b,c,d,e… (Dave Levy/Diane McNurlan)
  • Overweight Queries (e.g. 4 inner joins, 8 left joins, 4 derived tables, 10 subqueries, 8 clustered GUIDs, 2 UDFs, 6 case statements = 1 query) (Robert L Davis)
  • Order by 3,2 (Dave Levy)
  • MultiStatement Table functions which are then filtered ‘Sel * from Udf() where Udf.Col = Something’ (Dave Ballantyne)
  • running a SQL 2008 system in SQL 2000 compatibility mode(John Stafford)

51 Responses to “Listing common SQL Code Smells.”

  1. JasonRMerrill says:

    Use of reserved words as column names, parameters, variables, etc.

  2. beechgrovejoe says:

    Sounds like the same old clap trap that comes up every few years trying to make software development into a branch of engineering. While some of the items do make sense. Most are nothing more than some developer’s personal style that they want to force on other developers.

    Time and time again, if you measure developer productivity and correctness of implemention before and after rules like these are enforced, you find that nothing has changed.

    I would be VERY interested to see how each of these items is justified. Most of the time, the arguments are basically “Coders are stupid” so lets protect them from themselves.

    If you think these kinds of rules are merited, why don’t you get your coders the training they need?

  3. Celko says:

    I look for singular table names. This is a sign that they are not thinking in sets, but still want to do record-at-a-time processing. When you say “Personnel”, you have a collective abstraction in your head. When you say “Employees”, you are looking at the trees and not the forest. When you say “Employee”, you missed the whole point of set-oriented processing.

  4. BuggyFunBunny says:

    @beechgrovejoe:

    While the art vs. science argument has gone on forever, where those on the anti-engineering side assert that each coder is Picasso, we do generally call the practice Software Engineering. Putting some engineering into the process will help. There is a reason that, and rightly so, software is generally buggy (pardon the self reference); either we go find a million Picasso’s or we instil engineering practice. This is as good as any to start.

  5. beechgrovejoe says:

    @BuggyFunBunny

    Some coders are Picasso, other are not. But trying to make poor coders into Picasso with something as simple as a set of coding rules has never worked.

    A better approach has always to been to teach young coders why they need to do some things that may seem worthless by nonetheless lead to better quality code.

    The best way to get coders to write good code is to make them understand why good coding practices are in THEIR best interest. If they do it to make their lives easier you don’t need the rules. But if try to force it down their throat with mandates from above, all you do is alienate the coders.

    On many a project I have seen coders conform to the arbitrary rule set only to write poor, hard to maintain code. But it meets the standards. That’s not what we want.

  6. Phil Factor says:

    @BeechGroveJoe    I have great sympathy for your viewpoint, but we’re not talkiing about rules or ‘best practices’ here, but purely about indications that some code needs to be refactoring. In the rush to meet a deadline, I think I’ve done almost everything in the list, but then go back and heal it later.

    If I were to write a piece of code (and I’m tempted), that shows you any inconsistencies in the datatype assigned to columns or local variables with the same name, then I wouldn’t be showing you a rule that has been broken, or a Mosaic law broken, but an indication of places you need to check to make sure there isn’t a possible bugbear tucked away lying in wait.  It is just a whiff of smoke, but no flames.

  7. craig.wagner says:

    The problem with a long bulleted list is it doesn’t help anyone understand why these things are bad, what to do about them, or alternatives. Our DBA says we should use NOLOCK hints on our queries. You have a bullet point that says not to. He says it’s to prevent deadlock problems. You don’t tell me why they’re bad. Makes it tough to sell this to my DBA.

  8. nick harrison says:

    This blog post provides a little more background on why nolock is considered a smell http://pentonizer.com/sql-server/to-nolock-or-not/. Basically any hint indicates that something may not be right. In general when you use a hint, it means that the optimizer could not figure out the best way to run the query and you had help it out.

    If your query is so complex that the optimizer could not figure out how to run it correctly than it is too complex and it needs to be simplified.

    Also since you hints may override what the optimizer is doing on the assumption that you know better than the optimizer, you should rigorously test these queries with every upgrade, service pack, patch, etc just in case the next version comes up with a better way.

    There is also a chance that your “better way” may not be better as the database evolves. Remember the optimizer takes into account current statistics and can produce execution plans based on the current state of the database not just the snapshot at the point in time when you “optimized” the query.

  9. GilaMonster says:

    Craig: Nolock means to SQL Server ‘get me the data even if someone else is modifying it. I don’t mind if it’s slightly wrong’

    With nolock you are trading accuracy for speed. You won’t be blocked (and deadlocks are unlikely) but you could get incorrect results. That means data that is changed and then rolled back, so you see a change that was never committed. It also means potentially missing entire rows of data or reading portions of the table multiple times.

    I agree with Phil, if I’m reviewing database code and I see NoLock everywhere, it means that the people who wrote that don’t understand what it means (or it would not be everywhere) and they’re likely using it to hide the symptoms of poorly written code and missing indexes.

  10. beechgrovejoe says:

    The comments about nolock are a perfect example of what I am talking about. To make a blanket statement that nolock is bad or should looked at as somehow not optimal is glosssing over the realities of modern databases.

    Take for example a case where one process inserts records into a database that are never changed again. Separate processes that read those unchanging records NEVER need any locking and making a rule that nolock cannot be used simply places performance limits and overhead requirements on a database that does nothing but make the writer of the rule feel good.

    Teaching coders when nolock is appropriate and when it is not is MUCH more valuable than any assursions that nolock is to be frowned upon.

  11. gsuttie says:

    whats wrong with this ? -Prefixing names with tbl_, vw_, fn_, and usp_

    we use usp_ for sprcs and v_ vor views and t_ for tables and I find it a million times nicer to have t_user rather than just user.

  12. Jeremiah Peschka says:

    @gsuttie – it’s code funk. If you refactor a table to a view or a view to a table, you need to change a lot of code. Likewise if you mess around with the underlying type of any object. Ultimately, the way data is accessed or stored shouldn’t make any difference. Prefixing object names with their type weaves brittleness into the physical implementation of the system. It’s not necessarily a bad thing (almost nothing is inherently bad) but it’s a sign that I need to watch out for things.

    Would you be horrified if you found C# code with obj_ str_ or int_ littered throughout the code? I’ve written code like that before and I find it very tiring to maintain that code… I usually end up re-writing it as quickly as I can.

  13. beechgrovejoe says:

    Phil Factor,

    So you are suggesting that because the original development process was broken we need some automated tool to find out where the code was screwed up.

    I would suggest that a complete code reveiw is what is merited and attempting to use some set of “smells” to get out of the code review is just as bad as the lack of review in the original development.

    Coders trying to get out of code reviews is a bad thing no matter how they justify not doing it.

  14. BuggyFunBunny says:

    @beechgrovjoe:
    Teaching coders when nolock is appropriate and when it is not is MUCH more valuable than any assursions that nolock is to be frowned upon.

    But, but, but…. The use of nolock is application (even application/table) specific. There’s no *general rule* of appropriateness. It’s not a rule with engineering semantics, i.e. an analog to the third law of thermodynamics, or even established human rules, i.e. GAAP in accounting. It’s not even consistent across engines, i.e. in DB2/LUW, the engine will attempt to upgrade a UR row in a result set to exclusive in order to make an update. The use of nolock is dependent on the particular application accessing the database (not even the database per se), and the nature of concurrency control in that engine.

    As to nolock/UR in general, it fully depends on the frequency of rollback and the data being processed. If rollback isn’t used programmatically (modulo crash), then uncommitted reads are fine. Attempts to update said rows will fail, but the same happens under MVCC semantics. Locking late, in other words, is better in such a circumstance.

    In the OO world, particularly dynamic languages, there exists the mantra “fail early”; the notion being to avoid undoing/redoing work. The database world may, or may not, be equivalent. Locker databases, with Share lock semantics, fail early; MVCC databases, with Read Last Committed semantics, fail late. To the extent that non-Oracle databases are sliding toward MVCC-lite may or may not be a Good Thing. The code smells of locker databases differ in some measure from those of MVCC databases; the concurrency semantics are just different.

  15. JamesKn says:

    Would be useful to have a tool like FXcop for SQL developers i.e SQLCop . You could then use it like .NET devs do take a little bit of it or discard some of the rules when not correct for the situation. It would though help people think about standards but also promote better standards and make it a standard process for SQL development. Rather than after the fact tool. Fxcop also educates developers on why X rule is important so you can generally read up on if you want to enforce that rule or discard it. ;) Phil fancy asking Redgate if they want to build one?

  16. beechgrovejoe says:

    @BuggyFunBunny

    I agree. But then again, what you are saying only goes to bolster the argument that a list of good and bad coding pactices is counter productive.

    My concern is that relying on arbitrary lists of practices changes the code’rs focus from writing correct programs to following the arbitrary list. Many times this is just an exuse to not spend the time checking the things that are NOT on the list.

    When you are living in a high pressure environment with tight deadlines, things like coding practices mearly add an excuse to not perform the due diligence that is needed by a programming staff.

  17. maheshc says:

    Opening a writable cursor where a read-only cursor would do.

  18. Alex_Kuznetsov says:

    I agree with beechgrovejoe’s statement:
    “A better approach has always to been to teach young coders why they need to do some things that may seem worthless by nonetheless lead to better quality code.”

    Every item on this list needs a detailed explanation why it may indicate problems. Some items are not clear to me. What do the following statements actually mean:
    # Contrived interfaces
    # Excessive ‘overloading’ of routines.
    # Excessive use of brackets. (Dave Levy)

    Some items are probably debatable; to me they look more like personal opinions than general rules, for example:

    # Stored procedure returning more than one record set. (Nick Harrison)
    (What if we need time-consistent snapshot of data from several tables as of some time?)

    # The use of stored procedures where a view is sufficient (Merrill Aldrich)
    (a view may be sufficient today; the situation may change tomorrow. We may anticipate such a change.)

    # A NOT LIKE condition (Nick Harrison)
    (what’s wrong with that? Yes, we all know that it is not index-friendly.)

    I think there are situations when it is preferable to use each of these three approaches; they should not be considered smells as a blanket statement.

  19. nick harrison says:

    Phil makes the point in the blog and it is a valid point that Code Smells are guidelines: “You can only use them as an aid for your own judgment and it is fine to ‘sign them off’ as being appropriate in particular circumstances”.

    I try to make the point repeatedly in my article, that every smell does not dictate that there is a problem. It indicates that there may be a problem. If you have code that must include any given smell, you should also include a comment explaining the need for the smell.

    A code smell indicates that there is something strange there that warrants a closer look. If you gave it a closer look and you are still convinced that this is the way to go, add a comment explaining some of your thought process.

    The example of returning multiple result sets because you need time consistent snapshot of data is a very good example of why returning multiple result sets is valid and not problematic.

    When you find smelly code, investigate it. It may be perfectly valid, if it is add a comment and move on. If it is not valid, but you don’t have time to correct it, add a comment explaining how it should be changed and move on.

    In the spirit of smells, these should not be viewed as pedantic dogmatic rules. They are guidelines with many exceptions, but still always worth a closer look.

  20. beechgrovejoe says:

    @nick harrison

    The reason they are called standards is that when we call them guidelines NOBODY follows them. In fact if they are only guidelines why should we waist time looking for problems that don’t exist? If the code works it works. I have never seen an end user that thinks that a progam is less usable because the code is not pretty.

    Searching around for code that smells bad seems to me to be just trying to invent busy work for coders.

  21. Phil Factor says:

    @Alex K

    Having assembled a list from various sources, the next stage is to amplify on the explanation of what each ‘smell’ is,  and try to define it. I’d also like to put them into one of the voting systems and get an opinion as to which are good indicative signs and which aren’t. At the moment I’ve just taken all the suggestions and haven’t implied which have the most merit. I’d much rather get a consensus view.

    It may be that the best approach to understanding each ‘smell’  is to define the metrics (the way of measuring it)  Some of the classical ‘smells’ that I added in may prove to be irrelevant or too vague to measure.

    I have a feeling that we should use as broad a spectrum of ‘smells’ or indicators as possible. it would be quite wrong to flag up code as being suspect just based on a few minor ‘smells such as missing out semicolons, ‘tibbling’,  or overdoing brackets in expressions. This is why I’m spreading the net wide to try to catch a range of opinions as to what constitutes a ‘smell’. I’d argue that using oaCreate shows ingenuity and initiative, for example, but I know I’ll be outvoted!”

  22. BuggyFunBunny says:

    @beechgrovejoe:
    If the code works it works. I have never seen an end user that thinks that a progam is less usable because the code is not pretty.

    The purpose of smell tests is to reduce the volume of spaghetti mess code which is the bane of Software, and which leads to expensive, often slow, unwieldy applications. There’s a reason Dijkstra lambasted GOTO; it smelled bad. Just because some code “works” doesn’t mean it should continue to exist; they could have built the Golden Gate Bridge out of popsicle sticks (don’t know the East of the Pond name). It might have stood for a while, but it wouldn’t be good civil engineering practice to build it such. There’s the canard: “if they built skyscrapers the way software is built, a woodpecker could destroy London” (the original quote is from Weinberg). So, let’s find a way to put more engineering into Software Engineering, not less.

    What we call the identification of bad database design and execution is of less importance than the act of identification; emphasizing “code” over “design” (or DML over DDL, as CELKO continues to assert) is a smell in and of itself, since code attacks through procedure and the RM is declarative. Phil chose the canonical Code Smell; that’s close enough. Whether these are identified in a code review, or shop standards, or university curriculum is of lesser importance.

    As to users, well, yeah, they will notice if the application works, but poorly or if it works but can’t be modified except by expending a dog’s year of effort. This last is the base reason the PC and Lotus 1-2-3 stole so much of our candy; users took the opportunity to DIY and thus redefine the entirety of data as The Spreadsheet Paradigm. We all rue that day; or should anyway.

  23. beechgrovejoe says:

    @BuggyFunBunny,

    So why is spaghetti code a problem in code that is not going to change? And if it’s going to be changed, any problems can be addressed then. So why would you want to go looking for code to change when there is no plan to change it?

    Explain what is the BUSINESS need for code that looks pretty other than to jack up billable hours to the companies that are paying for these programmers. The old saying was “If it aint broke don’t fix it”. Searching for bad smelling code appears to be looking for code to fix that isn’t broken. Because if it’s broken we will focus on it anyway.

    Just because YOU think a program that is working well should be rewritten doesn’t mean there is a business argument to rewrite it. Furthermore you most likely where not on the project that made the code in the first place so you are not really in a position to claim they did something wrong. If you think they did something wrong, you are just showing that you do not have a full understanding of the circumstances and contraints the original project was under.

    Software development has always been a pay me now or pay me later proposition. If you want to cut corners in the initial development, you will pay for it later in maintenance.

    As far as user DIY; wonderful. The US is short nearly 1 million coders per year for projects that could be funded right now. So if regular people want to learn and do it themselves cool. That’s one more thing I don’t have to deal with.

  24. SQLGuyChuck says:

    2 more for you. This is awesome list that I have always wanted!
    Select into statements, which put a schema lock on db.
    Order by ordinal position which is deprecated in next release of Sql.

  25. adlan says:

    I take @beechgrovejoe’s point about blindly following supposed ‘Best Practice’, but I think we need to remember that ‘code smells’ are a tool, and like any other tool they can be misused. If you take the code smells and use them to identify potentially problematic code, then assess each and if it is a problem, educate the author of the code on why it is a bad idea to use a particular technique in a particular situation then it could be a useful exercise. Obviously if you blindly apply them without understanding why they are potential problems, you may end up with worse code, pissed off developers or both.

    I have another suggestion.
    use of subqueries as part of a select clause.

  26. pjones says:

    [quote]So why is spaghetti code a problem in code that is not going to change?[/Qoute]
    Beechgrove Joe has obviously never had the “query from hell” that eats up all the cpu on a server leading to a major slowdown and user complaints and is caused by smelly code – views referencing views referencing views was the one I found but many of the others listed above can cause the same effect in a production environment.

  27. Patrick Index says:

    Tables without primary keys (which should clearly have them).

  28. Patrick Index says:

    ..should be banned.

  29. Patrick Index says:

    I meant nested views… should be banned.

  30. beechgrovejoe says:

    @pjones

    After 30 years of code development experience I have seen just about everything the IT industry has to offer.

    You seem to be confusing pretty code with poorly performing code. If you need a smell checker to confirm what your user base has been complaining about, I would suggest the smell you are looking for is comming from the management that let that code get into production and kept those issues from making to the desk of someone who can take care of the problem.

    If you really want to eliminate the problem with views, ban them. I have yet to see a justification for any use of views. Unless of course you have a development staff that is too stupid to write properly functioning queries in the first place. In which case training will be much more effective.

  31. BuggyFunBunny says:

    @beechgrovejoe:
    You seem to be confusing pretty code with poorly performing code.

    I was going to opt out of further contributions to this thread, but this can’t go unremarked. The point of code smells is NOT *pretty code*. Not at all. If that’s what you’ve gotten from this, you’ve missed the point. The point is *bad code* that smells bad. Just because code compiles/runs does not make it good code. The point of smells is to identify syntax which is known to perform poorly, get rid of it, and replace with better performing syntax. Is that not obvious?

  32. beechgrovejoe says:

    @BuggyFunBunny

    What you don’t seem to get is that code that passes all of these smell tests ALSO can perform badly. Focusing on some arbitrary list of rules changes the emphasis from making code the performs it’s task correctly (what the user needs) to making code that coders can feel good about.

    If someone cannot take a look at poorly performing code and locate the problem, then they are not qualified to be doing the job. Relying, on code smell tests, is just a poor excuse for hiring incompetent people or not providing the training they should have gotten in the first place.

    There is a least a 20 year track record of things like smell tests being actually worthless to the real development task of writing correctly functioning code and reducting development costs. In fact time and time again it has been shown that code metrics have actually only served to raise development costs with no real increase in code quality.

  33. Felbane says:

    So what I gather from all of this:

    1) Code smells in general are used as part of a code review process to quickly identify potential problem areas, determine if there’s truly broken or poorly-performing code, and either fix it or comment/notate why the code is written that way.

    2) The list of smells really needs some elaboration with regard to what people should be looking for, specifically, and why that particular smell indicates a potential issue.

    BeechGroveJoe seems to think that this list is a set of rules by which you say “if X exists in your code, GET IT OUT, it’s wrong”… and I don’t think that’s what the intent is. Whether or not it’s explicitly named as such, coders use ‘smells’ to search out poorly-performing or broken code every day… does anyone really go through each line of code one-by-one? I think BuggyFunBunny has it half right: you identify syntax that is generally accepted to perform poorly, and either fix it *or* annotate the code with why it’s done that way.

    To the comment about “racking up billable hours”, I’ve seen firsthand where poorly-written code (written by otherwise fantastic developers in the middle of an 80-hour-week release crunch) has led to much more time wasted later tracking down the issue than it would take to run a “smell test” against the code. Applying the above ruleset would have been like hanging a giant neon “fix me” sign to the chunk of code that was causing the problem.

    My guess is that one or more of these “potential trouble” items is something that BGJ has utilized in the past (or continues to use) and was admonished for, and is now bitterly crusading against ‘standards’ and ‘smells’ with the devotion of a zealot. Re-read the article. It’s not a list of “things never to do”, it’s a list of “make sure these things are right for your situation”.

  34. beechgrovejoe says:

    @Felbane

    So if these kinds of code metrics are of any use, why are they not commonplace already? All the technology to do these kinds of things have been around for more than 20 years now. Years ago I used to write these kinds of tools for big projects with more than 200 coders. We used them for a few years, but they fell out of use because they simply where not very effective at helping the development effort.

    Sure they found all the things they where going after (basically a smells list), but that information is not all that useful. It just changes the focus of the development staff without really helping or hurting the effort.

    What your basically arguing is that a tool that say “hey look a nolock” is somehow different than a programmer reading “nolock” out of the query themselves. You seem to want to give your development staff some justification for glossing over issues that aren’t caught by your smell checker.

    The reality is that code smell tests have been tried in the past and have always failed. If you want to build one go right ahead. Your not going to get any different results. Why do think there is not one on the market right now?

  35. Felbane says:

    @beechgrovejoe

    Having an additional tool in the arsenal is never a bad thing. I’m having trouble nailing down why you’re so against this idea. Just because a particular tool didn’t work for you in the past doesn’t mean it doesn’t have value to anyone else.

    I can say with certainty that if we had had a t-sql ‘smell checker’ in the past, there would have been considerable time savings when tracking down several performance issues. From your original comment, you may argue that the bad code should have never been there in the first place, and that we should have never hired the insignificant pissant that wrote it, but that doesn’t change the fact that it was there and the smell test would have put a spotlight on it. Even rockstar developers such as yourself make mistakes.

    Fixing the problem before it goes production is infinitely better than fixing the problem after release no matter if it’s a smell test, a developer, or an omniscient robot from the future that catches it.

  36. BuggyFunBunny says:

    @beechgrovejoe:
    Why do think there is not one on the market right now?

    The simple answer: code smells are seen in the syntax (which is _correct_ in that the sql engine doesn’t barf), but the smell is a semantic error (potentially) in the context of some (part of an) application. Thus, to process a code smell, a human has to intervene.

    So far as building one, well all that need happen is an export of the DDL/DML from the catalog and from the application source to an elaborate awk script. No, I’m not volunteering to write the script.

    To take two smells from the list which are different in kind and at opposite ends of the spectrum:
    Use of “SELECT DISTINCT” to mask a join problem (Nick Harrison)
    Multiple stored procedures with nearly identical implementation. (Nick Harrison)

    The first smell could be identified by the awk script with ease, but second likely not, a canned tool/program would have to be a significant advance in AI technology; small market, too.

  37. beechgrovejoe says:

    @Felbane

    It’s not any perticular tool. I have evaluate literal hundreds of them. The reality is that it’s not mathmatically possible to make the tool you are looking for. Sure you can build one that deals with the partcular mistakes of a single coder on a specific project. But making the general tool that works for a small group of coders in even a small number of projects it simply not possible.

    The only people that advocate such a tool are generally people who have never tried to build one, or simply haven’t been in the industry long enough to understand the futility of such a tool.

    It really boils down to a variant on the halting problem or the P=NP debate. You simply cannot write a tool to locate recurive sets (which are what you really need to be looking for) because it’s simply proven impossible to do.

    This is not like some new smart coder is going to come along and invent a wonderful new program. Proven inpossible means it will NEVER be done. No matter how smart some coder believes they are.

    But that doesn’t stop the nieve from comming along every few years and deluding themselves into believing THEY can write the tool that nobody else can write. What they don’t understand is that they can’t because it’s impossible. Otherwise we would have had one LONG ago.

    It doesn’t matter how good you think you are in IT. You are not going to get around the contraints imposed be CS. It remindes me of the people that told us we would have a computer that work pass the Turring test in 20 years, back in 40′s and 50′s. I’m still waiting ………

  38. beechgrovejoe says:

    @BuggyFunBunny

    OK, there are people trying to sell these kinds of programs, however the vast majority end up in open source because they cannot generate enough revenue. The rest are sold to big companies where some idiot mandates them but they fall out of favor after a few years because the costs are high and boing for the buck is small.

    While it might be all well and good writting code to catch problems with DISTINCT and joins, it’s kind of pointless if nobody ever writes that kind of code. If one of your coders is doing that and you need a tool to find problems like this, I got to wonder why people didn’t know the coder was so bad. I would suggest they you have a major managment problem and until you fire that manager, no tool is ever going to address the problem.

    At least you realize that some the smells are impossible to find. For the ones that are possible to find, you are simply underestimating the complexity of the problem you want to undertake. You also don’t seem to appreciate the pissing contest you are going to create among your coders.

    The young dummys are doing it for the glory so they will use the tool to try and make themselves look more impressive. The older coders are doing it for the money and they don’t want to waste time doing things that doesn’t really make their lives any easier.

    You have to go a lot further than an assursion that a smell checker is useful before you even consider building one. No one seems to be providing any justification for the smell checker beyond “wouldn’t it be nice if we had one”. Sure it would be nice. But didn’t you think that people have wanted something like this for over 50 years now?

    The only people that advocate programs like a smell checker are just not all the familure with the history of the CS, IT, and AI fields. But if you want to build one, go ahead and try. I need a good laugh about now :-)

  39. merrillaldrich says:

    Phil – thanks for the mention; I am honored. The screed of comments here is unfortunate, but that’s teh internetz I guess. We can’t pick our comments.

    I would like to reiterate one point, though you made it and it has been said several times: these are not hard-and-fast rules. These are intended as a guide for less experienced devs to help them avoid design problems. That’s all. It’s not a panacea, it won’t cure all problems, and yes, beechgrovejoe, experienced people should know these things already. It’s not a substitute for good design or good code, and it will not automagically cure those things. Nobody claimed it would. That’s not the point.

  40. beechgrovejoe says:

    @merrillaldrich

    So the real purpose is to guide young coders. Good. But wouldn’t they be better servered with a simple document that describes each of the smells and provides good justification for each of these rules that are not rules?

    Having a tool that dings them every time they make a choice to do somthing not allowed by the smell test seems to be dinging them just to be dining them. It seems to be creating an adversarial process where they have to justify themselves because some bleeping tool threw up a red flag. Especially in cases where the program functions correctly and performs to acceptable levels.

    There is no better way to hurt developer productivity than to make them justify little insignificant details. You make them feel like they are not trusted and they DO resent it. And this after all is the crux of my position.

    Using a tool like this sends a clear message to coders that they are not adequate for the job. In addition unless you are going to provide training that includes justifications for your smells, they are going to consider your tool the smell.

  41. Celko says:

    >> hats wrong with this ? -Prefixing names with tbl_, vw_, fn_, and usp_ <<

    Besides violating the ISO-11179 rules for data element names, it shows a bad mindset. This tells me how something is currently implemented; I want to know what it means in the data model. Since we have only one data structure in SQL, the “tbl_” prefix is as redundant and confusing as putting “noun_” in front of very noun in the documentation.

    And I still read “vw_” as the attribute role of a Volkswagen.

    This convention goes back to the early versions of BASIC that had to have a $ on the names of strings for the interpreter. T-SQL with its @, @@, # and ## shows that it is a one-pass compiler that needs this same kind of help.

  42. Alex_Kuznetsov says:

    @beechgrovejoe “Using a tool like this sends a clear message to coders that they are not adequate for the job”? Really?

    C# and Visual Studio are tools that issue compiler warning when they detect smells. Many teams, including our, treat warnings and errors – if we have warnings, our code wont compile. All this early detection of problems makes us more productive, also we buy and use Resharper, which detects more smells for us.

    IMO this is a more productive approach than blaming the mirror – commonly accepted smells are usually just a surface reflection of underlying problems.

    Of course, some of the smells listed here are controversial or nor clearly defined and not commonly accepted.

  43. Alex_Kuznetsov says:

    @Phil I am very curious to see more detailed explanations of smells with examples showing when and why they indicate problems.

  44. beechgrovejoe says:

    @Alex_Kuznetsov

    Well Im sorry you work for such a poor company that threats you so badly.

    Treating warnings as error is just as misguided. They are WARNINGS, not something that stops the program from functioning. And yes we know your opinion. So please tell me, how many end users have complained that the program works fine but they don’t like the smells in the code. The only reason to be looking for smells is that you want to jack up billable hours because you can get some idiot that doesn’t know any better to pay for it.

    The reason they are warning and not errors is that they are NOT commonly accepted as problems. For example, warnings about unused variables or dead code. You can have tons of it and the program still functions properly. Only if you are living in a world where your poor compiler’s optimizer cannot remove dead code and unused variables is it even justifiable to worry about this. We ain’t using 1980′s C compilers anymore. Maybe you should get Team Edition, turn on all the meaningless code metrics and take even more time developing your apps.

    Maybe you should find the inexperienced ass that wants you to write pretty code, fire him, stop waisting time looking for problems that don’t exist, and get better developer productivity. Or are you just developing code to glorify yourself and the to hell with the end users?

  45. Sean Redmond says:

    Hi everybody,
    I have a few questions on the smells above.

    • The use of stored procedures where a view is sufficient (Merrill Aldrich)
    One needs to define ‘sufficient’. Is the use of SPs rather than views when an application accesses the DB sufficient?

    • Not using two-part object names (Merrill Aldrich)
    We have several DBs that need to be separated for legal reasons. I use linked servers to access data from one or the other, when data from one or the other is needed. I could use replication, I suppose, but then I would have to check if the overhead imposed with transactional replication is worthwhile.

    • Aliases that go a,b,c,d,e… (Dave Levy/Diane McNurlan)
    This, surely, is a matter of style. e.xxx is a field in the fourth joined table. Some may prefer ih as an alias to ‘InventoryHeader’, others may always use ‘i’ and others ‘invh’. I look at the SQL instead of assuming that invh.FieldName is in the InventoryHeader table.

    • A NOT LIKE condition (Nick Harrison)
    What should one do when one wishes to find all items in a column except those with a specified value somewhere within the text? For example, give me a list of all given girls’ given-names and omit those which have the name ‘Ann’ in them (e.g. Ann, Anne, Pollyann, Roseanne, etc.).

    • Lack of the use of a semicolon to terminate statements
    Would someone explain to me why this is bad please? I’m used to semi-colons in Pascal, Javascript and the like, but not in SQL.

  46. nick harrison says:

    I have started postings to help shed some light the problems with some of these smells.

    I have started at the top and working my down, I have shared my thoughts on the first four smells. Over time, my plan is to flush out details for all of these smells.

  47. gsuttie says:

    Couldnt agrre with thiis Prefixing names with tbl_, vw_, fn_, and usp_

    We use t_ for tables usp_for stored procs, v_ for views.
    Say we have a user table:-

    Table:- t_User
    Sprocs:- usp_t_user_select, usp_t_user_insert, usp_t_user_delete, usp_t_user_update

    Other databases I see have:-

    Table:- Customers
    Sprocs:- FindCustomers or GetCustomers

    In any kind of code I can quickly see/eyeball that usp_t_user_select is a stored procedure which selects from the table t_user, whereas the other db’s I see you cant quickly tell what kind of object they are because of the naming conventions used.

    I didnt think the naming convention using t_ and usp_ was nice but now I would argue its far more useful – espeically when scruipting objects and you have a list of scripts by object name – I can tell whats a sproc or a table from windows explorer, but I couldnt from the other examples you see like Northwind.

    If your calling that a code smell then I would disagree entirely.

  48. Jeff Moden says:

    BWAA_HAAA!!! Celko never misses a chance to rant about pluralized names. Singular table names are not any kind of indicator or “code smell” as to whether or not a Developer is a good set-based programmer than being bald qualifies you as a person with Celko’s knowledge about “Nested Sets”.

    Shifting gears… Although I don’t agree with a lot of the listed “code smells” in this fine article, I have to agree that people who are worried only with completion of code or supposed “Developer Productivity” may be missing the boat on a real opportunity to shine. “Developer Productivity” may actually slow down at first when such “code smells” are converted to enforced “rules”. That happens anytime when there’s a change in process but, once properly embraced, will actually improve “Developer Productivity” because it’s actually a back-handed form of training. What people don’t see when many of these “code smells” are enforced as “rules” is that the amount of rework plummets, code reviews go a whole lot faster, and the all important “customer” has a whole lot less complaints about accuracy and performance.

    That brings up another of my favorite subjects… Is “Developer Productivity” being measured correctly? Usually… NOT. “Code Complete” shouldn’t be the only measure of productivity but frequently is. There are other things I measure Developers by… Does the code pass peer and DBA review? Does it pass Unit, Integration, and Production tests? How much code comes back as a performance problem in the face of scalability? How much does it clog the “pipe”? How long does it take someone else to figure out their code when rework IS required which is also asking the question, is the code properly documented?

    The “code smells” themselves aren’t crap… they’re called “smells” because they’re very real and strong indicators of crap code that will crash and burn in the face of any real scalability… like at “month end” which is right when the “customer” needs good, strong, scalable code the most. How many times have we seen someone write a cursor to insert rows from one table to another? That’s crap code and I agree… it’s a matter of training.

    Training? Everybody wants training on someone else’s Nickel and on someone else’s time. ;-) Done correctly, I suppose it can be very worthwhile but I doubt that any Microsoft or similar training includes subjects such as how to use a Numbers or Tally table or how to use a CROSS APPLY to build your own unpivot that will blow the UNPIVOT clause out of the water. I’ve also found that many people who have had some decent training fall right back into their old crap-code habits simply because they don’t want to take the time to practice what they’ve learned. A good book and some time with the computer is a great way to learn. Further, for those companies that have a good DBA or two, the best form of training is frequently overlooked… the mentorship and knowledge of the DBA and the “code smells” that (s)he converted to some rigid “rules”. ;-)

  49. Ozzie says:

    #SQLCodeSmells
    Having a UNIQUEIDENTIFIER column as the CLUSTERED PRIMARY KEY of a table.

  50. kgayda says:

    A lot of the so-called smells indicated in this column aren’t necessarily smells at all. Context is everything. For example, it is perfectly acceptable to use NOLOCK hints when querying lookup tables that almost never change. You don’t even want to wait for read locks on these types of tables.

  51. [...] uses of SELECT *.  The obvious question leading on from this is, “Great, what about other code smells ?”  Well, using the language service parser to do that was turning out to be a bit of a hard [...]

Leave a Reply