SQL Code Reuse: teaching a dog new tricksPublished 1 September 2011 5:42 pm
Developers, by every natural instinct and training, strive to make their code reusable and generic. Dissuading them from doing so, in certain circumstances, is akin to trying to persuade a dog not to fetch a stick. However, when Gail Shaw commented on Twitter last week that "For the record, code reuse in SQL.is not always a good thing", it was more than a causal remark; it was borne out of bitter experience.
The problem is that, in the absence of their usual armory O-O techniques such as encapsulation and inheritance, the price of making database code easier to maintain, by such obvious methods, can be high. The "generic" views, stored procedures and functions that result, may seem elegant and reusable, but can destroy performance, because it is tough for the query optimizer to produce an efficient execution plan. It hurts to make SQL code generic.
At some point, nearly every SQL Programmer gets infected with the feverish idea of passing table names to stored procedures. "Hey, why write scores of procedures to do this process on each table when I can write a generic, reusable procedure that does it on any table!" Bad idea; behind every stored procedure is an execution plan and a stored procedure designed to work with "any table" will result in a generic execution plan that will perform very poorly for a majority of tables. It is far better if they are tailored for specific tables and specific needs.
Another typical example is where the logic for some seemingly-complex calculation has been "abstracted" into a monstrous, but reusable, view, which performs tortuous aggregations and multiple joins, executes appalling slowly, acquires numerous long-held locks and causes severe blocking in the database. Often, such twisted logic can be replaced by simple, easily optimized SQL statements. Granted, it isn’t "reusable" and flaunts the ‘DRY’ (Don’t repeat yourself) principle, but it is relatively easy to write and will often perform orders of magnitude faster.
User-defined Functions (UDFs) are another favorite mechanism for promoting code reuse, and are often even more problematic. In-line logic is always much faster, even if to the sensitive developer it has the look of hippos doing line-dancing. Memories of the overuse of UDFs can make any seasoned DBA flinch. If you ever bump into Grant Fritchey at a community event, buy him a beer and ask him about the case of the application with multi-statement UDFs that called other multi-statement UDFs in an attempt at enforcing inheritance in a database. Also ask him how well it scaled beyond a single-user and a single row.
Should SQL Server simply get better at adopting and supporting such basic and universally-accepted programming practices as putting logic in a function? Probably, yes, but in the meantime, we must measure any code reuse in the database against the likely performance penalty.
Perhaps the most effective form of code reuse is via constraints, though it requires lateral thinking to extend this beyond simple data rules. Functions can be used, but extra care and effort is required to write them as inline functions; in-line code or calculated columns will always outperform UDFs. Stored procedure use is to be actively encourage; just don’t try to make them generic.
On Simple-Talk we’ve published a lot about execution plans, query optimization and performance. We believe that, once a developer is aware of the process, they are better able to judge that fine balancing point in the compromise between performance and maintainability. Even better, we hope we’ve also given a glimpse of an alternative path to those goals, by means of intelligent database design. A neat trick, if you can do it.