Click here to monitor SSC

Tony Davis is an Editor with Redgate Software, based in Cambridge (UK), specializing in databases, and especially SQL Server. He edits articles and writes editorials for both the and websites and newsletters, with a combined audience of over 1.5 million subscribers. You can sample his short-form writing at either his blog or his author page. As the editor behind most of the SQL Server books published by Red Gate, he spends much of his time helping others express what they know about SQL Server. He is also the lead author of the book, SQL Server Transaction Log Management. In his spare time, he enjoys running, football, contemporary fiction and real ale.

Taming the SQL Beast

Published 15 August 2013 3:37 pm

The recent articles 10 Common Mistakes Java Developers make when Writing SQL and the follow-up, Ten More Mistakes… highlight some common crimes against SQL and offer sound advice. These mistakes aren’t restricted to Java programmers or Oracle. .NET programmers make them with SQL Server (see, for example, Plamen Ratchev’s Ten Common SQL Programming Mistakes from 2009). Why are they so persistent?

Some of this can be blamed on the classic OO – procedural divide. When intelligent developers find that standard and accepted techniques from their main language perform poorly in a database, then both SQL and the database itself get a reputation for being antiquated, quirky (functions don’t work properly in SQL Server, joins are slow, and so on). A few of these gripes are justified, but most are pure myth.

Perhaps it’s fear. Even for well-informed and experienced developers, a relational database can appear to be an unsettling creature of random savagery that is best kept at arm’s length, or locked, growling, behind an ORM. Development isn’t easy at any time. Phil Factor, in a recent editorial, used the analogy of lion taming for describing the developer’s craft.

Should database professionals shoulder some of the blame? Grant Fritchey, in an article featured in this newsletter, suggests that DBAs and database professionals, by being too eager to control and dominate on database matters, are ‘doing it wrong’ in the service they provide and aren’t effective in assisting development teams to gain insights into the best ways of using relational databases.

These insights can be sudden and far-reaching. In his chapter for our forthcoming Tribal SQL book (due in time for PASS, watch this space!), Mark Rasmussen recounts his early misadventures with SQL Server. He explains how and why he decided to split his statistical data across hundreds of databases to help SQL Server “cope with the data load”. He describes his moment of clarity, sitting in a SQL conference session, finally understanding how b-trees worked, and realizing that he’d been trying to replicate the very idea behind a clustered index, just in a horribly inefficient way. Mark is now a developer who knows more about how SQL Server works internally than most DBAs, and speaks regularly on the topic.

Maybe the way out of the repetitive cycle of database mistakes is to find better ways of helping developers to understand the basic nature of how SQL and the database storage structures really work? If you’ve experienced similar “moments of clarity” with regard to how to work effectively with SQL and SQL Server, please share them.

9 Responses to “Taming the SQL Beast”

  1. Keith Rowley says:

    There is a myth in developing software that full specialization and complete division of responsibility is the best way to develop software. Unfortunately in the real world this simply isn’t true. A really good developer needs a basic understanding of a lot of things beyond just how to write good code. They need to understand the operating system the code will run on, at least the fundamentals of the database it will use, and a half dozen other things besides.

  2. Robert Young says:

    – Why are they so persistent?

    In a nutshell: application development started as code silos, and coders have been able to maintain that fiction, despite Dr. Codd’s discoveries. The reduction in cost and increase in efficiency that Organic Normal Form brings to application development is, according to a Chinese meme (which may be concocted by Westerners, for all I know), “breaking the rice bowl”.

    Managements of application development come either from coders or MBAs, neither of which has much math chops. “We’ve always built applications with code on the client, going back to COBOL days, and we see no reason to change. Besides, if I replace 100 coders with 3 database developers, my status in the org goes in the dust bin, and I’m likely to find myself redundant. Better keep on hiring more coders; only with du jour languages.”

    If the only tool you have, everything tends to look like a nail. Coders accuse database folk of that in a knee jerk fashion. They protest a bit too much, since they’ve been the obstructionists for 3 decades (at least).

    Finally, at the individual coder level (and I’ve heard these words with my own three ears), “SQL screwed up my class, because it’s too hard. I’ll just dump all the data and process it in java.”

    Looked at semantically, the RM (and some real world RDBMS) is really OO: the data and its functions happily living in one cottage. Coders tend not to see it that way, largely because they don’t see DRI as ‘function’, but it is. And T/SQL, et al aren’t real languages, by their lights.

    Back before we had multi-processor/core/SSD/X00 gig Xeon machines, basically when machines had DASD as somewhat faster tape that didn’t need a rewind, then the RBAR approach made sense due to the physical constraints. We’ve made it much farther down the Yellow Brick Road, and the silo coders need a spell behind the woodshed.

    It’s not us; it’s them. Stop licking their boots.

  3. jkeefe says:

    Perhaps as DBA’s we need to be a bit more realistic (and dare I say honest?) about the core problems faced by developers in “our domain”. Much of the fear a trepidation felt by the developer who must deal with our carefully walled-off databases comes from ourselves. We nurture their fear and cast aspersions on their competence with our pontifications regarding the “appropriate” or “correct” or “acceptable” means of accessing and our holy-grail data. We look down with loathing on any who would dare to question our divine right to claim lordship over the data.

    Perhaps it’s time we take a “caretaker” approach to the data in oyur carefully crafted repositories? Perhaps, with some training for ourselves, we can learn to train others in the art of data access and manipulation. Perhaps, and this is probably a really tall order, we can rise above our own self interestes and help all inolved in the enterprise to accomplish beyond their invividual abilities and work as a team.

    Then again, we might just continue down the road we have laid for ourselves being careful to maintain the high walls and cleverly placed traps to catch the unwary. Even though we do so at our opwn peril.

  4. Robert Young says:

    Perhaps we should all re-read Codd’s 12/13 rules and ask the simple question: how can those rules be enforced by arbitrary client code? And, if the rules are unenforced, is the resulting data lump an RDBMS in any meaningful sense of the word?

  5. Sergio E. says:

    My last epifany was a couple of years ago, while I was working on a .Net website, I realize the critical importance of use sql parameters in querys and the benefits it has to the security.

  6. paschott says:

    Way back when I was just starting out, I did everything one row at a time. I slowly figured out sets, but even for larger items, I still thought in single rows. That meant that I used a _lot_ of cursors. As I grew, I learned the power of working in sets and moved on. As noted earlier, when you have a hammer, everything looks like a nail.

    Today – still growing and trying out new technologies. I’ve also learned that working with the developer team to come up with the most efficient combination of code and database is far better in the long run. Help them understand why doing something one row at a time is bad in some scenarios. Learn why they use a specific ORM and figure out what’s good/bad about that. I realize not everything needs to be a stored proc and they realize that having a well-designed database will perform better in the long run than whatever they let the ORM build for them. :)

    It doesn’t hurt to explain the reasons why I make the decisions I do around the data model and I’m willing to work with them to come up with a solution that works for everyone involved. That also leads to better teamwork and a better relationship between the DB and Dev sides.

  7. willliebago says:

    A developer asked me this week how to optimize a query between 3 tables where table A was a parent to 2 tables, B and C. He asked if it was better to join A to B to C or to join A to B and A to C. I then explained that the 2 queries may not even bring back the same results and that he should follow the primary key and foreign key relationships to create joins. I think he had a moment of clarity :)

  8. DThrasher says:

    I can think of two reasons why ORM mistakes get repeated again and again.

    The first is that relational technology has largely been settled in most computer scientist’s minds, which means it doesn’t get much attention on colleges and universities, while OO still has some interesting academic problems left. As a result, most schools neglect this foundational technology.

    Second, after leaving school, most programmers pick up their skills and experience from their coworkers. In most businesses, the database team and the programming team are in separate camps, hindering knowledge transfer between the two groups.

    So, with gaps in both formal and informal knowledge sharing, it’s no wonder these problems crop up again and again. Especially since the object-relational impedance mismatch remains one of the trickiest areas to manage in your system architecture.

  9. DThrasher says:

    I just received the $50 Amazon gift card for this week’s comment. Much appreciated, guys!

    Here’s a follow-up question for a later column, Tony: What can be done to increase the visibility of relational technology on school campuses? How can we help our coworkers grok the differences between OO / procedural programming techniques and set-based approaches?

Leave a Reply

Blog archive