Tony Davis

Simple-Talk Editor
News, views and good brews

TSQL and the Tower of Babel

Published Tuesday, January 08, 2008 2:11 PM

It is always a bit of a strain to program in several languages at once. A simple task like writing a database-driven website will involve you in at least three, maybe more. The experience of diving into JavaScript, VB.NET, and TSQL, all at once, is one that most human brains aren't, in evolutionary terms, designed for.

 

Every year or so, someone gets the brainwave of developing a universal language which gets around the awfulness of having to learn, and use, so many. Either these people do not know the history of IT (PL/1 is a good starting point), or they think that 'this time it will be different'. It never is.

 

There are alternative approaches to this. By far the worst is to make languages 'converge' so that, for example, a comment block, Switch statement, or 'for loop' is the same in all languages. As languages get more similar, the brain gets more confused. All those little clues that remind the dozing consciousness what language he's in are taken away, and one starts making mistakes. VB and C# are converging noticeably. There is even a joke going around VB programmers that the way to convert code from C# is to comment out the semicolons. It is getting to be almost true.

 

An alternative strategy is to make languages diverge strongly enough that it is obvious, and that one can mentally switch from one to another without conscious effort. I never had a moments problem, in times past, in switching from C to assembler, to SQL. I'm happy with switching between JavaScript and TSQL. On the other hand, switching between JScript, JavaScript and Java can make the head spin.

 

SQL Server is now its own 'Tower of Babel'. It is noticeable that TSQL has been added to by Microsoft programmers with a variety of sympathies. The original Sybase variety had a reasonable consistency, but this has been compromised by interference from the procedural and object-oriented cultures that followed. Additions to SQL Server all seem to reflect the prevailing technological culture of the time. Now, when we are faced with XPath and SQL in the same environment, it is all starting to look a bit messy.

 

There probably is a process, or decision process, within Microsoft for ensuring the consistency of the syntax of TSQL, as it develops in SQL Server, but wouldn't it be a wonderful thing to make it into an open standard so we can all use a restraining hand when the Microsoft Hotshots start dreaming up more extensions to the language in a variety of syntax conventions?

What are your views on how TSQL (and SQL Server in general) has evolved over recent years? Has it become bloated and confusing, or simply more powerful (or both)? And what's the solution? Just add your comments to my blog, and all entries will go into a prize draw to win a Simple-Talk gift bag!

Cheers,

Tony.

Comments

 

GSquared said:

I've always found it annoying that the string functions in T-SQL were obviously built by separate developers, and they didn't coordinate with each other.  For example, replace takes the parent string first, then the substring to replace, and what to replace it with; charindex takes the substring first, the parent string second, and where to start looking last.  It took me a long time to memorize which one had which sequence.

Then there are CTEs starting with "with", which is the same term used to add hints to tables, etc., so CTEs have to have a semicolon before them if they are in the middle of a script or proc.  Why not use the abbreviation "CTE" instead of the word "WITH", and eliminate the extra contortions (plus, it would be easier to understand).

So, in terms of ensuring consistency in syntax, yeah, T-SQL could do with some work.

On the other hand, I like that new features and functions are being added.  I like CTEs more than loops for things like hierarchies in SQL 2005.  There are some Katmai features I will like, like DATE and TIME data types, and (I don't remember what they're called) the feature where you can list sets of values ("select (x, y), (a,b), (c,d)" instead of "select x,y union all a,b union all c,d".

I guess I like new features, new capabilities, but I wish they'd keep a bit more consistency on syntax, naming and how things work.
January 8, 2008 3:12 PM
 

datadump said:

One of the areas really overdue for reform is the naming convention for stored procedures, or, more properly, the lack thereof. The system stored procedures use a completely random variety of wordssmushedtogether and words_using_underscores or even mixtures like sp_purge_jobhistory. Is it sp_helpobject or sp_help_object? Who can say. Worse there are the occasional ones like sp_bindefault where a "duplicate" letter at the join has been elided. Then there's the one common (almost) convention which is the universally denouced bad practice of putting the verb before the object - as in GetObject DelObject rather than ObjectGet ObjectDel. At least all the proc names are lower case... except for the ones that aren't (wait till you meet the joys of the case sensitive sql instance!). Then there are the parameters. Is it @db_id or @dbid? @job_name, @jobname or just @name?

But how can we reform this mess without breaking backwards compatibility? I'd say in an analogous way to how 2005 solved the 2000 system tables issue with backwards compatibility views. What we need is an object that is to a stored procedure or function what a view is to a table. With such an alias object we can rename all the stored procedures and their parameters to a consistent and effective naming convention. Even better we can enhance the security model by using these functional aliases to control which parameters users can set and which are pre-set (or calculated) in the object. You know it makes sense.
January 9, 2008 4:37 AM
 

Jack the DBA said:

Certainly SQL Server has become a more powerful platform, but I think at the expense of being able to be a expert in SQL Server.  Now you are a Data Engine expert or a report Server expert, etc...  Certainly there are ways to make it better without making it more complex.   The new data types coming in Katmai are nice and the performance DMV's in 2005 ar eeven better, but Ser vice Broker, Notification Services and the CLR have made it much more complex.
January 9, 2008 6:58 AM
 

Judes said:

I don't like having to reinvent the wheel, when Microsoft makes changes and it causes some of my automated DBA tasks to no longer be valid the consequence is more work.  Now on the other hand I like added functionality and moving forward.
Legacy applications tend to be the most cumbersome when the changes are in effect, so much so that moving forward can be quite delayed.
So, my  answer is sort of both, more confusing in the arena of SSIS versus DTS, because it is very different, and yet more powerful.
January 9, 2008 7:42 AM
 

SkyBeaver said:

Tony, your comments about the evolution of programming languages are right on the mark.  The C# language extensions required to support LINQ, in particular,  have really increased the surface area of the language, to the point where it now seems cluttered and over-engineered.

I've worked with SQL Server since the early Sybase days, and in fact spent several happy years working for Sybase Consulting.  On the whole, I feel that Microsoft has made the product easier to use and more accessible, although I agree with a prior post that it often appears that different engineering teams have each added their part to the platform, thus giving it an inconsistent feel.

The original Sybase Transact-SQL platform was written by Bob Epstein, Stu Schuster, and maybe three oher developers, which explain its consistency.  The 6.0 and 7.0 releases left most of that platform intact.  It's the XML and CLR support where the divergence occurred and the clutter began.

Finally, a lot of the clutter can be explained by a single word:  Oracle.  Oracle has always been chock full of features that almost no one uses.  Message queueing built directly into the database engine, for example.  As the list of Oracle features has grown over the years, the SQL Server team has felt compelled to match many of the features regardless of efficacy.
January 9, 2008 8:24 AM
 

paschott said:

I'd have to agree overall - there are a lot of extras in T-SQL that don't really seem to belong.  I think that the check we're looking for is the ANSI board, isn't it?  Of course, adding functionality seems to be the way the game works across the board.  Even Joe Celko seems to like the inclusion of some non-ANSI functionality in SQL Syntax.  Personally, I wouldn't mind seeing a "SUM" function for text.  We have to build custom code in order to concatenate notes from a variety of fields and one of the other DB platforms has this.

As for the consistency, I think that would be great to address first - common naming conventions for new procedures, perhaps slowly work in the new names and parameter names.  Parameter names could even be optional parameter names at first that take the value of the original parameter (I didn't say this was great, but it could work).  Slowly migrate away from that and it could work.

On the whole, I don't have a lot of complaints on the evolution of T-SQL.  There are a lot of pieces I don't use just because I don't know about them or don't need them on a daily basis.  That can be a little painful.  I don't think there needs to be a standards board to hold them too much in check because we'd never get new functionality that way.  :-)  I would push for some more consistency in names and parameters and perhaps even some inline help to get an overview of something without needing to open BOL, but that's not essential to the way the product works.

I definitely don't want to try to make T-SQL do everything.  Unified programming languages just aren't fun to work with.  I can shift gears between languages without too much trouble and each language seems to have its flow.  Trying to unify them would break that flow or make the language really unwieldy.
January 9, 2008 11:51 AM
 

acbups said:

Ahh, I'm mindful of the continual battle of standards, even within an organization!  

Suppose you have a rule that columns may not be named with reserved words.

How do you avoid using [filename] when you need to store the name of a file, and [file_name] is also reserved as the name of a function?  [fname]?  Then you're left to wonder if it's a filename or a first name.  Similarly, misspellings like [filname], [filenam] and [filenm] leave you and other developers cursing as you try to remember how to incorrectly spell it to get what you want.

And what happens when Katmai comes along with its DATE and TIME types, while you sit atop an established application dependent on tables containing columns named [date] or [time]?

Maybe those who coded their own XML solutions with SQL Server 2000 can tell us how they handled their applications when [xml] became a reserved word with SQL Server 2005...
January 10, 2008 3:19 PM
 

Adam Machanic said:

A few comments:

First of all, I don't agree that dissimilar languages are easier to flip between.  I used to code VB, JS, and T-SQL, and found that I would jump into JS and automatically start writing BEGIN and END blocks, etc.  It was a pain.  Life is much easier for me now, doing almost 100% T-SQL with a bit of C# thrown in for good measure :)

To GSquared: "WITH" for a CTE is part of the ANSI standard, and I think that despite the fact that SQL Server previously used "WITH" for something else, it's important to stick to the standard.  DB2 and PostgreSQL -- probably among others -- have supported standard CTEs for some time.  If SQL Server supported CTEs using a different keyword it would just make it that much harder for apps to be ported in, and new developers to learn the lingo...

To SkyBeaver: I don't know anything about Oracle's "message queuing built directly into the database engine" that "almost no one uses," but I for one am quite pleased with SQL Server's Service Broker feature.  I've already used it on several projects and have more in the queue (so to speak <g>)... I've found it an incredibly useful and powerful addition to the tool set.

That said, I hope that going forward Microsoft will take a break from adding nonstandard features and instead continue down the path of getting closer to the ANSI Standard.  We're about to see MERGE and DATE/TIME -- both standards-driven features -- and in 2005 we saw several enhancements based on the standards.  That's a move in the right direction, in my opinion, but MS has a lot of work left to do.  

Many people have been begging for years for "DOMAIN" support, and what do we finally get?  CLR UDTs, and deprecation of T-SQL's "rules."  Talk about missing the mark!  MS will also keep its slightly half-assed OVER clause implementation, at least for another few years, despite numerous people rallying for improvements in Katmai.  And then there are the row constructors that we expected to see in Katmai -- also removed from the feature set.  Meanwhile, FILESTREAM and EDM show up... Is SQL Server destined to become a file server?  An application server?

T-SQL and SQL Server in general have certainly become much more powerful in the past couple of versions, but I would actually say that I'd like to see the core part of the product -- i.e., T-SQL -- get MORE bloated, with standards-compliant features...  
January 11, 2008 12:14 PM
 

GSquared said:

Adam:  My comment about "with" applies equally to all versions of SQL that use it, not just T-SQL.  I'd rather have seen the standard be "CTE" or something else more meaningful, rather than "with".
January 11, 2008 12:44 PM
 

Jonathon Storm said:

In regard to the inconsistencies in the T-SQL language and naming standards inside the SQL engine, I think Microsoft has failed miserably.  They have not implemented common linguistic standardization (or insufficiently implemented it).  But they could start at any time!  They should hire a great linguist (or two) and give them authority to do this work, and keep them employed permanently.

How should it be done?  Just like linguists do!  I offer as an example the modern Hebrew language, which I've been learning (along with Biblical Hebrew) for some years.

1. Hebrew was a living language for a few thousand years (during Israel's national existence before 130 C.E.), then it was a liturgical/religious/sacred language for nearly 2,000 years while there was no Jewish state and no every-day common use of the language.

2. A single man, Eliezer Ben-Yehuda (http://en.wikipedia.org/wiki/Modern_Hebrew_language#Modern_Hebrew), almost by himself reworked it to be usable for common use in the newly-reborn modern state of Israel.  In the process he did extensive linguistic research to update and modify old terms so they could be used in the modern world.  For example, he named the bicycle (unknown in Biblical times) "of'naim", the dual form of "ofen", or wheel (http://he.wikipedia.org/wiki/%D7%90%D7%95%D7%A4%D7%A0%D7%99%D7%99%D7%9D).  

3. After 1948, Israel has maintained a small branch of the government that attempts to guide and standardize the development and evolution of the language.  As such, it publishes standards, teaches, keeps a repository, et cetera.  It is funded and given enough (mostly) authority to do its work.  Because of the seminal work they do, there is a standard that is available to use, and thus in the unruly world of everyday Hebrew there is something to hold on to.

Microsoft should learn from history and from linguists, and establish permanently-funded and authorized offices of language standards (for the programming languages Microsoft controls) and naming conventions (for the products they sell).  They have the money, they have the authority, and because they are not using them, we live in a crazier, more frustrating world than we need to.  I hope they will think less like cloistered programmers and more like professional information scientists, librarians, and linguists.  They have made some strides in this regard, but they should be leading the industry.
January 14, 2008 7:06 PM
 

Adam Machanic said:

Jonathon:

Wonderful idea, except that Microsoft does not control much of the language.  The majority (perhaps not the vast majority, but at least > 50%) of T-SQL adheres to the SQL Standard, which is controlled by the ANSI X3H2 Committee.  I believe that Microsoft has a seat or two on the committee, but that's about it.  Making half of T-SQL conform to some kind of spoken/written language standard developed by linguists and the other half conform to the ANSI Standard would result in a very strange hybrid...
January 17, 2008 6:57 PM
 

BrianTowers said:

Different programming languages support different programming paradigms. The best multi-paradigm language by a long way is C++, however in the vast majority of cases it is not the best language to use because ease-of-use combined with support for a smaller subset of paradigms is more important.

SQL is the best programming language for set-based programming. I think it makes little sense for it to try and tack on other programming paradigms. Other languages are much better for prodecural, object-oriented, template-based programming and I don't see the point of bloating SQL by trying to add-on / improve these aspects of SQL. If you need to do this kind of stuff then do it in a C#, VB, Java, C++ program or DLL rather than bastardize SQL.

Often the database server is the main bottleneck for processing in a system. Non set-based processing should best be performed elsewhere where the environment is better optimized for that. Set-based processing should continue to be done in the database engine, via SQL which is optimized for that purpose. IMHO CLR and XML in SQL is a bad thing. The occasions where they are lifesavers are far outweighed by the times when they are abused, resulting in code which is hard to test, understand and maintain. Of course if you want job security then they provide an excellent means to that ;-).

On the subject of MS replacing the ANSI standards body - no, that's a very bad idea and one which nobody else in the industry would entertain.
January 23, 2008 7:33 AM
You need to sign in to comment on this blog

















<January 2008>
SuMoTuWeThFrSa
303112345
6789101112
13141516171819
20212223242526
272829303112
3456789
Go With the Flow
 Knowing enough about the routes that messages take is vital to being an effective Exchange admin,... Read more...

When Email Collaboration Could Have Changed History
 In our mission to make history relevant to the busy IT executive, we speculate how Email might have... Read more...

Bunnikins!
 When an IT manager is selected as a victim of office politics of a large corporate, it is time for him... Read more...

Exchange Database Technologies
 One of the most misunderstood technologies in Exchange Server, regardless of its version, is the... Read more...

Top Tips for Exchange Admins
 Michael Francis hands out imaginary Olympic medals to the winner of the August 'Top Tips for Exchange... Read more...