Click here to monitor SSC
Phil Factor

SQL Server CREATE TABLE syntax diagrams

06 June 2013

Many of us have seen, on MSDN, the heading 'Syntax', followed by a rash of nested brackets and keywords, enough to put off the most resolute of code-cutters. However, there is a goldmine of information there, and Phil had an ambition to get at it, and share the gold. The result is this article, full of railroad diagrams

Most of the time, we can create tables perfectly adequately using our preferred GUI tool. However, for complex tables, a GUI can't always provide what we want, or even if it does, it is often quicker and easier to use code.

The "Backus-Naur Form (BNF)" diagrams on MSDN, the format of which I'll review briefly towards the end of the article, explain every syntax detail, but in a form that is very hard to digest and keep in the head, well in my head anyway. Often, I find myself reverse engineering the syntax from the examples.

The aim of this article is to provide a smooth route through the syntax, by means of railroad-style diagrams. I developed these diagrams, initially, while creating a SQL-based build-script generator and trying to learn some of the more esoteric aspects of the SQL Server 2012  CREATE TABLE syntax. However, I soon became convinced that others might find them useful. I found that the details "stuck" far more quickly when learning from a diagram and through their use, I uncovered several interesting facts about how SQL Server 2012  CREATE TABLE works that had eluded me previously.

Diagrams, not words

To create a table in SQL Server, we just need to use the CREATE TABLE command, give the table's name and a list of the columns we need. We can also specify where the table is to be stored and provide some table options, including constraints.

To describe this in more detail would take too many words, so I'm going to use diagrams instead. These are railroad, or syntax, diagrams and you encounter them occasionally in SQL. SQLite, for example has them for its documentation, and so has Oracle.

With a syntax diagram, there is only one entry point and one exit point. We start at the entry point and read from left to right. We can take any of the possible paths or railway lines to the exit, but we can only travel in one direction along each path.

I've changed tradition slightly by running down the page as well as from left to right, but elsewhere, I've adhered to the following conventions:

  • Commands and other keywords appear in UPPERCASE inside rectangles, and we type them literally, exactly as shown in the rectangles.
  • Parameters appear in lowercase inside ovals. We generally substitute our own information, such as names of tables, columns, numbers or expressions  for the parameters . If parameters are within angle-brackets they represent an entire diagram.
  • Punctuation, operators, delimiters, and terminators appear inside circles.

Where the syntax gets a bit dense, I've taken the liberty of putting the punctuation with the associated symbol rather than separating out the punctuation in its own circle. I've only done this where space is a bit tight and the usage is unambiguous.

Using a Syntax Diagram

Once you get the hang of a railroad diagram, they are a glorious way of getting to understand syntax quickly. Start at the top, the terminus, and travel along the railway system. You can only go one way. As with any railway map, there are certain sections, akin ot the suburban lines around and through major cities, which require an "exploded view" to show the details.

At such points, the parameter in the oval appears in angle brackets (<>). This signals that, for detail, we need to change temporarily to a more detailed map. When a syntax diagram aims to be so complete that someone can use it to engineer a parser, it can get quite intricate. Fortunately, this isn't the case with our diagrams.

The SQL Server 2012 CREATE TABLE Syntax Diagrams

Evey diagram that follows is done in reduced definition so it fits on the page. Click it to see it a decent size, or use the PDF file to view it or print it out in a good working size. The PDF files are in the speechbubble at the head of the article.

Although there is only one high-level diagram to consider for creating a table, we may require five other detail diagrams at certain parts of our journey. I provide each diagram as a separate PDF file, as part of the code download with this article (see the speech bubble to the right of the article title). They are readable when printed out in A4 but I like to print them as A3 wall-charts.

The annotations that you'll see on each diagram are quotes taken from MSDN, to whom all credit must go. The words in the MSDN CREATE TABLE page are all smithed with great care. Whereas I've spotted errors in the syntax BNF on the page, the words have always been spot on, and packed with information.

Let’s take a short example of use, taken from the create_table_overview diagram.

The entry point is CREATE TABLE. From here, we can qualify the table name with the database name and schema name, each separated by a full stop (MyDB.MySchema.MyTable) or we can just qualify with the schema, or we can simply enter the table name.

After an opening bracket, we hit the <column_definition> junction, where if required we can jump to the Column Definition syntax diagram for details of how to define the required columns.

A comma separates one column definition from another, or from a computed column definition, or a column constraint definition, and so on. In this fashion, we work our way through the railroad network, in the directions of the arrows, and making a choice at every junction.

CREATE TABLE overview syntax

The Column Definition syntax

Column Constraint syntax

Computed Column Definition

Table Options

Table Constraint

Index Options

Using the Diagrams

I find that I learn a diagram more rapidly than a raw syntax expression. If you are similar, then you soon find that you can dispense with most of the diagrams except for some of the more obscure table creation operations.

When I read through these diagrams, I get the occasional surprise, or reminder of an odd fact I’ve forgotten.  For example, the computed_column_definition diagram shows that one can use a computed column, if persisted, to be a reference to a foreign key.  One doesn’t even have to specify the column in the foreign table if it is the primary key. IDENTITY fields only need to specify start and increment if they vary from the default.

 The MSDN Syntax descriptions

The syntax descriptions provided on MSDN are like Extended Backus–Naur Form (EBNF) but have some conventions from BNF. This is a very simple language.

Angled brackets <> around a symbol, such as the data type example shown below, shows that it is non-terminus and needs a separate definition.

  • Optional items are enclosed in square brackets
  • Groups of expressions are enclosed in curly braces
  • A '|' is a binary operator meaning 'or'.

This definition of a data type…

<data type> ::=

[ type_schema_name . ] type_name

    [ ( precision [ , scale ] | max |

        [ { CONTENT | DOCUMENT } ] xml_schema_collection ) ]

… says that we are defining a data_type symbol(<symbol> ::=). This can start with the schema of the data type (if it is user-defined!), followed by the obligatory type_name. After that, we have a bracket, followed by one of these:

  • A specification of precision ,  and maybe scale (if it is a numeric data type)
  • A MAX qualifier if it is a VARCHAR, NVARCHAR  or VARBINARY  data types
  • A specification for an XML schema collection, only to associate an XML type with a schema collection

Finally, we have a closing bracket.

If you see an "n", it means that the preceding symbol or group can be repeated. For example, [,...n] means that there can be any number of repeats, each separated by a comma.

Summary

The raw BNF of the MSDN documentation is a difficult and error-prone way of trying to understand the syntax of the CREATE TABE statement, or any SQL for that matter. We already know from other database systems that a form of railroad diagrams are a useful aid for understanding how SQL statements work, so why not have them for SQL Server TSQL?

 I did these to improve my own understanding and very much hope that you’ll find them as useful as I did. I realize that, for most purposes, a well-designed GUI is the easiest way to do a job on a Windows system but just occasionally,  code is just plain easier: and to get up and running, nothing beats a clear syntax diagram.

Acknowledgements

Many thanks to the pioneers of the railroad diagram. Also the MSDN CREATE TABLE  page from where the explanatory quotes within the diagrams were taken and adapted.

Phil Factor

Author profile:

Phil Factor (real name withheld to protect the guilty), aka Database Mole, has 30 years of experience with database-intensive applications. Despite having once been shouted at by a furious Bill Gates at an exhibition in the early 1980s, he has remained resolutely anonymous throughout his career. See also :

Google + To translate this article...

Search for other articles by Phil Factor

Rate this article:   Avg rating: from a total of 62 votes.


Poor

OK

Good

Great

Must read
Have Your Say
Do you have an opinion on this article? Then add your comment below:
You must be logged in to post to this forum

Click here to log in.


Subject: Brilliant!
Posted by: SAinCA (view profile)
Posted on: Thursday, June 06, 2013 at 12:15 PM
Message: Thanks, so much, Phil! These are SO much easier to follow and understand, and thanks for the clear annotations that are so educational. Would that MS would adopt this style. I totally agree that the EBNF diagrams, even after a decade of working with them, are non-intuitive and often a roadblock instead of a facilitator.

MS should commission you to extend railroad diagrams to all T-SQL constructs! The benefits to the community will be HUGE.

Cheers!

Subject: Great!
Posted by: macxima (view profile)
Posted on: Friday, June 07, 2013 at 12:26 AM
Message: Hi Phil, I'm working with SQL last 8 years but did not get so deeply understand as you described. Really It's great work.

Cheers,
Mukesh

Subject: Genesis
Posted by: Robert young (view profile)
Posted on: Friday, June 07, 2013 at 8:02 AM
Message: For as long as anyone knows, IBM manuals (user, engineer, support, ...) have been founded on railroad diagrams. They had a generic name within IBM, but it escapes me for the moment. Backus, after all, was an IBMer.

Subject: Nitpicking
Posted by: testuser (not signed in)
Posted on: Saturday, June 08, 2013 at 2:51 PM
Message: The diagrams are great, but I really wish that you had used a different font for the text annotations.

By the way, which software did you use to draw the diagrams?

Subject: Re nitpicking
Posted by: Phil Factor (view profile)
Posted on: Saturday, June 08, 2013 at 3:43 PM
Message: Smartdraw. I wouldn't say it is ideal for this sort of work but it does the job. Some people, such as Doug Crockford, used Visio. The font is my own handwriting. I had the crazy idea, at one time, of doing some of the annotations by hand so I turned my own handwriting into a TTF so I could do both and they'd blend. . Eventually, when i realised how many syntax diagrams there were, i gave up the idea, but I thought ' hell, at least it isn't comic sans' and kept it. I tried changing it to something like verdona, but it just looked kinda wrong.

Subject: DB2 Diagrams
Posted by: veritas (view profile)
Posted on: Sunday, June 09, 2013 at 8:36 AM
Message: I have to agree with Genesis here, although I'm fairly sure I recall when IBM first abandoned the various squiggly brackets and introduced them. Here's a link to CREATE INDEX, scroll to the bottom:
http://publib.boulder.ibm.com/infocenter/db2luw/v8/index.jsp?topic=/com.ibm.db2.udb.doc/admin/r0000927.htm
Note that they only use standard characters to create this one, although the paper manuals use solid lines and arrowheads. Far better once you try them.

Subject: Deja vu
Posted by: Frank Quintana (not signed in)
Posted on: Monday, June 10, 2013 at 12:09 PM
Message: In 1998 I bought a book titled “SQL Programmer Reference” by Kishore Bhamidipati, published by Osborne/McGraw-Hill ISBN)-)7-882460-5
I highly recommend this book.
Mr. Bhamidipati divided his book in three parts
I-SQL Commands
II-Syntax Flowcharts
III- Keywords and Parameters
The Part II Syntax Flowchart is a display of railroad diagrams to depict DDL, DML and DCL statements.
Mr. Bhamidipati was a Senior Technical Analyst at Oracle at that time.

Subject: precision, scale
Posted by: Tim (not signed in)
Posted on: Monday, June 10, 2013 at 5:47 PM
Message: It looks like you've missed a comma between precision and scale in the column definition railroad diagram.

Subject: Optional vs Required
Posted by: Topcat (not signed in)
Posted on: Tuesday, June 11, 2013 at 1:42 AM
Message: This is great Phil, the world runs on images now more than on words. May I suggest one improvement and that is to indicate somehow whether a clause is required or optional. For example common sense and experiment tells me I can't create a table with only a table constraint and no columns, but the railroad diagram allows this - unless I'm not understanding correctly...

Subject: Re: Optional vs Required
Posted by: veritas (view profile)
Posted on: Tuesday, June 11, 2013 at 3:02 AM
Message: Looks like a slight error in the structure. The theory is that the required components are on the 'main line', in this case left, top to bottom. This is illustrated in defining the table name where the database and/or schema can be ommited. I've seen a convention where defaults are in BOLD. DB2 diagrams run left-right with keywords in CAPS and default values above the main line.

Subject: Re: Optional vs Required
Posted by: veritas (view profile)
Posted on: Tuesday, June 11, 2013 at 3:02 AM
Message: Looks like a slight error in the structure. The theory is that the required components are on the 'main line', in this case left, top to bottom. This is illustrated in defining the table name where the database and/or schema can be ommited. I've seen a convention where defaults are in BOLD. DB2 diagrams run left-right with keywords in CAPS and default values above the main line.

Subject: Re: suggested improvements
Posted by: Phil Factor (view profile)
Posted on: Tuesday, June 11, 2013 at 3:04 AM
Message: @Tim Now fixed.
@Topcat Yes, This is a good idea. I suspect that we'll hit this problem quite often. At the moment I can't think how to show this clearly. There has to be a column, unless it is a filetable, but it can be preceded by a computed column. I have another complaint, which is that there seems no way of showing what the default setting is if you don't make it explicit. e.g. That IDENTITY has a seed of 1 and increment of 1, or that the default setting of IGNORE_DUP_KEY is 'off'

Subject: Syntax Diagrams
Posted by: Ray (not signed in)
Posted on: Tuesday, June 11, 2013 at 12:16 PM
Message: Thanks for the brain bump Phil. I have not seen the "railroad track" approach in 30+ years. The original "Pascal Report" by Niklaus Wirth, July 1973 includes the full language specification in the this format. I think that was one of the reasons I found Pascal syntax generally easy to master.
WikiPedia has a nice article which includes a link to the original report in pdf. Scroll to the appendix and you will see the pages that filled many of my grad school nights back in the dawn of time :)
http://en.wikipedia.org/wiki/Pascal_(programming_language)
http://www.standardpascal.com/The_Programming_Language_Pascal_1973.pdf

Subject: Railroad Diagrams are under-appreciated!
Posted by: roundand (view profile)
Posted on: Wednesday, June 12, 2013 at 7:56 AM
Message: And it's great to be endorsed by Phil Factor himself! Like you I sometimes write them to ensure I understand syntax constraints.
They can be used to illustrate any more-or-less linear state machine. Some years back I documented a development tool we wrote with various APIs for integrating with external systems, each API consisting of many function calls, so used railroad diagrams to document the legal sequence of function calls in each API.

Subject: Optional vs Required
Posted by: Colin Daley (not signed in)
Posted on: Wednesday, June 12, 2013 at 2:25 PM
Message: @Topcat Well, the Books Online grammar also says that you can create a table with a table constraint but no columns. To fix the problem Phil would have to start there, because he is simply representing that grammar visually.

However, you must remember that it is not always possible to state *everything* in a context-free grammar, the classical example being the "dangling else"[1].

When such anomalies are found, the clarification can be put in text annotations ("the else belongs to the innermost if"). Default values can go there too.

[1]http://en.wikipedia.org/wiki/Dangling_else

Subject: Great way to display syntax
Posted by: JonathanRoberts (view profile)
Posted on: Monday, July 01, 2013 at 5:46 AM
Message: I first used these diagrams when I started programming in Pascal (over 25 years ago) from Kathleen Jensen and Niklaus Wirth "Pascal User Manual and Report" which was published in 1970. I have always thought them a great way to display syntax.

Subject: Mixed Feelings
Posted by: Jeff Moden (view profile)
Posted on: Monday, July 01, 2013 at 7:57 AM
Message: While railroad diagrams are nice to look at and can certainly be easier to read than code with a million optional braces, they cannot be copied and pasted into SSMS for the simple stuff. For example, the syntax layout for BULK INSERT is easy to copy and paste and then you simply delete what you don't want to use.

I agree with most that railroad diagrams can make it very simple to understand syntactical requirements but that should never be a reason not to provide copy-able code examples.

 

Phil Factor
Searching for Strings in SQL Server Databases

Sometimes, you just want to do a search in a SQL Server database as if you were using a search engine like Google.... Read more...

 View the blog

Top Rated

SQL Server XML Questions You Were Too Shy To Ask
 Sometimes, XML seems a bewildering convention that offers solutions to problems that the average... Read more...

Continuous Delivery and the Database
 Continuous Delivery is fairly generally understood to be an effective way of tackling the problems of... Read more...

The SQL Server Sqlio Utility
 If, before deployment, you need to push the limits of your disk subsystem in order to determine whether... Read more...

The PoSh DBA - Reading and Filtering Errors
 DBAs regularly need to keep an eye on the error logs of all their SQL Servers, and the event logs of... Read more...

MySQL Compare: The Manual That Time Forgot, Part 1
 Although SQL Compare, for SQL Server, is one of Red Gate's best-known products, there are also 'sister'... Read more...

Most Viewed

Beginning SQL Server 2005 Reporting Services Part 1
 Steve Joubert begins an in-depth tour of SQL Server 2005 Reporting Services with a step-by-step guide... Read more...

Ten Common Database Design Mistakes
 If database design is done right, then the development, deployment and subsequent performance in... Read more...

SQL Server Index Basics
 Given the fundamental importance of indexes in databases, it always comes as a surprise how often the... Read more...

Reading and Writing Files in SQL Server using T-SQL
 SQL Server provides several "standard" techniques by which to read and write to files but, just... Read more...

Concatenating Row Values in Transact-SQL
 It is an interesting problem in Transact SQL, for which there are a number of solutions and... Read more...

Why Join

Over 400,000 Microsoft professionals subscribe to the Simple-Talk technical journal. Join today, it's fast, simple, free and secure.