Av rating:
Total votes: 32
Total comments: 26


Phil Factor
The Simple-Talk SQL Prettifier
22 May 2007

Transform your SQL, C# and VB code with the Code Prettifier

The Simple-Talk SQL Prettifier was first launched in July 06 in order to provide a means to prepare T-SQL code for HTML-based publication. I was driven to write it by my frustration at having to paste dull monochrome, badly-formatted SQL code into blogs and forums.

The Prettifier proved very handy and has been used ever since for formatting the code in Simple-Talk blog and forum posts, as well as articles in the Workbench series.

Since the launch of v1 of the Prettifier, commercial code beautifiers have emerged that are able to output HTML in both <code> and <pre> blocks in a convenient way. Also, commercial refactoring and documentation tools are now available that do what the prettifier does in a much more extensive, robust and scalable way. Nevertheless, new requirements kept cropping up – such as prettifying code specifically for wikis, and prettifying C# and VB as well as SQL code – that meant that, with a few tweaks and additions, the prettifier continued to earn its keep.

Therefore, in this article I proudly present the all-new Simple-Talk Code Prettifier. In the following sections I describe what the tool does, provide an overview of its key features, and then briefly explain how it works. As always, however, actions speak louder than words, so I strongly encourage you to try it out for yourself. The prettifier is available to use, freely and without registration, on the Simple-Talk website:

Test out the Simple-Talk Code Prettifier

Take a piece of monochrome, poorly formatted SQL (or C# or VB) code, paste it in to "Source Code" tab of the prettifier, set your options on the left hand panel and click Prettify!

The full source code is freely available from the Code Download link at the top of this article.

What the prettifier does

I've always liked the colour-based syntax highlighting of editors such as Query Analyzer and Management Studio. In fact, I stare at this sort of colour-coded SQL so often and for so long that the monochrome rendition looks all wrong. I can understand intellectually that strings don't have to be in red and comments in green but somehow it makes me feel unsettled when they aren't. I can read and understand code far quicker if it is colour-coded. I'm sure this is true for a lot of other SQL developers.

However, if you just take the code from the Query Analyser and paste it into a forum, blog or wiki, you will be disappointed with the result. Something nice like this…

INSERT INTO @results (item)
   SELECT SUBSTRING(@string,
                     s1.number+@LenDelimiter,
                     MIN(s2.number)-s1.number-@LenDelimiter)
     FROM    (SELECT [string]= @String )f
   CROSS JOIN numbers s1
   INNER JOIN numbers s2 ON s1.number< s2.number
   WHERE s1.number<=LEN (@string)
       AND  s2.number<=LEN (@string)
       AND SUBSTRING(@string,s1.number,@LenDelimiter)=@Delimiter
       AND SUBSTRING(@string,s2.number,@LenDelimiter)=@Delimiter
   GROUP BY s1.number

…will probably end up looking more like this:

INSERT INTO @results (item)
  SELECT SUBSTRING(@string,
 s1.number+@LenDelimiter,
MIN(s2.number)-s1.number-@LenDelimiter)
    FROM    (SELECT [string]= @String )f
  CROSS JOIN numbers s1  
  INNER JOIN numbers s2 ON s1.number< s2.number
  WHERE s1.number<=LEN (@string)
AND  s2.number<=LEN (@string)
 AND SUBSTRING(@string,s1.number,@LenDelimiter)=@Delimiter
AND SUBSTRING(@string,s2.number,@LenDelimiter)=@Delimiter
  GROUP BY s1.number

You are slightly better off with SSMS, which will at least cut and paste code in colour, but it still requires some work to get it into respectable shape for pasting onto a web page, such as a blog entry or a forum.

The prettifier is designed, primarily, to resolve these issues. It takes SQL, C# or VB code and prepares it for rendering correctly in a browser by doing the colour highlighting, selecting the correct font, doing the indenting with non-breaking spaces, and changing all the characters that aren't standard into HTML entities.

It will allow you to submit code to a blog, forum, email or article, and have it displayed in proper color, and with the correct indentation.

Prettifier features

A brief run down of the main features and some of the issues we had to tackle.

Prettifying and optional formatting of SQL

Preserving the correct formatting and color-coding of SQL is one issue, and the main one that the prettifier was intended to tackle. However, occasionally, one also needs to prettify poorly formatted, monochrome code in order to make it more readable. The prettifier will indent code consistently, put in line breaks, and capitalise keywords.

NOTE:
The Prettifier does not do a proper syntactical analysis – it only tokenises. Therefore, the prettifier should never be used to change the actual code to make it easier to work on. For that purpose you should use a proper refactoring tool such as SQL Refactor.

We discovered that tab-stops in the Query Analyser were set to eight spaces, whereas they were set to four in SSMS. Joe Celco recommends three in his great book. This meant that we had to parameterise this so as to make code prepared in SSMS render correctly.

Another problem we didn't forsee is that the <pre> and <code> blocks are rendered in very different sizes in different browsers. I used to dislike specifying font sizes for reasons of accessibility for partially sighted, but, so far, the only answer I've found is to nail down the size of font, although I have made it configurable.

C# and VB support

As well as SQL Server articles, Simple-Talk also publishes articles in VB and C#, and the editor wanted them nicely formatted too. I added both as Languages that the Prettifier supported. After all, the whole thing is table-driven, so adding a new language is pretty trivial. I'd like to move to loading the table as an XML file, but to keep SQL 2000 compatibility, it stays as a table. As soon as we get a good article with Perl or Jscript in it, I'll add them too.

Wiki support

Next, someone wanted to paste code into Wikis, and discovered that they didn't render properly, so we needed a third set of rules for tags, along with Blog-style (which didn't allow SPAN tags) and Article-style (the FONT tag is deprecated in XHTML)

Exporting database routines to HTML

I use the prettifier occasionally to write out the routines from within databases as HTML. This is one of the key uses I had in mind when originally developing the tool, I'm not sure whether it is a good idea now that SQL Doc does it for you automatically; in particularly as it does all database objects including views and tables, gets the extended properties, and gives you a whole lot of other information. However, if you are a die-hard DBA who wants to dump out all our project source-code as HTML then it might as well be as painless as possible.

The first step was to complete and correct the facility to prettify a stored procedure or function by name.

spPrettify @RoutineName='uftSplitString'

This also saved a lot of time for us when editing submitted articles as we could test the code and then 'set it' in HTML form much quicker.

The next step is to use this 'prettify-by-name' feature to write out all your routines as HTML files. The code to do the writing out of the code is implemented in the spWriteOutAllRoutines procedure. I've stripped it down to bare bones, and it is easy to tailor it to your taste and preferences.

How the prettifier works

The prettifier is implemented as a single T-SQL stored procedure. I have to admit that the idea of attempting this sort of task in a stored procedure rather than a more suitable language such as C# was due to a certain perversity on my part. I enjoyed the challenge.

I wanted the prettifier to work in SQL Server 2000 as well as SQL Server 2005. This caused me a few problems. I quickly realised that the input had to be in a text variable, and the output too would potentially be far larger than the puny varchar(8000) that is available in SQL 2000. Working with text variables is never easy, and here there was likely to be a lot of text processing. As few string functions work with text variables, this required some nimble work passing the work piece-by-piece into varchars for pummelling.

As far as the output was concerned, a text variable was entirely impossible to implement. In the end I decided to output the results as a table, splitting the output into a number of rows with a column of varchar(8000).

The reasons why it is a bad idea to do this sort of intensive string parsing work in a stored procedure, requiring a great deal of iteration, are immediately obvious. If one gives the stored procedure a large amount of SQL Source, it runs very slowly, but for what I wanted the code for, that didn't matter; I console myself that it is still a great deal quicker and easier than doing it by hand. My original intention was that this stored procedure should become a component in a system that allowed a database to write out an HTML representation of its source-code via an overnight batch so speed wasn't the first priority.

The stored procedure is a long one, but this is almost entirely due to the fact that it is writing to two table variables. One table holds all the tokens to be recognised, and the other holds the HTML entities that have to be substituted for the non-ASCII and reserved characters so that these characters are properly rendered on the browser. Of course, it would probably make sense to have the contents of these tables in a permanent table but we are encouraging you to try the routine and so it makes installation easier if we do it this way.

The actual logic is pretty simple. The routine runs as a 'state machine', in that each iteration of the parser operates on the text according to the result of the previous iteration. It chops off white-space and examines the start of the string by finding the longest token in the token table that gets a match with the start of the string. It then saves the token, if it finds one, along with its type. It changes state accordingly. If it has found a string, it must then find the end of it; if it has found an @ variable, it must get the rest of the variable, and so on. This sort of design looks clumsy, but it is very easy to modify and debug.

Once it has divided up the source code into its elements, it then re-assembles the original source. When it gets to white space, it de-tabbifies it and changes the spaces into 'hard spaces' (or non-breaking spaces) that HTML will leave as-is. It inserts either <span> blocks or the archaic (deprecated) <font> blocks for those cases where the forum or blog software removes <span> blocks. I usually try the span approach first and, if it doesn't work, use the font version.

Because it was reasonably simple to do, I added code into the white-space processing to allow a certain amount of reformatting. Compared with SQL Refactor, it hardly starts on the job, but it does just enough to make the results reasonably readable. The reformatting is again table-driven to allow new languages to be added.

What I do and don't use it for

As I stated earlier, I don't really intend the prettfier to be used as a code reformatting tool. If you want one of those I can unashamedly recommend SQL Refactor, as I am one of its greatest admirers, and use it the whole time. I just use SQL Refactor as a SQL Beautifier and Lint. At this, it excels. It is a joy to be shown what variables are no longer used, and a pleasure to have the code in the Query window instantly changed to make it readable. Every reformatting operation can be adjusted to personal preferences. It always takes me aback to meet someone who doesn't use it as a productivity tool for developing databases. It is an essential.

I do, however, use the Prettifier a great deal, both for formatting the code in articles, and for writing out the routines from within databases as HTML. Most of all, I always have it handy whilst browsing blogs and forums just in case I decide to put in a reply and want the code I paste in to be easily readable.



This article has been viewed 21931 times.
Phil Factor

Author profile: Phil Factor

Phil Factor (real name withheld to protect the guilty), aka Database Mole, has 25 years of experience with database-intensive applications. Despite having once been shouted at by a furious Bill Gates at an exhibition in the early 1980s, he has remained resolutely anonymous throughout his career. See also :

Search for other articles by Phil Factor

Rate this article:   Avg rating: from a total of 32 votes.


Poor

OK

Good

Great

Must read
 
Have Your Say
Do you have an opinion on this article? Then add your comment below:
You must be logged in to post to this forum

Click here to log in.


Subject: SELECT * FROM a
Posted by: Neil Davidson (view profile)
Posted on: Friday, July 14, 2006 at 3:32 PM
Message: When prettified, the 'a' vanishes.

Subject: Pretty Cool
Posted by: AspiringGeek (view profile)
Posted on: Saturday, July 15, 2006 at 1:26 PM
Message: You've remedied a long-standing pain in the patooticles for me. I'm looking forward to the RTM version. Thanks.

Subject: Select * from a
Posted by: Phil Factor (view profile)
Posted on: Saturday, July 15, 2006 at 2:43 PM
Message: Ouch. Neil has found a bug.
The line which reads
if @Pos>len(@remains) break
should be
if len(@remains)<1 break
A new version is winging its way to Simple-Talk for updating. In the meantime, "quote" single-letter variables if they're at the end of the chunk of SQL!
I dread to think what Helen will find


Subject: Re: Pretty Cool
Posted by: Phil Factor (view profile)
Posted on: Saturday, July 15, 2006 at 2:51 PM
Message: I'd consider an RTM version if I knew what it was. It will paste into Word if you select the 'Rendered HTML' tag, and select what you want. Then paste into Word or wherever else you wish to display the code and it will usually paste in colour. If you save the Word document as RTF, then you are home and dry.

Subject: Essential!
Posted by: ThomasP (view profile)
Posted on: Sunday, July 16, 2006 at 4:45 AM
Message: Having looked at the source, I'd say it could be used as a generic code prettifier. All the SQL-specific stuff is in a table. only the @xxx variable state seems to be SQL-specific. What about doing a VB and C# version? I notice that your Simple-talk contributors are showing boring monochrome code-fragments. I can't see why one can't have one stored procedure that just gulps in the keywords from a static table into the table variable. Then it would prettify anything you wanted.

Subject: Uppercase SQL
Posted by: Anonymous (not signed in)
Posted on: Monday, July 17, 2006 at 4:16 AM
Message: Why oh why do SQL commands have to be uppercased. They are usually color coded, Isn't that sufficient?

Subject: Uppercase SQL
Posted by: Anonymous (not signed in)
Posted on: Monday, July 17, 2006 at 4:16 AM
Message: Why oh why do SQL commands have to be uppercased. They are usually color coded, Isn't that sufficient?

Subject: Good job
Posted by: Anonymous (not signed in)
Posted on: Monday, July 17, 2006 at 5:22 AM
Message: Hi,
We have just updated our free online formatter which is mentioned in your article, It's support
code-colored feature now, you may have a try here:
http://www.wangz.net/gsqlparser/sqlpp/sqlformat.htm

Regards
--James

Subject: RTM
Posted by: Anonymous (not signed in)
Posted on: Monday, July 17, 2006 at 6:52 AM
Message: is "Released to Manufacturing", I think - the production version...

Subject: Good tool
Posted by: Anonymous (not signed in)
Posted on: Monday, July 17, 2006 at 1:12 PM
Message: It's a good tool
I use Notepad2 to view my frquently used SQLs, and it does a pretty good job too.

Subject: RTM, Good Job and Uppercase
Posted by: Phil Factor (view profile)
Posted on: Monday, July 17, 2006 at 1:28 PM
Message: I'll happily do a switch that inhibits the UPPERCASE for SQL commands (reserved Words). You ask the reason why I did it. This is because it is the style recommended by Joe Celco in his most excellent book 'SQL Programming style' on p26 (Uppercase the Reserved Words). He recommends Capitalising object names and having scalars, parameters and variables in lower case. I don't currently do that because I like using capitals in my code to indicate concatenation of words, as in 'spWhatTheHeckDoesThisDo'. I'm not instinctively a 'switch' person but I think we'll have to do get into the business of adding knobs to twiddle anyway, to accommodate 'Doco's request for a switch to inhibit CRs at the end of line (some BLOGs turn them into <br>s

WangZ's SQL print formatter is an excellent product which is highly recommended. It must be the first SQL beautifier that really works. I wouldn't have started work on the prettifier had the WangZ print-formatter supported colour at the time.

According to Wikipedia, RTM means one of
. Read the manual, as a variant of RTFM.
. Release to manufacturing, a synonym of Gone gold.
. The initials and user name of Robert Tappan Morris, Jr., creator of the Morris worm.
. Resin Transfer Moulding, Manufacturing technology for composites.
. Radio Televisyen Malaysia, a Malaysian state-owned television network.
. Radio Television Marocaine, the Moroccan state-owned television network.
. Return To Manufacturer (of a broken, defective or unwanted product).
. Rising Talent Manga, an underground manga website.
. Requirements Traceability Management, as related to project requirements.

I'm sorry that I misunderstood. (Redgate employs the son of the person who introduced the acronym RTFM into IT jargon in the mid 1970s via the CP/M Users Group. I thought you ought to know.)

Subject: tabs -vs- spaces
Posted by: Bill_Scott (view profile)
Posted on: Tuesday, September 05, 2006 at 10:32 AM
Message: I'm pathologically alergic to tabs in code. can you add a switch to convert tabs to spaces as part of the setup?

Subject: Re: tabs -vs- spaces
Posted by: Phil Factor (view profile)
Posted on: Wednesday, September 06, 2006 at 11:28 AM
Message:

Bill

All tabs get converted to 'hard' non-breaking spaces as part of the de-tabbifying process. Well, that's what is supposed to happen. Tabs cannot be represented properly in HTML. If I didn't do the 'De-tabbification' then the whole prettification wouldn't work, so I can't really switch off the process. Is that what you meant? 

It might be rather fun to do 'Tabbify' 'De-Tabbify' procedure just to take tabs out of stored procedures (but not convert the results to HTML). We've already had a request for something that just does the alteration of case in keywords.

Phil


Subject: SUGGESTION
Posted by: dances with fascists (view profile)
Posted on: Tuesday, December 05, 2006 at 3:41 AM
Message: It is hinted in the text that this code won't run in SQL 2000 but it isn't explicitly stated as such so I had to download the code and try it to find out. How about prominently telling people "Requires SQL Server 2005"?

Now I'll have to go try it on another machine somewhere, looks real interesting.

Has anyone got any ColdFusion code to replace the ASP though?


Subject: Re: Suggestion
Posted by: Phil Factor (view profile)
Posted on: Tuesday, December 05, 2006 at 3:56 PM
Message: It works fine on my SQL 2000 machine here. Can you be more specific on what is going wrong and I'll se if I can fix it. It is definitely supposed to work on either SQL 2000 and SQL 2005!

p.s. I'm hoping to do a release version soon so any other suggestions would be welcome

Subject: regexp and .net
Posted by: Anonymous (not signed in)
Posted on: Wednesday, December 27, 2006 at 8:40 PM
Message: I applaud your initiative at building a state-machine in TSQL for a programming challenge for yourself. But RegExp is itself a state-machine, and with native .Net libraries, quite easy to call now.

Ahh, but I'll grant you .Net code is not as simple to create and run as is Sproc code. So I guess I'll revert to my initial "Good job" sentiment :)

Subject: re: missing results
Posted by: Phil Factor (view profile)
Posted on: Saturday, January 06, 2007 at 2:46 AM
Message: Yes. The aim was to have something that could be run on SQL 2000 as well, so that DBAs using either version had a tool they could use without the need for third-party software; for instance they could create intranet sites with HTML versions of the scripts for all their database objects. Obviously, you'd need to use one of the SPs I put in my recent DMO article to script the objects out.
Actually, I use the Prettifier most for posting code to forums.

Subject: Prettifier crashes on this code
Posted by: tracymckibben (view profile)
Posted on: Tuesday, February 06, 2007 at 10:50 AM
Message: Try Prettifying this:

CREATE TABLE Table1 (
Field1 VARCHAR(10)
)

CREATE TABLE Table2 (
Field1 VARCHAR(10)
)

Both sites, Simple-Talk and wangz.net, appear to crash.

Subject: re:Prettifier crashes on this code
Posted by: Phil Factor (view profile)
Posted on: Wednesday, February 07, 2007 at 2:27 AM
Message:

Odd. I ran it and it worked fine

CREATE TABLE Table1 (

Field1 VARCHAR(10)
)

CREATE TABLE Table2 (
Field1 VARCHAR(10)
)

There is a bug on the site in that, when the parser finds something it can't cope with, it passes back the error, but it is not caught by the frontend (something got lost in translation from PHP to ASP.net) We must fix that. I was wondering if whatever caused the problem got filtered out when you pasted the code into the comment.

Can you please run it from the stored procedure and see what happens?


Subject: bringing things up to date
Posted by: Phil Factor (view profile)
Posted on: Tuesday, May 22, 2007 at 6:51 PM
Message: I hope I've added all the improvements suggested above. I took up ThomasP's suggestion to ad VB and C# support and thought I'd just add a bit of indenting to keep up with the excellent http://www.wangz.net/gsqlparser/sqlpp/sqlformat.htm. I've scrapped the .Net web page in favour of the old PHP one. It, hopefully, shows when there is something it has to prettify that it doesn't like or understand.
Keep the suggestions rolling in, remembering that it is really just a tool for rendering code snippets nicely in HTML.

Subject: there are certain compromises...
Posted by: Phil Factor (view profile)
Posted on: Thursday, May 31, 2007 at 6:33 AM
Message: Please remember that this is a stored procedure and will take a long time to render a huge mass of code. It's been tested on some pretty big stored procedures but the web interface will time-out well before it has finished. I usually break up code into sections to process it. If you use the stored procedure and leave it to work its way through, it will obligingly work its way through the code but it will give you time for a sip or two of coffee whilst you wait

Subject: Bug in C# Version
Posted by: Darko (not signed in)
Posted on: Friday, June 01, 2007 at 5:15 AM
Message: Hi guys,

I've just tried pasting some of my curren code and the C# pretifier breaks on the syntax:
if ( ! string.IsNullOrEmpty(filter) )

and also on:
Outlook.MAPIFolder conferenceRoomFolder = oRootFolder.Folders[conferenceRooms[i]];

besides that all works fine ;) and it does look great!!!

Subject: Bug in C# Version
Posted by: Darko (not signed in)
Posted on: Friday, June 01, 2007 at 5:38 AM
Message: Hi guys,

I've just tried pasting some of my curren code and the C# pretifier breaks on the syntax:
if ( ! string.IsNullOrEmpty(filter) )

and also on:
Outlook.MAPIFolder conferenceRoomFolder = oRootFolder.Folders[conferenceRooms[i]];

besides that all works fine ;) and it does look great!!!

Subject: Re: Bug in C# Version
Posted by: Phil Factor (view profile)
Posted on: Tuesday, June 05, 2007 at 9:02 AM
Message: Hopefully, a fix should be in place by next week! sorry about that, more missing symbols from the list I was given!

Subject: SQL Formatter
Posted by: GuidoMarcel (not signed in)
Posted on: Monday, June 18, 2007 at 4:13 AM
Message: Here is another SQL Formatter:
http://www.sqlinform.com

Subject: sql
Posted by: Sabri (view profile)
Posted on: Friday, June 27, 2008 at 3:35 AM
Message: I don't know start sql beginning!!
raed all book but un.
pae geif me way one be one.
thank your

 










Phil Factor
Finding Stuff in SQL Server Database DDL
 You'd have thought that nothing would be easier than using SQL Server Management Studio (SSMS) for searching... Read more...



 View the blog
Implementing User-Defined Hierarchies in SQL Server Analysis Services
 To be able to drill into multidimensional cube data at several levels, you must implement all of the... Read more...

Using the Filtering API with the SQL Comparison SDK
 Red Gate's SQL Comparison SDK provides a means to compare and synchronize database schemas and data... Read more...

SQL Toolbelt 2008: Predominantly an Engineering Task
 The conversion of the Red-Gate tools to be compatible with SQL Server 2008 might not seem, on first... Read more...

SQL Response: The dim sum interview
 Richard Morris met David and Nigel of the SQL Response team, in a dim sum Restaurant in Cambridge. They... Read more...

Writing Efficient SQL: Set-Based Speed Phreakery
 Phil Factor's SQL Speed Phreak challenge is an event where coders battle to produce the fastest code to... Read more...

Beginning SQL Server 2005 Reporting Services Part 1
 Steve Joubert begins an in-depth tour of SQL Server 2005 Reporting Services with a step-by-step guide... Read more...

Ten Common Database Design Mistakes
 Database design and implementation is the cornerstone of any data centric project (read 99.9% of... Read more...

Beginning SQL Server 2005 Reporting Services Part 2
 Continuing his in-depth tour of SQL Server 2005 Reporting Services, Steve Joubert demonstrates the most... Read more...

SQL Server Full Text Search Language Features
 SQL Full-text Search (SQL FTS) is an optional component of SQL Server 7 and later, which allows fast... Read more...

Creating CSV Files Using BCP and Stored Procedures
 Nigel Rivett demonstrates some core techniques for extracting SQL Server data into CSV files, focussing... Read more...

Over 150,000 Microsoft professionals subscribe to the Simple-Talk technical journal. Join today, it's fast, simple, free and secure.

Join Simple Talk