Robert Chipperfield is a member of the development team at Red Gate Software. In this article, he describes the key features of Red Gate's new database documentation tool, SQL Doc, and provides insight into the development process and some of the design philosophies that underpin the tool.
In a recent article describing Red Gate's SQL Refactor tool, András Belokosztolszki, observed that it is rare for a developer or DBA to work with a database that they also designed. It is much more common that they are required to work on established databases that have been built by a previous developer or a colleague.
What is largely true for developers and administrators is almost by definition true for support staff. One of the first tasks for anyone taking over support duties on a given database is to attempt to gain at least some understanding of its functionality – how it is structured, where different types of data are stored, and how the different entities relate to each other. On a large schema, this can be a formidable challenge.
In the past, I have spent many hours working through schemas and constructing relationship diagrams by hand, then having to update these whenever changes are made to the database later on. Whilst tools such as Management Studio and Query Analyser are able to provide some information about the structure of the database, they are geared more towards working with individual objects at the fine-grained level, rather than gaining a fast overview of the whole database.
This is the gap we hoped to bridge when we developed Red Gate's new automated database documentation tool, SQL Doc. With it, we sought to provide developers and DBAs alike with a means of quickly documenting and understanding the structure of their databases without the need to keep hand-crafted documentation up-to-date.
Self-documenting your database
SQL Doc is designed to make it as simple as possible to generate a comprehensive documentation set for your databases. Of course, the usefulness of this documentation generated will be greatly enhanced if the database itself provides some meaningful indicators of the role and intent of its various component parts.
A large part of designing an intuitive database is the use of consistent and logical naming conventions for you tables (e.g. "CustomerAccounts" instead of "cas") and keys ("pk" prefix for primary keys "fk" for foreign keys, and so on). However, there is a lot more that you can do. SQL Server 2005 introduced the concept of schemas – effectively allowing you to segment your database into different logical groups. In the classic AdventureWorks database, all the HR data is stored in HumanResources schema. This is often seen as a security feature – it allows sections of the database to be quickly locked down rather than having to set permissions on each object individually, not to mention solving the problem of what to do when the person who owns objects in a database leaves an organisation – but it is also very useful as a descriptive feature
One of the most useful things that a developer /DBA can do in helping provide useful documentation is to make use of SQL Server's built in extended properties. These are essentially descriptive attributes that can be added to database objects by the designer. For example:
-- we add the extended property to provide a description to the
-- dbo.Customer.InsertionDate column
'the date at which the row was created',
'user', 'dbo', 'table', 'Customer',
Unfortunately, the standard toolset provides no easy way to work with these extended properties – in Management Studio, viewing the extended properties of a column on a table means drilling down through the object explorer to the column level, and then using the properties dialog to select the "Extended Properties" page. Consequently, extended properties tend to be an underused feature of SQL Server, which is a pity given that the number of objects to which they can be attached has been considerably extended in SQL 2005. It was important to us to include full support for extended properties in SQL Doc.
Developing SQL Doc
Some of the existing documentation tools tend to be little more than documentation editors, with the ability to import from a database schema "tacked on". With SQL Doc, we wanted to achieve something different. First and foremost, it had to be fast and simple to use; in fact, once installed, we wanted to transform the process of generating documentation from being a substantial task to one that could be done in seconds, whenever required.
The project was split into three components – the UI, an "engine", and the output renderer. Keeping the UI separate from the main application logic is fairly standard practice for most large applications, but the second split – between the engine and the renderer – is perhaps less common and more interesting.
Our aim was to ensure that all output format-specific code was entirely separate from the more general code for, say, determining which objects should be documented, and calculating the dependencies between objects. This approach proved its worth when developing version 1.1 off the tool – we received a large number of requests for compiled help (CHM) format output, and could thankfully achieve this moderately easily by changing only the rendering code.
The engine assembly builds on other Red Gate technologies with which you may already be familiar – SQL Compare for providing an abstraction of the database schema, and SQL Dependency Tracker for calculating most of the dependencies. It combines these with further logic for features such as the extended property manipulation.
The renderer itself is modular and extensible, making it easy to support any database object types that may appear in future versions of SQL Server. For example, each type of database object is rendered by a corresponding class, implementing a standard "renderer" interface. Given the type of the database object, an appropriate renderer is retrieved, which takes care of producing the final output for that object. Common page elements are rendered by methods in a further class, ensuring consistent styling of features such as the breadcrumbs and quick links.
Early in development, it became clear that we were seeing severe performance problems when documenting large databases – demonstrated perfectly by our tester who decided to throw a 100,000 schema database at SQL Doc. It sat there overnight, churning away, and then crashed and burned beautifully.
At this point I needed to investigate why things weren't scaling as well as they needed to – time to bring out the profiler. A few bottlenecks were immediately obvious: generating the left-hand navigation tree was expensive, and was being done for every single output page. The original thought behind this was that the relative page links depended on the location of the output file, and so each page would have a different set of links. However, with a little thought, we ended up caching these navigation sections, keyed on the directory of the output file – this meant that once it had been generated for one file in a given folder, all future files would just re-use the cached version.
Working on a few more similar problems for just a couple of days resulted in huge improvements in performance – we ended up being able to profile the same database in less than fifteen minutes.
In the following sections I describe some of the key features of the tool and provide insight into some of the design decisions behind them.
Usability: a familiar object explorer
If you've tried SQL Doc, you'll be familiar with the user interface – an object hierarchy on the left, and a preview on the right. You'll probably also notice that the hierarchy is rather similar to that used in Management Studio.
We have had requests from some people for alternative categorisations of objects, such as grouping by schema, or a Query Analyser style flat view, but we've resisted "improving" it, because it's currently a good match for the world that most of our users work in – that of Management Studio. Sure, it might not be The Best Possible Organisation Ever, but it's familiar.
If a group of people who had never seen or used a computer before were asked to design the best possible input mechanism, given all the technology available to us now, the chances are that they would come up with something rather different to the keyboard and mouse that is currently ubiquitous. Their solution might be a technically better way of doing it, but familiarity counts for a lot. If you don't believe me, try switching between a QWERTY and Dvorak keyboard on a regular basis.
Accessibility: HTML documentation
Our documentation is generated in HTML format. There were many reasons for this, some of which are worth noting here:
- Once generated, the documentation can easily be disseminated. It can be uploaded to a company web server, and instantly shared among a team of people on one project, or viewed from any network connected machine.
- No special viewer is required: if you need to ship a database to a contractor, shipping the documentation alongside it is simple. There is no need for the other party to install any software other than a web browser – not exactly uncommon these days!
- Simple and intuitive navigation throughout the schema: as well as the hierarchical navigation system provided, navigation between related objects is also simple. Foreign keys on table columns link to the tables they refer to, user defined types similarly, and the new SQL Server 2005 CLR-based types instantly show the assemblies they were deployed from.
Everyone nowadays is familiar to some extent with the wonders that can be achieved with technologies such as AJAX and ASP.NET – not to mention the up-and-coming Atlas.
This choice did impose some limitations, some of which are a little irritating – like the tree view being unable to maintain state when navigating between pages on a "no frames" document – but overall, I feel this was definitely the correct choice.
Exploiting extended properties
In SQL Doc, we have integrated Extended Properties – in particular MS_Description – into the core of the documentation, making it easy to add human-readable descriptions to your database. Objects show their description in the page title, and these can be edited with the click of a button while previewing the documentation within SQL Doc. No more need to search Books Online for the correct type0, type1 and type2 parameters just to edit the description for a column!
The team-based development described earlier can be further enhanced with the additional features found in SQL Doc Pro – the command line interface allows documentation to be generated on a schedule, ensuring that developers always see the latest version, with no effort required on their part.
Along with accessibility and usability, we also considered customisability. Whilst the default style tries to be as clean and clear as possible, some companies will have house styles for documentation, and the is especially likely to be of concern when shipping documentation to outside parties.
Our solution here was to use the features made available to us by CSS. This means that by editing a single file, the look and feel of the entire documentation can be quickly, simply, and consistently updated. For example, if you would prefer the list of object dependencies to be a vertical list rather than horizontal, simply change the appropriate CSS style to "display: block;", as shown below:
Using CSS, we were able to provide a high degree customisability of the output style to those requiring it, whilst ensuring that the product remained easy to use.
Database documentation has always been an immensely useful resource to developers and administrators alike, but unfortunately keeping this updated has traditionally been a time-consuming and error-prone process, and as such one that has often been neglected due to time constraints.
Automated documentation tools such as SQL Doc aim to remove this barrier, increasing productivity and reducing administrative burden.