Producing JSON Documents from SQL Server queries via TSQL

Although SQL Server supports XML well, XML's little cousin JSON gets no love. This is frustrating now that JSON is in so much demand. Maybe, Phil suggests, it is possible to leverage all that XML, and XPath, goodness in SQL Server to produce JSON in a versatile way from SQL Queries? Yes, it so happens that there are plenty of alternatives.

If you need to provide JSON-based results from the database, you are faced with a problem, because SQL Server doesn’t currently have native JSON integration. You would normally craft the SQL to do this ‘by hand’ for each specific job, using one of a variety of techniques. However, what if the developers need a large variety of results, maybe even from SQL created by the application rather than the database? That’s cool, though there will be a performance hit in using a generic solution that will produce JSON from any SQL. Until SQL Server provides ‘native’ JSON support it will be rather slow, but not difficult.

In this article, I’ll be introducing a few ideas about providing a generic way to produce JSON, and other types of data document, from SQL.

The simplest strategy for providing a general way to output JSON is to convert your SQL result to XML by using the FOR XML syntax of the SELECT statement. Then you parse the XML, which has obligingly converted all the values into string form, and which gives you the names, their position in the hierarchy and even the datatype. When I say ‘simple’ I mean simpler than any other alternative. The advantage of using XML is that you can make use of the versatility of the FOR XML PATH syntax, or the XPath SQL Server extensions, to specify the hierarchy. If you have a generic way to convert from any XML document, whether derived from SQL or not, to JSON, then it becomes even more useful!
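To make the starting point concrete, here is the sort of FOR XML query I mean; any SELECT can be dressed up this way (this one just lists the databases on the server):

```sql
-- FOR XML PATH('row') shapes each result row as a <row> element;
-- ROOT('root') wraps the whole result in a single <root> element.
SELECT name, database_id, create_date
FROM sys.databases
FOR XML PATH('row'), ROOT('root');
```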

If you’re not too concerned with performance, you can experiment with some functions I’ve published over the years for dealing with data documents. For handling the type of hierarchical information that is transferred via JSON, CSV or XML, I’ve published a variety of stored procedures and functions that all use a rather crude hierarchy table that is sufficient for the purpose. There is already a function that produces a JSON document from such a table. All we have to do is to add a new Table-Valued function that produces a hierarchy from XML. We then just create the SQL we want, get it to create XML and the rest is simple.

Back to AdventureWorks 2008, and here is an example of its use.
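Something along these lines (dbo.Hierarchy, dbo.ParseXML and dbo.ToJSON are the objects described later in this article and supplied in the attached code; the exact signatures and plumbing shown here are illustrative):

```sql
-- Query AdventureWorks, turn the result into XML, parse that into a
-- hierarchy table, and render the hierarchy as JSON.
-- NB: the signatures of dbo.ParseXML and dbo.ToJSON are assumptions here.
DECLARE @XML XML =
  (SELECT TOP 5 p.FirstName, p.LastName, e.JobTitle
   FROM Person.Person AS p
     INNER JOIN HumanResources.Employee AS e
       ON e.BusinessEntityID = p.BusinessEntityID
   FOR XML PATH('row'), ROOT('root'), TYPE);

DECLARE @hierarchy dbo.Hierarchy;
INSERT INTO @hierarchy SELECT * FROM dbo.ParseXML(@XML);
SELECT dbo.ToJSON(@hierarchy);
```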

Well, that’s it. You just give the function any working SQL query, adding FOR XML PATH, ROOT to turn it into XML, and you’ll get JSON, reflecting the structures you specify in the FOR XML query. As in much of life, there is some detail to attend to, which we’ll go into later. Firstly, this is fine for the small stuff, but it requires a lot of string manipulation, and SQL isn’t designed for that. If you just want to churn out JSON, we’ll also show you a ‘quick’ version that will dish out 20,000 rows of ‘flattened’ JSON in around ten to fifteen seconds. The limitations are that you lose any attributes, and you have to use the simple default ‘root’ and ‘row’ structure of the XML; but none of that will matter for the common jobs, since you don’t need attributes or nesting, and if you do simple queries with FOR XML PATH, ROOT you don’t get ’em. So here we go with an example.
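For instance, with the quick ‘flattened’ version (FlattenedJSON is the function we develop below; the signature here, one XML parameter returning NVARCHAR(MAX), is an assumption):

```sql
-- Feed any simple FOR XML PATH, ROOT result to the quick converter.
SELECT dbo.FlattenedJSON(
  (SELECT TOP 20 CustomerID, AccountNumber, ModifiedDate
   FROM Sales.Customer
   FOR XML PATH('row'), ROOT('root'), TYPE));
```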

And there you have your JSON ‘document’.

JSON representation of tabular results.

Although there is no ANSI SQL way of representing results from SQL expressions as JSON, SELECT statements usually produce results that are best represented by ‘flattened’ or ‘raw’ JSON lists, like this
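Here is the general shape (names and values are illustrative):

```json
[
  { "CustomerID" : 1, "AccountNumber" : "AW00000001", "TerritoryID" : 1 },
  { "CustomerID" : 2, "AccountNumber" : "AW00000002", "TerritoryID" : 1 },
  { "CustomerID" : 3, "AccountNumber" : "AW00000003", "TerritoryID" : 4 }
]
```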

I just did the top twenty records so we could see them in the article, but this should be good for 20,000 in a batch.

But this is not the only way of doing it. We can take the opportunity that JSON offers to return a nested hierarchy of data
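For example (illustrative):

```json
{ "customer" : {
    "CustomerID" : 1,
    "orders" : [
      { "SalesOrderID" : 43860, "TotalDue" : 14603.74 },
      { "SalesOrderID" : 44501, "TotalDue" : 26128.87 }
    ]
} }
```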

Or even use a device similar to CSV
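That is, a list of column names followed by rows of bare values (illustrative):

```json
{ "columns" : ["CustomerID", "AccountNumber", "TerritoryID"],
  "rows" : [
    [1, "AW00000001", 1],
    [2, "AW00000002", 1],
    [3, "AW00000003", 4]
  ]
}
```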

Once they are suitably compressed, the obvious space-savings tend to vanish, so the choice of format comes down to personal preference. Also, the ‘flattened’ format can be formatted with indents to make it easier to read.

Getting your JSON out fast.

Because SQL produces, from the ‘document’ perspective, flat rows of values, we can cut some corners to get JSON out fast. Firstly, we’ll insist that the root is called ‘root’ and each row is called ‘row’. All we have to worry about then is rendering every row in JSON format, with the right separators and brackets to represent a JSON list of objects, each consisting of name/value pairs.

Here is a simple version
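Something like this, a condensed sketch of the approach rather than the full routine (it quotes every value as a string and does no escaping):

```sql
-- Build up some sample XML in the default root/row shape.
DECLARE @XML XML =
  (SELECT TOP 3 name, database_id
   FROM sys.databases
   FOR XML PATH('row'), ROOT('root'), TYPE);

-- Shred /root/row, then build each row's {"name":"value",...} object by
-- walking its child elements with local-name() and text().
-- The TYPE/.value trick prevents entitization of <, > and & in the output.
SELECT '[' + STUFF(
  (SELECT ',{' + STUFF(
     (SELECT ',"' + cols.c.value('local-name(.)', 'NVARCHAR(100)')
             + '":"' + cols.c.value('.', 'NVARCHAR(MAX)') + '"'
      FROM rows.r.nodes('*') AS cols(c)
      FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)'), 1, 1, '') + '}'
   FROM @XML.nodes('/root/row') AS rows(r)
   FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)'), 1, 1, '') + ']'
  AS [JSON];
```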

Which would give you this

Hmm. This seems OK. I’ve tried several methods but this seems to work best. I don’t profess any expertise in SQL XML querying so please let me know if there is a faster way of doing this!

We now wrap it into a function and take care of properly escaping the various whitespace and special characters.
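The escaping helper might look something like this minimal sketch (the JSONEscaped function in the attached code is the full version; this one handles only the most common cases, and omits, for example, the form-feed):

```sql
-- Escape a string for inclusion in a JSON document.
-- Backslashes must be doubled FIRST, before any escapes are introduced.
CREATE FUNCTION dbo.JSONEscaped (@Unescaped NVARCHAR(MAX))
RETURNS NVARCHAR(MAX)
AS
BEGIN
  SELECT @Unescaped =
    REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(@Unescaped,
      '\', '\\'),          -- reverse solidus
      '"', '\"'),          -- double quote
      CHAR(8),  '\b'),     -- backspace
      CHAR(9),  '\t'),     -- tab
      CHAR(10), '\n'),     -- line feed
      CHAR(13), '\r');     -- carriage return
  RETURN @Unescaped;
END;
```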

You can alter the routine to give you different formats. Here is a bare-bones version that gives you a document rather similar to the old CSV format that I described earlier in the article.

 

More Complex JSON

Sometimes, you don’t want flattened JSON. Sometimes you have attributes, for example, or a more complex hierarchy. The FOR XML PATH syntax can produce truly startling XML. Just to test our routine out, though, we’ll first take a classic example from the JSON documentation.
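This is the json.org ‘glossary’ example (slightly cut down), pushed through the same plumbing as before. As previously, dbo.Hierarchy, dbo.ParseXML and dbo.ToJSON are the objects from the attached code, and the signatures shown are illustrative:

```sql
DECLARE @XML XML = N'
<glossary><title>example glossary</title>
 <GlossDiv><title>S</title>
  <GlossList>
   <GlossEntry ID="SGML" SortAs="SGML">
    <GlossTerm>Standard Generalized Markup Language</GlossTerm>
    <Acronym>SGML</Acronym>
    <Abbrev>ISO 8879:1986</Abbrev>
    <GlossSee OtherTerm="markup" />
   </GlossEntry>
  </GlossList>
 </GlossDiv>
</glossary>';

DECLARE @hierarchy dbo.Hierarchy;
INSERT INTO @hierarchy SELECT * FROM dbo.ParseXML(@XML);
SELECT dbo.ToJSON(@hierarchy);
```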

Which would give you

What we’ve done here is to convert the result into an intermediary SQL table which is defined by a hierarchy type. Data-interchange formats such as JSON and XML all represent object data as hierarchies. We then pass it to a separate function, ToJSON, to render it as JSON.

The first trick is to represent it as an adjacency-list hierarchy in a table. This adjacency list is really the database equivalent of any of the nested data structures that are used for the interchange of serialized information with the application, and can be used to create XML, CSV, OS X property lists, Python nested structures or YAML as easily as JSON.

Adjacency-list tables have the same structure whatever the data in them. This means that you can define a single table-valued type and pass data structures around between stored procedures. However, they are best held at arm’s length from the data, since they are not relational tables, but something more like the dreaded EAV (Entity-Attribute-Value) tables. Converting the data from its hierarchical table form will be different for each application, but is easy with a CTE.
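One plausible shape for such a type, to make the idea concrete (the attached code defines the real one; the column names here are illustrative):

```sql
-- A table type for passing hierarchies between routines.
-- Each row is one node: an element, an attribute, or a name/value pair.
CREATE TYPE dbo.Hierarchy AS TABLE
(
  element_id  INT PRIMARY KEY,        -- surrogate key for the node
  parent_ID   INT,                    -- adjacency list: NULL for the root
  [Object_ID] INT,                    -- groups name/value pairs into objects
  NAME        NVARCHAR(2000),         -- the element or attribute name
  StringValue NVARCHAR(MAX) NOT NULL, -- the value, always held as a string
  ValueType   VARCHAR(10) NOT NULL    -- e.g. 'string', 'int', 'object', 'array'
);
```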

It is unlikely that we’d have to produce JSON from raw XML as we did in the first example, but you might be faced with a SQL query like this, embedded in this SQL harness
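By way of illustration, the sort of nested FOR XML PATH query I mean might look like this (AdventureWorks again; the exact shape is just an example):

```sql
-- Each customer becomes a <customer id="..."> element containing
-- an <order id="..."><total>...</total></order> element per order.
SELECT TOP 3 Customer.CustomerID AS '@id',
  (SELECT TOP 2 SalesOrderID AS 'order/@id',
                TotalDue     AS 'order/total'
   FROM Sales.SalesOrderHeader
   WHERE CustomerID = Customer.CustomerID
   FOR XML PATH(''), TYPE)
FROM Sales.Customer AS Customer
FOR XML PATH('customer'), ROOT('customers');
```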

 

… which gives this:

The code to do all this is attached to this article, but it is worth going through the ParseXML function, which takes an XML fragment or document and creates a SQL hierarchy table from it. It can be used to inspect the contents of XML documents, of course, but it is primarily for interchange, and for getting the contents of an XML table when you don’t know the structure up-front well enough to use XPath, or if, like me, your brain isn’t quite big enough to understand the full expanse of XPath.

Basically, the routine starts by saving the base nodes (the root node, if it is there) in a temporary table along with their names, and then repeatedly extracts any elements and attributes associated with them, and all their nested nodes, until it has done them all. It then works out what sort of datatype each one is. It does all this simply by exploring the XML structure without prior knowledge, using XPath wildcard syntax, and then using the XPath function ‘local-name()’ to determine the name of each node. It uses ‘text()[1]’ to get the value, if any, associated with the node, and also finds any attributes associated with the node. It does not distinguish between an attribute and an element, since no other document format seems to care. Although I got a lot of inspiration for this routine from the XMLTable routine of the great Jacob Sebastian (whose book on XSD should be on your bedside table), I didn’t succumb to the temptation to use a recursive CTE. No sir, iteration is good enough for me, and a lot quicker, it turns out! Jacob was the pioneer, and still wins the elegance prize for XMLTable, though.

 

Now here is the code for the function that parses XML to fill a hierarchy table
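Before the full listing, here is a much-condensed sketch of the iterative core, to show the idea (it handles elements only; the real function also collects attributes via the ‘@*’ wildcard, assigns object groupings, and works out datatypes):

```sql
-- Breadth-first walk of an XML document into an adjacency list,
-- one level per pass, with no prior knowledge of the structure.
DECLARE @XML XML = N'<root><row><a>1</a><b>2</b></row></root>';
DECLARE @depth INT = 1;

DECLARE @nodes TABLE
  (id INT IDENTITY PRIMARY KEY, parent_id INT, depth INT,
   name NVARCHAR(200), value NVARCHAR(MAX), node XML);

-- Seed with the base (root) node(s).
INSERT INTO @nodes (parent_id, depth, name, value, node)
SELECT NULL, 1,
       x.value('local-name(.)', 'NVARCHAR(200)'),
       x.value('text()[1]', 'NVARCHAR(MAX)'),
       x.query('.')
FROM @XML.nodes('/*') AS n(x);

-- Repeatedly pull in the children of the previous pass.
WHILE 1 = 1
BEGIN
  INSERT INTO @nodes (parent_id, depth, name, value, node)
  SELECT p.id, @depth + 1,
         x.value('local-name(.)', 'NVARCHAR(200)'),
         x.value('text()[1]', 'NVARCHAR(MAX)'),
         x.query('.')
  FROM @nodes AS p
    CROSS APPLY p.node.nodes('/*/*') AS c(x)
  WHERE p.depth = @depth;
  IF @@ROWCOUNT = 0 BREAK;
  SET @depth += 1;
END;

SELECT id, parent_id, depth, name, value FROM @nodes;
```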

You can, of course, use a routine like this to get XML into a relational format, but it is much easier to use the XML value() method together with the nodes() method to do this sort of work for you.
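For comparison, here is the conventional nodes()/value() way of shredding XML whose structure you already know (illustrative data):

```sql
DECLARE @XML XML = N'<root>
  <row><CustomerID>1</CustomerID><AccountNumber>AW00000001</AccountNumber></row>
  <row><CustomerID>2</CustomerID><AccountNumber>AW00000002</AccountNumber></row>
</root>';

-- nodes() produces one row per <row> element; value() extracts and
-- converts each named child to the required SQL datatype.
SELECT x.value('(CustomerID)[1]', 'INT')             AS CustomerID,
       x.value('(AccountNumber)[1]', 'VARCHAR(10)')  AS AccountNumber
FROM @XML.nodes('/root/row') AS r(x);
```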

Which would give:

Conclusions

If you can do so, it is nowadays often much easier for the application developer to deal with XML or JSON than with a SQL result, particularly if the data required is hierarchical in nature. If the volume of data isn’t great, then it should be possible to deliver this quickly without undue CPU burden on the server, thereby avoiding a separate conversion process within the application. We already do this with XML, so why not JSON, or any other format required?

SQL Server isn’t geared up for complex string manipulation, and isn’t therefore an obvious place to engage in this sort of activity. We need native JSON integration in relational databases just as we already have XML integration. After all, Sybase and PostgreSQL already have it, so why not SQL Server? Occasionally, the data as seen from the application perspective just isn’t table-based, and it is much easier, and sometimes quicker, to deliver what the application requires in a format it can consume. Whilst we await native JSON integration, we must do the best we can with the tools that are available.

Further Reading


Downloads

  • Anonymous

    Complex
    Or use a document database 🙂

  • Anonymous

    Sometimes .NET is better than TSQL
    A .NET stored proc running in SQL Server could convert the XML to JSON a lot better than TSQL can.

    However, given the option, I would just return the XML to a service written in .NET that then transforms it to JSON and passes it on to the application.

    (The connection between the .NET DataReader and SQL Server does not send XML as text; it uses a much more compact format, only converting it to text in the SqlDataReader class)

  • Paulo.Morgado

    Looks like it’s time for SQL to add JSON
    Looks like it’s time for SQL to add JSON as type like it has XML.

  • Robert Keyes

    T-SQL needs a JSON function
    Thanks for the article! Very interesting and useful. JSON is a major player now. C’mon, T-SQL – time to play some catch-up.

  • Phil Factor

    Re: Sometimes .NET is better than TSQL
    Sure. SQL Server is pretty good at providing XML results in a variety of forms from relational data, but needs a fast, reliable way of converting this to JSON. I would go for a clean and simple solution that leverages all that XPath goodness. Doing this in SQL is rather a desperate act. One could use a CLR routine or an external .NET service to do the translation, but for several purposes it is far cleaner, and more portable, to have it built into the SQL Standard! In the meantime, JSON.NET, the best .NET JSON library I reckon, already does a two-way conversion between XML and JSON,

  • Paulo.Morgado

    Re: Sometimes .NET is better than TSQL
    I wouldn’t say converting between XML and JSON.

    Just produce JSON with a FOR JSON clause.

    Of course, having a JSON type would be great, but I don’t think it’s viable to convert automatically. If the developer needs it, she/he has to convert it.

  • Ron Hudson

    Finally, an end to ORMs?
    Phil,

    I have been waiting over 6 years now for someone else to suggest that we don’t always need over-engineered spaghetti ORM code for "business logic." Unfortunately, many of these young Computer Science guys seem hell-bent on keeping us locked in the obsolete OOD world of the 1980’s for business logic (I suspect it’s because those are the languages their professors are most-knowledgeable about). Based on my 20 years of MS/web dev experience, relational platforms like MS SQL are a much-better place to put application and business logic in most-cases. I think that these guys who only know how to implement logic in C# or Java have been holding us back for the last 5 years with their OOD hoohaa. I think that the future of web apps is AJAX to sprocs to JSON..no need for wasting time and money on "pie-in-the-sky" OO middle tiers. Thank you Phil for being the first person I’ve seen acknowledge the feasibility of this approach. I think that Computer Science programs need to include Javascript, AJAX, JSON, and TSQL if their graduates are going to help move the industry forward. The Java and C# OOD focus is holding everything back at this point IMO…maybe it’s time to move on from OOD fantasy designs.

  • piers7

    AAAaarrrrrrrrrghhhh my eyes!
    *cough* CLR TVF *cough*

  • acmarsden

    Brilliant…but…
    The schema I have to get to wraps multiple child elements in square brackets

    "distribution": [
    {
    "downloadURL": "https://data.some.org/the/actual/catalog/datasets/1/resources/1.csv",
    "format": "text/csv",
    },
    {

    "downloadURL": "",
    "format": "",
    }
    ],

    My output doesn’t have that – have I missed something?

    Many thanks

  • sudpank

    Possible Bug in FlattenedJSON & ToJSON scripts
    Hi Phil,

    Thanks for the great article on JSON from T-SQL. I ran your query for the example that you have provided just to test it. Your FlattenedJSON can’t process more than 250 rows and ToJSON can’t handle more than 133 records. If you reduce the number of columns to 2, it can only produce 770 JSON records. Any idea what might be causing this? I ran it on SQL Server 2008 R2 (SP2) Express Edition. Where is the limitation coming from?

  • Phil Factor

    @Sudpank
    Re: Possible Bug in FlattenedJSON & ToJSON scripts
    I’ll take a look. Something seems to have crawled in because I tested it to more rows than that!

  • Phil Factor

    @Sudpank
    Hmm. I’ve just retested it and it works fine up to 600 rows (it takes a long time, so I didn’t test it for more). This is a bit of a head-scratcher. The only thing I can think of is that you’ve forgotten that SSMS only displays a limited length of string of 8092 characters. If this is the case, you can export the data to a flat file which will not be truncated. To do this:

    Right-click the Database and then click Tasks -> Export Data
    Select your Data Source and choose "Flat File Destination" for the Destination type. Then, choose a file name for the output.
    On the menu item "Specify Table Copy or Query", choose "Write a query to specify the data to transfer". Then paste in the query
    The rest should be pretty obvious. This will output the file to text and you can open it in notepad or whatever text editor you choose.

    When I do this sort of work I use a stored procedure of my own to save JSON files to disk that I’ve already published on the site, called spSaveTextToFile, which you can get here: https://www.simple-talk.com/sql/t-sql-programming/the-tsql-of-text-files/

  • jocapc

    Native JSON implementation
    Note that SQL Server 2016 will have native functions/clauses for parsing JSON (OPENJSON) and producing JSON (FOR JSON)

  • cgierlack

    Escaping double quotes
    Hi Phil,

    I noticed in the JSONEscaped function it is replacing \" with ". For my data coming through I may have text with double quotes in it, and when I try to escape it, this removes my escape; and if I leave it alone, the REST service accepting it will throw an error (rightfully so, since the JSON is then malformed). I was going to remove that and allow " to be escaped as \" but wanted to see if there is a particular reason why, or if there is another way to escape double quotes that I’m missing.