Phil Factor

Producing JSON Documents from SQL Server queries via TSQL

06 May 2014

Although SQL Server supports XML well, XML's little cousin JSON gets no love. This is frustrating now that JSON is in so much demand. Maybe, Phil suggests, it is possible to leverage all that XML, and XPath, goodness in SQL Server to produce JSON in a versatile way from SQL Queries? Yes, it so happens that there are plenty of alternatives.

If you need to provide JSON-based results from the database, you face a problem, because SQL Server doesn’t currently have native JSON integration. You would normally craft the SQL to do this ‘by hand’ for the specific job, using one of a variety of techniques. But what if the developers need a large variety of results, maybe even from SQL created by the application rather than the database? That’s possible too, though there will be a performance hit in using a generic solution that will produce JSON from any SQL. Until SQL Server provides ‘native’ JSON support it will be rather slow, but not difficult.

In this article, I’ll introduce a few ideas about providing a generic way to produce JSON, and other types of data document, from SQL.

The simplest strategy for a general way of outputting JSON is to convert your SQL result to XML by using the FOR XML syntax of the SELECT statement. Then you parse the XML, which has obligingly converted all the values into string form; this gives you the names, their position in the hierarchy, and even the datatype. When I say ‘simple’, I mean simpler than any other alternative. The advantage of using XML is that you can make use of the versatility of the WITH XML PATH syntax, or the XPath SQL Server extensions, to specify the hierarchy. If you have a generic way to convert from any XML document, whether derived from SQL or not, to JSON, then it becomes even more useful!

If you’re not too concerned with performance, you can experiment with some functions I’ve published over the years for dealing with data documents. For handling the type of hierarchical information that is transferred via JSON, CSV or XML, I’ve published a variety of stored procedures and functions that all use a rather crude hierarchy table that is sufficient for the purpose. There is already a function that produces a JSON document from such a table. All we have to do is to add a new Table-Valued function that produces a hierarchy from XML. We then just create the SQL we want, get it to create XML and the rest is simple.

Back to AdventureWorks 2008, and here is an example of its use.

DECLARE @MyHierarchy Hierarchy -- to pass the hierarchy table around
insert into @MyHierarchy SELECT * from dbo.ParseXML(
---your SQL Goes here --->
  (SELECT top 5 title,firstname,middleName,Lastname,suffix,emailAddress,Phone,
       AddressLine1,AddressLine2,City,
        postalcode,SP.Name,SP.CountryRegionCode as ContactString
    from AdventureWorks.person.contact c
      inner join AdventureWorks.Sales.Individual  I ON C.ContactID = I.ContactID
      inner join AdventureWorks.Sales.CustomerAddress AS CA ON CA.CustomerID = I.CustomerID
      inner join AdventureWorks.Person.Address AS A ON A.AddressID = CA.AddressID
      inner join AdventureWorks.person.StateProvince SP ON A.StateProvinceID=SP.StateProvinceID
---You add this magic spell, making it XML, and giving a name for the 'list' of rows and the root
  for XML path ('customer'), root('customers')
-- end of SQL
  )
)
SELECT dbo.ToJSON(@MyHierarchy)

Well, that’s it. You just pass the function any working SQL query, adding FOR XML PATH, ROOT to turn the result into XML, and you get JSON back, reflecting the structures you specify in the FOR XML query. As in much of life, there is some detail to attend to, which we’ll go into later. Firstly, this is fine for the small stuff, but it requires a lot of string manipulation, and SQL isn’t designed for that. If you just want to churn out JSON, we’ll also show you a ‘quick’ version that will dish out 20,000 rows of ‘flattened’ JSON in around ten to fifteen seconds. The limitations are that you lose any attributes, and you have to use the simple default ‘root’ and ‘row’ structure of the XML; but none of that matters for the common jobs, since you don’t need attributes or nesting, and if you do simple queries with FOR XML PATH, ROOT you don’t get ‘em anyway. So here we go with an example:

SELECT dbo.FlattenedJSON(
  (SELECT top 20 o.SalesOrderID, o.OrderDate, od.ProductID,
       p.Name, od.OrderQty, od.UnitPrice, od.LineTotal
   FROM AdventureWorks.Sales.SalesOrderHeader AS o
     JOIN AdventureWorks.Sales.SalesOrderDetail AS od
       ON o.SalesOrderID = od.SalesOrderID
     JOIN AdventureWorks.Production.Product AS p
       ON od.ProductID = p.ProductID
   WHERE p.Name like 'Road%'
   FOR XML path, root)
  )

And there you have your JSON ‘document’.

JSON representation of tabular results.

Although there is no ANSI SQL way of representing results from SQL expressions as JSON, SELECT statements usually produce results that are best represented by ‘flattened’ or ‘raw’ JSON lists, like this:

[
    {"Name":"Road-650 Red, 44","OrderQty":"1","UnitPrice":"419.4589"},
    {"Name":"Road-450 Red, 52","OrderQty":"1","UnitPrice":"874.7940"},
    {"Name":"Road-650 Red, 52","OrderQty":"3","UnitPrice":"419.4589"},
    {"Name":"Road-650 Black, 52","OrderQty":"5","UnitPrice":"419.4589"},
    {"Name":"Road-450 Red, 58","OrderQty":"4","UnitPrice":"874.7940"},
    {"Name":"Road-650 Red, 44","OrderQty":"5","UnitPrice":"419.4589"},
    {"Name":"Road-650 Black, 58","OrderQty":"3","UnitPrice":"419.4589"},
    {"Name":"Road-650 Black, 44","OrderQty":"2","UnitPrice":"419.4589"},
    {"Name":"Road-150 Red, 56","OrderQty":"1","UnitPrice":"2146.9620"},
    {"Name":"Road-450 Red, 44","OrderQty":"1","UnitPrice":"874.7940"},
    {"Name":"Road-650 Red, 48","OrderQty":"3","UnitPrice":"419.4589"},
    {"Name":"Road-450 Red, 52","OrderQty":"6","UnitPrice":"874.7940"},
    {"Name":"Road-150 Red, 62","OrderQty":"1","UnitPrice":"2146.9620"},
    {"Name":"Road-650 Red, 60","OrderQty":"3","UnitPrice":"419.4589"},
    {"Name":"Road-650 Black, 60","OrderQty":"3","UnitPrice":"419.4589"},
    {"Name":"Road-450 Red, 60","OrderQty":"1","UnitPrice":"874.7940"},
    {"Name":"Road-650 Red, 60","OrderQty":"1","UnitPrice":"419.4589"},
    {"Name":"Road-650 Red, 52","OrderQty":"1","UnitPrice":"419.4589"},
    {"Name":"Road-150 Red, 56","OrderQty":"1","UnitPrice":"2146.9620"},
    {"Name":"Road-450 Red, 44","OrderQty":"1","UnitPrice":"874.7940"}
]

I took just the top twenty records so that we could see them in the article, but this approach should be good for 20,000 in a batch.
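For comparison, the ‘flattened’ shape is nothing more than a JSON array with one object per row, and every value rendered as a string. Here is the same idea sketched in Python, outside the database (the two sample rows are copied from the output above):

```python
import json

def rows_to_flat_json(rows):
    """Render a list of row dicts as a 'flattened' JSON list of objects,
    with every value converted to a string, as in the output above."""
    return json.dumps([{k: str(v) for k, v in row.items()} for row in rows],
                      indent=4)

rows = [
    {"Name": "Road-650 Red, 44", "OrderQty": 1, "UnitPrice": "419.4589"},
    {"Name": "Road-450 Red, 52", "OrderQty": 1, "UnitPrice": "874.7940"},
]
print(rows_to_flat_json(rows))
```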

But this is not the only way of doing it. We can take the opportunity of JSON to return a nested hierarchy of data:

[
   { "employee":
     { "EmployeeID" : 1537,
       "Sale_Order" : [
         { "CustomerID" : 6812 , "Region" : "East Anglia" },
         ...
         { "CustomerID" : 2543 , "Region" : "Wales" }
       ]
     }
   },
   { "employee" :
     { "EmployeeID" : 5723,
       "Sale_Order" : [
         { "CustomerID" : 234 , "Region" : "London" },
         ...
         { "CustomerID" : 68125 , "Region" : "Midlands" }
       ]
     }
   }
]

Or even use a device similar to CSV:

{
  "columns":[ "Make", "Model", "Year", "Color", "Mileage"
  ],
  "results": [
      ["Volkswagen","Eurovan","2003","White","56,830"],
      ["Honda","CRV","2009","Black","35,600"]
  ]
}
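This columns-plus-results shape is easy to produce from row objects outside the database, too. A minimal Python sketch, using the same two cars (the data is purely illustrative):

```python
import json

def to_columnar(rows):
    """Convert a list of row dicts into the CSV-like document:
    one 'columns' list of headings plus a 'results' list of value lists."""
    columns = list(rows[0].keys())
    return {"columns": columns,
            "results": [[row[c] for c in columns] for row in rows]}

cars = [
    {"Make": "Volkswagen", "Model": "Eurovan", "Year": "2003",
     "Color": "White", "Mileage": "56,830"},
    {"Make": "Honda", "Model": "CRV", "Year": "2009",
     "Color": "Black", "Mileage": "35,600"},
]
print(json.dumps(to_columnar(cars)))
```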

Once they are suitably compressed, the obvious space-savings tend to vanish, so the choice of format is down to personal preference. Also, the ‘flattened’ format can be formatted with indents to make it easier to read:

[
   {
      "Name":"Road-650 Red, 44",
      "OrderQty":"1",
      "UnitPrice":"419.4589"
   },
   {
      "Name":"Road-650 Red, 48",
      "OrderQty":"3",
      "UnitPrice":"419.4589"
   },
   {
      "Name":"Road-650 Red, 52",
      "OrderQty":"1",
      "UnitPrice":"419.4589"
   },
   {
      "Name":"Road-150 Red, 56",
      "OrderQty":"1",
      "UnitPrice":"2146.9620"
   },
   {
      "Name":"Road-450 Red, 44",
      "OrderQty":"1",
      "UnitPrice":"874.7940"
   }
]

Getting your JSON out fast.

Because SQL produces, from the ‘document’ perspective, flat rows of values, we can cut some corners to get JSON out fast. Firstly, we’ll insist that the root is called ‘root’ and each row is called ‘row’. All we have to worry about then is rendering every row in JSON format, with the right separators and brackets to represent a JSON list of objects, each consisting of name/value pairs.

Here is a simple version:

DECLARE @cars xml;
SET @cars =
'<?xml version="1.0" encoding="UTF-8"?>
<root>
  <row>
    <Make>Volkswagen</Make>
    <Model>Eurovan</Model>
    <Year>2003</Year>
    <Color>White</Color>
  </row>
  <row>
    <Make>Honda</Make>
    <Model>CRV</Model>
    <Year>2009</Year>
    <Color>Black</Color>
    <Mileage>35,600</Mileage>
  </row>
</root>';

SELECT Stuff( --we want to snip out the leading comma
  (SELECT TheLine from --this is to glue each row into a string
    (SELECT ',
    {'+ --this is the start of the row, representing the row object in the JSON list
      --the local-name(.) is an XPath function that gives you the name of the node
      Stuff((SELECT ',"'+coalesce(b.c.value('local-name(.)', 'NVARCHAR(255)'),'')+'":"'+
                    b.c.value('text()[1]','NVARCHAR(MAX)') +'"'
             -- 'text()[1]' gives you the text contained in the node
             from x.a.nodes('*') b(c) --get the row XML and split it into each node
             for xml path(''),TYPE).value('(./text())[1]','NVARCHAR(MAX)')
        ,1,1,'')+'}'--remove the first comma
   from @Cars.nodes('/root/*') x(a) --get every row
   ) JSON(theLine) --each row
  for xml path(''),TYPE).value('.','NVARCHAR(MAX)' )
,1,1,'')--remove the first leading comma

Which would give you this:

    {"Make":"Volkswagen","Model":"Eurovan","Year":"2003","Color":"White"},
    {"Make":"Honda","Model":"CRV","Year":"2009","Color":"Black","Mileage":"35,600"}

Hmm. This seems OK. I’ve tried several methods but this seems to work best. I don’t profess any expertise in SQL XML querying so please let me know if there is a faster way of doing this!
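What the query computes is easier to see away from the FOR XML plumbing. Here is a Python sketch of the same transformation (this mirrors the idea, not the T-SQL itself): take root/row XML and emit one JSON object per row, with every value as a string.

```python
import json
import xml.etree.ElementTree as ET

def flattened_json(xml_text):
    """Flatten <root><row>...</row></root> XML into a JSON list of
    objects, one per row; element names become keys, text becomes
    string values, and missing elements are simply absent."""
    root = ET.fromstring(xml_text)
    return json.dumps([{child.tag: (child.text or "") for child in row}
                       for row in root])

xml_text = """<root>
  <row><Make>Volkswagen</Make><Model>Eurovan</Model></row>
  <row><Make>Honda</Make><Model>CRV</Model><Mileage>35,600</Mileage></row>
</root>"""
print(flattened_json(xml_text))
```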

We now wrap it into a function and take care of properly escaping various whitespace characters

IF OBJECT_ID (N'dbo.FlattenedJSON') IS NOT NULL
   DROP FUNCTION dbo.FlattenedJSON
GO
CREATE FUNCTION dbo.FlattenedJSON (@XMLResult XML)
RETURNS NVARCHAR(MAX)
WITH EXECUTE AS CALLER
AS
BEGIN
DECLARE  @JSONVersion NVARCHAR(MAX), @Rowcount INT
SELECT @JSONVersion = '', @rowcount=COUNT(*) FROM @XMLResult.nodes('/root/*') x(a)
SELECT @JSONVersion=@JSONVersion+
STUFF(
  (SELECT TheLine FROM 
    (SELECT ',
    {'+
      STUFF((SELECT ',"'+COALESCE(b.c.value('local-name(.)', 'NVARCHAR(255)'),'')+'":"'+
       REPLACE( --escape tab within a value
         REPLACE( --escape carriage return
           REPLACE( --escape linefeed
             REPLACE( --escape forward slash
               REPLACE( --escape the double-quote
                 REPLACE(COALESCE(b.c.value('text()[1]','NVARCHAR(MAX)'),''),--backslash first
                 '\', '\\'),
                '"', '\"'),
               '/', '\/'),
           CHAR(10),'\n'),
          CHAR(13),'\r'),
        CHAR(09),'\t')
     +'"'
     FROM x.a.nodes('*') b(c)
     FOR XML PATH(''),TYPE).value('(./text())[1]','NVARCHAR(MAX)'),1,1,'')+'}'
   FROM @XMLResult.nodes('/root/*') x(a)
   ) JSON(theLine)
  FOR XML PATH(''),TYPE).value('.','NVARCHAR(MAX)' )
,1,1,'')
IF @Rowcount>1 RETURN '['+@JSONVersion+'
]'
RETURN @JSONVersion
END
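The escaping rules themselves are easier to read outside the nested REPLACEs. Here they are as a Python translation table (a sketch of what a JSON string value needs, not a port of the function); a strict JSON encoder must also escape any remaining control characters below U+0020, which a library such as `json` handles for you.

```python
# The characters a JSON string value needs escaped: backslash, the
# double quote, forward slash (optional but harmless), linefeed,
# carriage return and tab.
_ESCAPES = {"\\": "\\\\", '"': '\\"', "/": "\\/",
            "\n": "\\n", "\r": "\\r", "\t": "\\t"}

def escape_json_string(value):
    """Return value with JSON-significant characters escaped."""
    return "".join(_ESCAPES.get(ch, ch) for ch in value)

print(escape_json_string('say "hi"\tnow'))
```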

You can alter the routine to give different formats. Here is a bare-bones version that gives you a document rather similar to the CSV-like format I described earlier in the article.

DECLARE @cars xml;
SET @cars =
'<?xml version="1.0" encoding="UTF-8"?>
<root>
  <row>
    <Make>Volkswagen</Make>
    <Model>Eurovan</Model>
    <Year>2003</Year>
    <Color>White</Color>
    <Mileage>56,830</Mileage>
  </row>
  <row>
    <Make>Honda</Make>
    <Model>CRV</Model>
    <Year>2009</Year>
    <Color>Black</Color>
    <Mileage>35,600</Mileage>
  </row>
</root>';

SELECT
        '{
  "columns":['--start by creating a list of all the column headings, like the CSV header-line.
   +Stuff((SELECT ', "'+coalesce(x.a.value('local-name(.)', 'NVARCHAR(255)'),'')+'"'
           from @Cars.nodes('/root[1]/row[1]/*') x(a) --get first row
           for xml path(''),TYPE).value('(./text())[1]','NVARCHAR(MAX)')
   ,1,1,'')+'
  ],'--the end of the list of column headings
+ '
  "results": ['--create a list (rows) of lists (tuples)
  +Stuff( --we want to snip out the leading comma
    (SELECT TheLine from --this is to glue each row into a string
      (SELECT ',
      ['+ --this is the start of the row, represented as a JSON list of values
        Stuff((SELECT ',"'+ b.c.value('text()[1]','NVARCHAR(MAX)') +'"'
               -- 'text()[1]' gives you the text contained in the node
               from x.a.nodes('*') b(c) --get the row XML and split it into each node
               for xml path(''),TYPE).value('(./text())[1]','NVARCHAR(MAX)')
          ,1,1,'')+']'--remove the first comma
     from @Cars.nodes('/root/*') x(a) --get every row
     ) JSON(theLine) --each row
    for xml path(''),TYPE).value('.','NVARCHAR(MAX)' )
  ,1,1,'')--remove the first leading comma
  +'
  ]
}'

 ... which gives the JSON

{
  "columns":[ "Make", "Model", "Year", "Color", "Mileage"
  ],
  "results": [
      ["Volkswagen","Eurovan","2003","White","56,830"],
      ["Honda","CRV","2009","Black","35,600"]
  ]
}

More Complex JSON

Sometimes, you don’t want flattened JSON. Sometimes you have attributes, for example, or a more complex hierarchy. The FOR XML PATH syntax can produce truly startling XML. Just to test our routine out, though, we’ll first take a classic XML example from the JSON documentation.

Declare @XMLSample XML
SELECT @XMLSample='
  <glossary><title>example glossary</title>
  <GlossDiv><title>S</title>
   <GlossList>
    <GlossEntry ID="SGML" SortAs="SGML">
     <GlossTerm>Standard Generalized Markup Language</GlossTerm>
     <Acronym>SGML</Acronym>
     <Abbrev>ISO 8879:1986</Abbrev>
     <GlossDef>
      <para>A meta-markup language, used to create markup languages such as DocBook.</para>
      <GlossSeeAlso OtherTerm="GML" />
      <GlossSeeAlso OtherTerm="XML" />
     </GlossDef>
     <GlossSee OtherTerm="markup" />
    </GlossEntry>
   </GlossList>
  </GlossDiv>
 </glossary>'

DECLARE @MyHierarchy Hierarchy -- to pass the hierarchy table around
insert into @MyHierarchy SELECT * from dbo.ParseXML(@XMLSample)
SELECT dbo.ToJSON(@MyHierarchy)

Which would give you:

{
"title" : "example glossary",
"GlossDiv" :   {
  "title" : "S",
  "GlossList" :   {
    "GlossEntry" :   {
      "GlossTerm" : "Standard Generalized Markup Language",
      "Acronym" : "SGML",
      "Abbrev" : "ISO 8879:1986",
      "GlossDef" :   {
        "para" : "A meta-markup language, used to create markup languages such as DocBook.",
        "GlossSeeAlso" :   {
          "OtherTerm" : "GML"
          },
        "GlossSeeAlso" :   {
          "OtherTerm" : "XML"
          }
        },
      "GlossSee" :   {
        "OtherTerm" : "markup"
        },
      "ID" : "SGML",
      "SortAs" : "SGML"
      }
    }
  }
}

What we’ve done here is to convert the result into an intermediate SQL table defined by a hierarchy type. Markup languages such as JSON and XML all represent object data as hierarchies. We then pass it to a separate function, ToJSON, to render it as JSON.

The first trick is to represent it as an adjacency-list hierarchy in a table. This adjacency list is really the database equivalent of any of the nested data structures that are used for the interchange of serialized information with the application, and it can be used to create XML, CSV, OS X property lists, Python nested structures or YAML as easily as JSON.

Adjacency-list tables have the same structure whatever the data in them. This means that you can define a single table-valued type and pass data structures around between stored procedures. However, they are best held at arm’s length from the data, since they are not relational tables but something more like the dreaded EAV (Entity-Attribute-Value) tables. Converting the data from its hierarchical table form will be different for each application, but it is easy with a CTE.
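To make the shape of such a table concrete, here is a small Python sketch (the node IDs and values are invented for illustration): each row carries its parent’s ID, and the nested document is rebuilt by walking from parent to children.

```python
# Adjacency-list rows in the spirit of the Hierarchy table type:
# (element_id, parent_id, name, string_value, value_type).
rows = [
    (1, None, "customer", None, "object"),
    (2, 1, "name", "Acme", "string"),
    (3, 1, "orders", None, "array"),
    (4, 3, None, "17", "int"),
    (5, 3, None, "42", "int"),
]

def materialise(rows, row):
    """Turn one row into a value: objects and arrays recurse into
    their children; leaves are converted according to value_type."""
    element_id, _, _, value, value_type = row
    if value_type == "object":
        return {r[2]: materialise(rows, r) for r in rows if r[1] == element_id}
    if value_type == "array":
        return [materialise(rows, r) for r in rows if r[1] == element_id]
    return int(value) if value_type == "int" else value

doc = {r[2]: materialise(rows, r) for r in rows if r[1] is None}
print(doc)
```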

It is unlikely that we’d have to produce JSON from raw XML as we did in the first example, but you might be faced with a SQL query like this, embedded in the same SQL harness.

You'll need to create a Hierarchy table type first, but we'll get to that in a moment.

DECLARE @MyHierarchy Hierarchy -- to pass the hierarchy table around
insert into @MyHierarchy SELECT * from dbo.ParseXML((
--->The Query starts here
SELECT ProductModelID AS "@ProductModelID",
      Name AS "@ProductModelName",
      (SELECT top 20 ProductID AS "data()"
       FROM   AdventureWorks2012.Production.Product p
       WHERE  p.ProductModelID = p.ProductModelID
       FOR XML PATH ('')
       ) AS "@ProductIDs",
       (
       SELECT top 10 Name AS "ProductName"
       FROM   AdventureWorks2012.Production.Product pp
       WHERE  pp.ProductModelID =
              pp.ProductModelID
       FOR XML PATH (''), type
       ) AS "ProductNames"
FROM AdventureWorks2012.Production.ProductModel
WHERE ProductModelID= 7 OR ProductModelID=9
FOR XML PATH('ProductModelData')
---> Query stops here
))
SELECT dbo.ToJSON(@MyHierarchy)

... which gives this:

{
"ProductNames" :   [
  "HL Road Frame - Black, 58",
  "HL Road Frame - Red, 58",
  "Sport-100 Helmet, Red",
  "Sport-100 Helmet, Black",
  "Mountain Bike Socks, M",
  "Mountain Bike Socks, L",
  "Sport-100 Helmet, Blue",
  "AWC Logo Cap",
  "Long-Sleeve Logo Jersey, S",
  "Long-Sleeve Logo Jersey, M"
  ],
"ProductModelID" : 9,
"ProductModelName" : "LL Road Frame",
"ProductIDs" : "680 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724"
}

The code to do all this is attached to this article, but it is worth going through the ParseXML function, which takes an XML fragment or document and creates a SQL hierarchy table from it. It can be used to inspect the contents of XML documents, of course, but it is primarily for interchange, and for getting the contents of an XML table when you don’t know the structure up-front well enough to use XPath; or if, like me, your brain isn’t quite big enough to understand the full expanse of XPath.

Basically, the routine starts by saving the base nodes (including the root node if it is there) in a temporary table along with their names, and then repeatedly extracts every element and attribute associated with each node, and all its nested nodes, until it has done them all. It then works out what sort of datatype each one is. It does all this simply by exploring the XML structure without prior knowledge, using XPath wildcard syntax, and then using the XPath function ‘local-name()’ to determine the name of each node. It uses ‘text()[1]’ to get the value, if any, associated with the node, and also finds any attributes associated with the node. It makes no distinction between an attribute and an element, since no other document format seems to care. Although I got a lot of inspiration for this routine from the XMLTable routine of the great Jacob Sebastian (whose book on XSD should be on your bedside table), I didn’t give in to the temptation to use a recursive CTE. No sir, iteration is good enough for me, and a lot quicker, it turns out! Jacob was the pioneer, and still wins the elegance prize for XMLTable, though.
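The level-at-a-time idea can be illustrated in Python with a simple worklist (a simplified sketch, not a port of ParseXML: it ignores sequence numbering and datatype guessing, but keeps the no-recursion shape and the treatment of attributes as ordinary child rows):

```python
import xml.etree.ElementTree as ET

def parse_to_rows(xml_text):
    """Flatten XML into (element_id, parent_id, name, value) rows,
    one level at a time, with no recursion."""
    rows, next_id = [], 1
    level = [(None, ET.fromstring(xml_text))]
    while level:
        next_level = []
        for parent_id, element in level:
            element_id, next_id = next_id, next_id + 1
            rows.append((element_id, parent_id, element.tag,
                         (element.text or "").strip() or None))
            for name, value in element.attrib.items():
                rows.append((next_id, element_id, name, value))  # attribute as child row
                next_id += 1
            next_level.extend((element_id, child) for child in element)
        level = next_level  # move down one level
    return rows

print(parse_to_rows('<row id="1"><Make>Honda</Make></row>'))
```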

Here, first, is the hierarchy table type, which we use to pass data between functions and procedures:

CREATE TYPE dbo.Hierarchy AS TABLE
/*Markup languages such as JSON and XML all represent object data as hierarchies. Although it looks very different to the entity-relational model, it isn't. It is rather more a different perspective on the same model. The first trick is to represent it as an Adjacency list hierarchy in a table, and then use the contents of this table to update the database. This Adjacency list is really the Database equivalent of any of the nested data structures that are used for the interchange of serialized information with the application, and can be used to create XML, OS X Property lists, Python nested structures or YAML as easily as JSON.

Adjacency list tables have the same structure whatever the data in them. This means that you can define a single Table-Valued Type and pass data structures around between stored procedures. However, they are best held at arms-length from the data, since they are not relational tables, but something more like the dreaded EAV (Entity-Attribute-Value) tables. Converting the data from its Hierarchical table form will be different for each application, but is easy with a CTE. You can, alternatively, convert the hierarchical table into XML and interrogate that with XQuery
*/
(
   element_id INT primary key, /* internal surrogate primary key gives the order of parsing and the list order */
   sequenceNo int NULL, /* the place in the sequence for the element */
   parent_ID INT,/* if the element has a parent then it is in this column. The document is the ultimate parent, so you can get the structure from recursing from the document */
   [Object_ID] INT,/* each list or object has an object id. This ties all elements to a parent. Lists are treated as objects here */
   NAME NVARCHAR(2000),/* the name of the object, null if it hasn't got one */
   StringValue NVARCHAR(MAX) NOT NULL,/*the string representation of the value of the element. */
   ValueType VARCHAR(10) NOT null /* the declared type of the value represented as a string in StringValue*/
)
Go

Now here is the code for the function that parses XML to fill a hierarchy table:

IF OBJECT_ID (N'dbo.ParseXML') IS NOT NULL
   DROP FUNCTION dbo.ParseXML
GO
CREATE FUNCTION dbo.ParseXML( @XML_Result XML)
/*
Returns a hierarchy table from an XML document.
Author: Phil Factor
Revision: 1.2
date: 1 May 2014
example:

DECLARE @MyHierarchy Hierarchy
INSERT INTO @myHierarchy
SELECT * from dbo.ParseXML((SELECT * from adventureworks.person.contact where contactID in (123,124,125) FOR XML path('contact'), root('contacts')))
SELECT dbo.ToJSON(@MyHierarchy)

DECLARE @MyHierarchy Hierarchy
INSERT INTO @myHierarchy
SELECT * from dbo.ParseXML('<root><CSV><item Year="1997" Make="Ford" Model="E350" Description="ac, abs, moon" Price="3000.00" /><item Year="1999" Make="Chevy" Model="Venture &quot;Extended Edition&quot;" Description="" Price="4900.00" /><item Year="1999" Make="Chevy" Model="Venture &quot;Extended Edition, Very Large&quot;" Description="" Price="5000.00" /><item Year="1996" Make="Jeep" Model="Grand Cherokee" Description="MUST SELL!&#xD;&#xA;air, moon roof, loaded" Price="4799.00" /></CSV></root>')
SELECT dbo.ToJSON(@MyHierarchy)

*/
RETURNS @Hierarchy TABLE
 (
    Element_ID INT PRIMARY KEY, /* internal surrogate primary key gives the order of parsing and the list order */
    SequenceNo INT NULL, /* the sequence number in a list */
    Parent_ID INT,/* if the element has a parent then it is in this column. The document is the ultimate parent, so you can get the structure from recursing from the document */
    [Object_ID] INT,/* each list or object has an object id. This ties all elements to a parent. Lists are treated as objects here */
    [Name] NVARCHAR(2000),/* the name of the object */
    StringValue NVARCHAR(MAX) NOT NULL,/*the string representation of the value of the element. */
    ValueType VARCHAR(10) NOT NULL /* the declared type of the value represented as a string in StringValue*/
 )
   AS
 BEGIN
 DECLARE  @Insertions TABLE(
     Element_ID INT IDENTITY PRIMARY KEY,
     SequenceNo INT,
     TheLevel INT,
     Parent_ID INT,
     [Object_ID] INT,
     [Name] VARCHAR(50),
     StringValue VARCHAR(MAX),
     ValueType VARCHAR(10),
     TheNextLevel XML,
     ThisLevel XML)

 DECLARE @RowCount INT, @ii INT
 --get the base-level nodes into the table
 INSERT INTO @Insertions (TheLevel, Parent_ID, [Object_ID], [Name], StringValue, SequenceNo, TheNextLevel, ThisLevel)
  SELECT   1 AS TheLevel, NULL AS Parent_ID, NULL AS [Object_ID],
    FirstLevel.value('local-name(.)', 'varchar(255)') AS [Name], --the name of the element
    FirstLevel.value('text()[1]','varchar(max)') AS StringValue,-- its value as a string
    ROW_NUMBER() OVER(ORDER BY (SELECT 1)) AS SequenceNo,--the 'child number' (simple number sequence here)
    FirstLevel.query('*'), --The 'inner XML' of the current child
    FirstLevel.query('.')  --the XML of the parent
  FROM @XML_Result.nodes('/*') a(FirstLevel) --get all nodes from the XML
 SELECT @RowCount=@@RowCount --we need this to work out if we are rendering an object or a list.
 SELECT @ii=2
 WHILE @RowCount>0 --while loop to avoid recursion.
  BEGIN
  INSERT INTO @Insertions (TheLevel, Parent_ID, [Object_ID], [Name], StringValue, SequenceNo, TheNextLevel, ThisLevel)
   SELECT --all the elements first
   @ii AS TheLevel, --(2 to the final level)
     a.Element_ID, --the parent node
     NULL, --we do this later. The object ID is merely a surrogate key to distinguish each node
     [then].value('local-name(.)', 'varchar(255)') AS [name], --the name
     [then].value('text()[1]','varchar(max)') AS [value], --the value
     ROW_NUMBER() OVER(PARTITION BY a.Element_ID ORDER BY (SELECT 1)),--the order in the sequence
     [then].query('*'), --the 'inner' XML for the node
     [then].query('.') --the XML from which this node was extracted
   FROM   @Insertions a
     CROSS apply a.TheNextLevel.nodes('*') whatsNext([then])
   WHERE a.TheLevel = @ii - 1 --only look at the previous level
  UNION ALL -- to pick out the attributes of the preceding level
  SELECT @ii AS TheLevel,
     a.Element_ID,--the parent node
     NULL,--we do this later. The object ID is merely a surrogate key to distinguish each node
     [then].value('local-name(.)', 'varchar(255)') AS [name], --the name
     [then].value('.','varchar(max)') AS [value],--the value
     ROW_NUMBER() OVER(PARTITION BY a.Element_ID ORDER BY (SELECT 1)),--the order in the sequence
   '' , ''--no nodes
   FROM   @Insertions a
     CROSS apply a.ThisLevel.nodes('/*/@*') whatsNext([then])--just find the attributes
   WHERE a.TheLevel = @ii - 1 OPTION (RECOMPILE)
  SELECT @RowCount=@@ROWCOUNT
  SELECT @ii=@ii+1
  END;
  --roughly type the DataTypes (no XSD available here)
 UPDATE @Insertions SET
    [Object_ID]=CASE WHEN StringValue IS NULL THEN Element_ID
  ELSE NULL END,
    ValueType = CASE
     WHEN StringValue IS NULL THEN 'object'
     WHEN  LEN(StringValue)=0 THEN 'string'
     WHEN StringValue LIKE '%[^0-9.-]%' THEN 'string'
     WHEN StringValue LIKE '[0-9]' THEN 'int'
     WHEN RIGHT(StringValue, LEN(StringValue)-1) LIKE'%[^0-9.]%' THEN 'string'
     WHEN  StringValue LIKE'%[0-9][.][0-9]%' THEN 'real'
     WHEN StringValue LIKE '%[^0-9]%' THEN 'string'
  ELSE 'int' END--and find the arrays
 UPDATE @Insertions SET
    ValueType='array'
  WHERE Element_ID IN(
  SELECT candidates.Parent_ID
   FROM
   (
   SELECT Parent_ID, COUNT(*) AS SameName
    FROM @Insertions --where they all have the same name (a sure sign)
    GROUP BY [Name],Parent_ID --no lists in XML
    HAVING COUNT(*)>1) candidates
     INNER JOIN  @Insertions insertions
     ON candidates.Parent_ID= insertions.Parent_ID
   GROUP BY candidates.Parent_ID
   HAVING COUNT(*)=MIN(SameName))--
 INSERT INTO @Hierarchy (Element_ID,SequenceNo, Parent_ID, [Object_ID], [Name], StringValue,ValueType)
  SELECT Element_ID, SequenceNo, Parent_ID, [Object_ID], [Name], COALESCE(StringValue,''), ValueType
  FROM @Insertions--and insert them into the hierarchy.
 RETURN
 END
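The rough typing in the first UPDATE (no XSD to hand, so sniff the characters) can be summarised, for illustration, as a small Python function; this is a sketch of the idea, not an exact translation of the LIKE patterns:

```python
import re

def guess_type(value):
    """Rough typing of a string value with no schema available:
    NULL means a node with children ('object'); otherwise the
    characters decide between int, real and string."""
    if value is None:
        return "object"
    if re.fullmatch(r"-?\d+", value):
        return "int"
    if re.fullmatch(r"-?\d+\.\d+", value):
        return "real"
    return "string"

print([guess_type(v) for v in (None, "42", "3.14", "56,830")])
```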

You can, of course, use a routine like this to get XML into a relational format, but it is much easier to use the XML value() method together with the nodes() method to do this sort of work for you.

SELECT parent_ID,
  convert(varchar(20),Max(Case when Name='Make' then stringValue else '' end)) as [Make],
  convert(varchar(20),max(Case when Name='Model' then stringValue else '' end)) as [Model],
  convert(int,max(Case when Name='Year' then stringValue else '' end)) as [Year],
  convert(varchar(10),max(Case when Name='Color' then stringValue else '' end)) as [Color],
  convert(int,replace(max(Case when Name='Mileage' then stringValue else '' end),',','')) as [Mileage]
  from (SELECT Element_ID, SequenceNo, Parent_ID, [Object_ID],
               Name, StringValue, Valuetype
        from dbo.ParseXML(
'<?xml version="1.0" encoding="UTF-8"?>
<root>
  <row>
    <Make>Volkswagen</Make>
    <Model>Eurovan</Model>
    <Year>2003</Year>
    <Color>White</Color>
    <Mileage>56,830</Mileage>
  </row>
  <row>
    <Make>Honda</Make>
    <Model>CRV</Model>
    <Year>2009</Year>
    <Color>Black</Color>
    <Mileage>35,600</Mileage>
  </row>
</root>')) x(Element_ID, SequenceNo,
   Parent_ID, [Object_ID], Name, StringValue, Valuetype)
where parent_ID>1
group by Parent_ID

Which would give:

parent_ID   Make            Model           Year        Color      Mileage
----------- --------------- --------------- ----------- ---------- -----------
2           Volkswagen      Eurovan         2003        White      56830
3           Honda           CRV             2009        Black      35600

(2 row(s) affected)

Conclusions

Nowadays it is often much easier for the application developer to deal with XML or JSON than with a SQL result, particularly if the data required is hierarchical in nature. If the volume of data isn’t great, then it should be possible to deliver this quickly without undue CPU burden on the server, thereby avoiding a separate conversion process within the application. We already do this with XML, so why not JSON, or any other format required?

SQL Server isn’t geared up for complex string manipulation, and isn’t therefore an obvious place for this sort of activity. We need native JSON integration in relational databases, just as we already have XML integration. After all, Sybase and PostgreSQL already have it, so why not SQL Server? Occasionally, the data as seen from the application perspective just isn’t table-based, and it is much easier, and sometimes quicker, to deliver what the application requires in a format it can consume. Whilst we await native JSON integration, we must do the best we can with the tools that are available.


Author profile: Phil Factor (real name withheld to protect the guilty), aka Database Mole, has 30 years of experience with database-intensive applications. Despite having once been shouted at by a furious Bill Gates at an exhibition in the early 1980s, he has remained resolutely anonymous throughout his career.



Subject: Complex
Posted by: Anonymous (not signed in)
Posted on: Friday, May 9, 2014 at 9:56 AM
Message: Or use a document database :)

Subject: Sometimes .NET is better than TSQL
Posted by: Anonymous (not signed in)
Posted on: Monday, May 12, 2014 at 3:39 AM
Message: A .net stored proc running in sql server could convert the XML to JSON a lot better than TSQL can.

However given the option I would just return the XML to a service written in .net that then transforms it to JSON and passed it on to the application.

(The connection between .NET DataReader and SQLServer does not sent XML as text, it uses a much more compact format, only converting it to text in the SQLDataReader class)

Subject: Looks like it's time for SQL to add JSON
Posted by: Paulo.Morgado (view profile)
Posted on: Monday, May 12, 2014 at 3:41 AM
Message: Looks like it's time for SQL to add JSON as type like it has XML.

Subject: T-SQL needs a JSON function
Posted by: Robert Keyes (not signed in)
Posted on: Monday, May 12, 2014 at 4:00 AM
Message: Thanks for the article! Very interesting and useful. JSON is a major player now. Come on, T-SQL - time to play some catch-up.

Subject: Re: Sometimes .NET is better than TSQL
Posted by: Phil Factor (view profile)
Posted on: Monday, May 12, 2014 at 4:21 AM
Message: Sure. SQL Server is pretty good at providing XML results in a variety of forms from relational data, but needs a fast, reliable way of converting this to JSON. I would go for a clean and simple solution that leverages all that XPath goodness. Doing this in SQL is rather a desperate act. One could use a CLR routine or an external .NET service to do the translation, but for several purposes it is far cleaner, and more portable, to have it built into the SQL Standard! In the meantime, JSON.NET, the best .NET JSON library I reckon, already does a two-way conversion between XML and JSON.

Subject: Re: Sometimes .NET is better than TSQL
Posted by: Paulo.Morgado (view profile)
Posted on: Monday, May 12, 2014 at 5:03 AM
Message: I wouldn't say converting between XML and JSON.

Just produce JSON with a FOR JSON clause.

Of course, having a JSON type would be great, but I don't think it's viable to convert automatically. If the developer needs it, she/he has to convert it.

Subject: Finally, an end to ORMs?
Posted by: Ron Hudson (not signed in)
Posted on: Saturday, May 24, 2014 at 9:45 AM
Message: Phil,

I have been waiting over 6 years now for someone else to suggest that we don't always need over-engineered spaghetti ORM code for "business logic." Unfortunately, many of these young Computer Science guys seem hell-bent on keeping us locked in the obsolete OOD world of the 1980's for business logic (I suspect it's because those are the languages their professors are most-knowledgeable about). Based on my 20 years of MS/web dev experience, relational platforms like MS SQL are a much-better place to put application and business logic in most-cases. I think that these guys who only know how to implement logic in C# or Java have been holding us back for the last 5 years with their OOD hoohaa. I think that the future of web apps is AJAX to sprocs to JSON..no need for wasting time and money on "pie-in-the-sky" OO middle tiers. Thank you Phil for being the first person I've seen acknowledge the feasibility of this approach. I think that Computer Science programs need to include Javascript, AJAX, JSON, and TSQL if their graduates are going to help move the industry forward. The Java and C# OOD focus is holding everything back at this point IMO...maybe it's time to move on from OOD fantasy designs.

Subject: AAAaarrrrrrrrrghhhh my eyes!
Posted by: piers7 (view profile)
Posted on: Wednesday, June 4, 2014 at 3:45 AM
Message: *cough* CLR TVF *cough*

Subject: Brilliant...but...
Posted by: acmarsden (view profile)
Posted on: Tuesday, November 25, 2014 at 7:29 AM
Message: The schema I have to get to wraps multiple child elements in square brackets

"distribution": [
  {
    "downloadURL": "https://data.some.org/the/actual/catalog/datasets/1/resources/1.csv",
    "format": "text/csv"
  },
  {
    "downloadURL": "",
    "format": ""
  }
],

My output doesn't have that - have I missed something?

Many thanks

Subject: Possible Bug in FlattenedJSON & ToJSON scripts
Posted by: sudpank (view profile)
Posted on: Friday, March 20, 2015 at 10:13 AM
Message: Hi Phil,

Thanks for the great article on JSON from T-SQL. I ran your query for the example that you have provided just to test it. Your FlattenedJSON can't process more than 250 rows and ToJSON can't handle more than 133 records. If you reduce the number of columns to 2, it can only produce 770 JSON records. Any idea what might be causing this? I ran it on SQL Server 2008 R2 (SP2) Express Edition. Where is the limitation coming from?

Subject: @Sudpank
Posted by: Phil Factor (view profile)
Posted on: Friday, March 20, 2015 at 10:34 AM
Message: Re: Possible Bug in FlattenedJSON & ToJSON scripts
I'll take a look. Something seems to have crawled in because I tested it to more rows than that!

Subject: @Sudpank
Posted by: Phil Factor (view profile)
Posted on: Friday, March 20, 2015 at 11:28 AM
Message: Hmm. I've just retested it and it works fine up to 600 rows (it takes a long time, so I didn't test it for more). This is a bit of a head-scratcher. The only thing I can think of is that you've forgotten that SSMS only displays strings up to a limited length of 8092 characters. If this is the case, you can export the data to a flat file, which will not be truncated. To do this:

Right-click the Database and then click Tasks -> Export Data
Select your Data Source and choose "Flat File Destination" for the Destination type. Then, choose a file name for the output.
On the menu item "Specify Table Copy or Query", choose "Write a query to specify the data to transfer". Then paste in the query
The rest should be pretty obvious. This will output the file to text and you can open it in notepad or whatever text editor you choose.

When I do this sort of work I use a stored procedure of my own, called spSaveTextToFile, to save JSON files to disk. I've already published it on the site; you can get it here: https://www.simple-talk.com/sql/t-sql-programming/the-tsql-of-text-files/

Subject: Native JSON implementation
Posted by: jocapc (view profile)
Posted on: Sunday, May 17, 2015 at 4:30 PM
Message: Note that SQL Server 2016 will have native functions/clauses for parsing JSON (OPENJSON) and producing JSON (FOR JSON).
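The syntax jocapc mentions looks like this (a minimal sketch, requiring SQL Server 2016 or later; the table and column names, and the sample JSON, are illustrative):

```sql
-- Producing JSON directly from a query (SQL Server 2016 onwards):
SELECT CustomerID, CompanyName
FROM Customers
FOR JSON PATH, ROOT('customers');

-- Shredding a JSON document back into rows with OPENJSON:
SELECT CustomerID, CompanyName
FROM OPENJSON(N'[{"CustomerID":1,"CompanyName":"Acme"}]')
     WITH (CustomerID int, CompanyName nvarchar(100));
```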
