21 October 2013

Creating a Business Intelligence Dashboard with R and ASP.NET MVC: Part 3

By introducing R libraries into an ASP.NET MVC application, it is possible to broaden its analytical and visualization power immensely. To illustrate this, Sergei shows an example of how to use the R libraries Lattice and GoogleVis to detect poorly-performing product categories, the quantity sold and the profit from them, and estimate their demand over time.

In the first two parts of this series of articles, I set the groundwork for a solution that used ASP.NET MVC to create a simple site for a fictitious company called NorthWind Traders with a SQL Server database and which used R for the analytics and visualisation. Now, in the third part of the series, I’ll build on this by introducing two new R libraries that widen the analytical power of the application that we began developing in part 1 and 2. These libraries are called lattice and googleVis.

The name of the latter one leaves few doubts about its nature, whereas the former name sounds a bit peculiar (at least for some people whose first language is not a Germanic one). The word lattice has many meanings that can be found in Wikipedia but it is better to give you a quote from the book Lattice: Multivariate Data Visualisation with R by Deepayan Sarkar. Here it is “The lattice package is software that extends the R language and environment for statistical computing by providing a coherent set of tools to produce statistical graphics with an emphasis on multivariate data“.

Only a relatively small number of people in history could abstract a complex physical phenomenon such as that falling apple into a mathematical model expressed by a simple algebraic equation. “Simplicity is the ultimate sophistication.” (da Vinci). In our context it is a unique ability or/and talent to cut unimportant variables to make a simple equation. The less fortunate of us have to deal with multiple variables, build statistical model and use computer simulation. Because I am going to focus on graphics of multivariate data in this article, the lattice package is a logical choice to unravel those relationships.

An animated, interactive, visualisation can provide an additional insight into relationship between variables in a data set. This approach was effectively popularised by Professor Hans Rosling in his TED Talks in 2007 when he presented Trendanalyser – a software package that converts data into animated, interactive charts. It was developed by the Gapminder Foundation that he co-founded earlier.

One year later Google released Motion Chart (currently known as Google Charts) – a flash based charting library to plot animated charts. The googleVis package provides an interface between R and the Google Charts as well as makes use of the internal R HTTP server to display the output locally. We will try to integrate an output from the R HTTP server into our application.

Implementing the Category Performance Report

Let me remind you that the Parts 1 and 2 of the series demonstrated a process of building the first page (see Figure 1) of a web based business intelligence application for a fictitious wholesale company Northwind Traders that sells exotic food.


Figure 1. The dashboard landing page.

In its current state the application uses four R libraries: RODBC, R2HTML, PLYR, and GGPLOT2. All R routines were collated into a batch file which could accept parameters and the results of its scheduled execution (two HTML fragments and two images) are hosted by the ASP.NET MVC 4 page above which is styled using Twitter Bootstrap (bootstrap). The navigation toolbar and page layout were also developed with bootstrap.

The next page we will add to the dashboard is a Products page and the first report on this page the Category Performance report. This report is displayed on the Figure 2 below. It shows sales volume and profit as a percentage of their total values for 1998. You might have noticed that the height of the magenta portion of the volume bar is nearly twice that of its profit counterpart. This might indicate that the seafood profit is inadequate to the cost of sales in comparison to other categories and that the performance of the seafood category requires some improvement.

Let me be more specific on this. Every product requires some effort to sell it. Let’s assume we have two products x and y that belong to the same category. The product x contribution into the overall volume is 20% and product y – 10%. It means that the effort to sell product x is about two times more than the effort to sell the product y. On the other hand the product y contribution into the overall profit is 20%, and the product x – 10%. This is why we call the product x – a poorly performing one. This conclusion also applies to the Confections category.

The Listing 1 shows the R procedure that was used to create the Category Performance report below.


Figure 2. Product categories performance in 1998.

Listing 1. plotCategoryPerformance.r procedure.

Most of the code in the Listing 1 is similar to the analogous procedures from parts 1 and 2 of the series with just a couple of exceptions. This 100 * ctgGP$MeasureValue / sum(ctgGP$MeasureValue ) statement converts the absolute value into the percentage of total for each category and rbind(ctgQty, ctgGP) combines (like T-SQL UNION operator) two data frames, ctgQty and ctgGP by rows to produce a new data frame ctgPmf showed below.


Figure 3. ctgPmf data frame.

The following T-SQL view was used by sqlFetch() function from the RODBC library to populate the ctgGP data frame

Listing 2. T-SQL view Ch2_Category_PmfGP_vw

It won’t be difficult to figure out what changes to the above view are required to make it deliver Quantity instead of the Profit figures and then use the updated view to populate the ctgQty data frame.

Now that we know that some categories have inadequate performance, we will create the next report that should shed some light on whether there is a chance of improving their performance.

Implementing the Product Category Demand Report

What is a demand graph? According to Wikipedia “… the demand curve is the graph depicting the relationship between the price of a certain commodity and the amount of it that consumers are willing and able to purchase at that given price.” And “Demand curves are used to estimate behaviours in competitive markets… “.

If we look at the graph of Seafood demand in 1998 (Fig. 5) we notice that Northwind customers bought much more at lower prices than they did when prices were higher. The relationship between price and quantity that is depicted in Figure 4 shows the responsiveness, or elasticity, of the quantity of seafood purchased as its price changed.

Using statistical modelling, we could infer a mathematical relationship (simply speaking – a formula) between profit, price, cost, and quantity. This formula would show whether the volume increase due to the price decrease can compensate for the losses (or even often uplift profit) due to the price reduction. This concept of nonlinear relationship between various factors is not uncommon in science or/and business and is illustrated by the graph below (Fig. 4), where Po – optimal price, Pcur – current price.


Figure 4. Relationship between price, volume, and profit for a price elastic product.1888-demand-81d42869-65bc-400d-a345-c6b8

Figure 5. Confections and Seafood products demand.

The R procedure that was used to create the above report is shown below.

Listing 3. R routine plotCtgDemand.r

We can use the RODBC package just as we did in Parts 1 and 2 of the series in order to connect to the database, retrieve the data and save it in the data frame ctgDmd. A few first and last rows of the data frame shown below.




Figure 6. ctgDmd data frame from Listing 3.

The lattice function xyplot(), that charts the data frame ctgDmd, has a signature common with other lattice functions, i.e. function_name ( y ~ x | A + B, data = , … ). It says ‘display the relationship between variables y and x for every combination of factors A and B‘. In our case of xyplo (Qty ~ Price | Category + Year, data = ctgDmd type = c(“g”, “p”, “smooth”)) x and y are Qty and Price, and A and B are factors Category and Year respectively. The second argument specifies the data frame that holds those variables and factors. In the formula y ~ x | A + B you can interchangeably use symbol * instead of the +.

An ability to see relationship between two variables simultaneously in the context of available business dimensions (e.g. Category and Calendar as we’ve just seen) is a very valuable characteristic of a visualisation package.

Finally, here is the T-SQL view that populates the ctgDmd data frame.

Listing 4. Ch2_Category_Demand_vw T-SQL view.

The transformation CAST((1 – dbo.[Order Details].Discount) * dbo.[Order Details].UnitPrice + 0.5 AS int) was used to merge price points within the same dollar into one point and then to convert the price value to the integer. It was done to reduce the visual clutter when scatter points have close prices.

Finding Price Elastic Products

Now as we know product categories that need some sales performance adjustment, let’s drill down into individual products and find those which are sensitive (‘price-elastic’) to price changes. An R routine that identifies such products is shown below

Listing 5. findElasticSeafoodProducts.r procedure.

As we have used similar constructs in other procedures, most of the code above is probably understandable, but I’d like to draw your attention to the 7th line. The ddply function applies the user-defined function function(x) { cor(x$Price, x$Qty) } to its x argument (data frame sfProd) that returns the correlation (this is our elasticity) between Price and Qty for each product in the category. The result of this calculation is displayed in the table below


Table 1. Elasticity of the Seafood Products.

The purpose of the following two lines after the 7th line is to change the column names and extract only products that have high negative correlation between their price and quantity (0.6 or less). In other words we take only highly elastic products and then use their names to filter the original data frame sfProd and draw Price ~ Qty scatter plot for each product (see below) using the xyplot() lattice function.


Figure 7. Highly elastic seafood products.

Please bear in mind that we are dealing with a demo database and do not have enough data to fully justify word ‘highly’, but this concept can be helpful in real life.

The T-SQL view that feeds the findElasticSeafoodProducts.r procedure follows.

Listing 6. Ch2_Seafood_Products_vw T-SQL view.

This ROUND( (1.0 – dbo.[Order Details].Discount) * dbo.[Order Details].UnitPrice, 1) is done to make price points such as 14.72 and 14.70 indistinguishable and to reduce the noise.

So, what is the Carnarvon Tiger optimal price? Let’s assume that our goal is to get the maxim profit, which is

Profit = Quantity * (Price – Cost)

The linear regression line for the Carnarvon Tigers can be written as

Quantity = a * Price + b

After the substitution into Profit formula we have

Profit = a * Price2 + (b – a * Cost) * Price – b * Cost

If we use the liner regression model represented by the blue line on Figure 7, then the optimal price is ~ $60. If the Northwind Trades were found a supplier that providers the same product for $19.99 then the optimal price would be ~ $50.


Figure 8. Carnarvon Tigers Profit vs. Price when Cost = $19.99

If our goal was to get rid of the excessive inventory (it’s a perishable product with a ‘best before date’ after all), then we would calculate the optimal price differently. It will be covered in one the following parts of the series.

I leave it for you as an exercise to find the equation of the blue regression line on Figure 8. Hint: use the subset() and lm() functions.

Assembling all the pieces together including the report that we are going to build in the next section gives us the Product page shown on Figure 9.

Implementing the ‘Profit by Country’ Report Using googleVis Library

The report we are going to build is displayed in the top right corner on Figure 9. The R procedure that generates this report is as follows

Listing 7. Ch2_plotCategoriesProfitByCountry.r procedure

This procedure produces the html fragment displayed below with some lines (comments and data) removed to shorten the Listing 7. Although this code fragment is rather long to be included in the article, it explains clearly how the gvisMotionChart (see Listing 6) function works.

Listing 8. The html fragment made by the gvisMotionChart() function

It converts the R data frame ctgGP that is returned by the sqlFetch() function into a JavaScript array and then into a googleVis DataTable, which is passed to the MotionChart() that plots an animated interactive chart.


Figure 9. Product page

Here is the T-SQL Ch2_Profit_by_Country_vw view that is used to extract data for the googleVis chart


This part adds some analytical elements and an improved user experience (animated, interactive chart) to what was previously, in parts 1 and 2, just BI reporting. It shows how to detect poorly-performing product categories and estimate their demand over time. Once we know the poor-performing categories and the demand, we can then drill down to the product level and find those products which have a sales volume that is sensitive to price changes. This information can be used to improve the category’s performance or to reduce the inventory levels.

In the following parts I will demonstrate how the inferred relationship between product prices and sales volume for price elastic products can be integrated into the sales forecast.

Keep up to date with Simple-Talk

For more articles like this delivered fortnightly, sign up to the Simple-Talk newsletter

This post has been viewed 25093 times – thanks for reading.

  • Rate
    [Total: 20    Average: 4.5/5]
  • Share

Sergei Dumnov

View all articles by Sergei Dumnov

Related articles

Also in .NET

Posting Form Content via JavaScript

Web-based applications run smoother if instead of using the traditional form method, they use JavaScript to post data to the server and to update the user interface after posting data: It also makes it easier to keep POST and GET actions separated. SignalR makes it even slicker; it can even update multiple pages at the same time. Is it time to use JavaScript to post data rather than posting via the browser the traditional way?… Read more

Also in ASP

Relational Algebra and its implications for NoSQL databases

With the rise of NoSQL databases that are exploiting aspects of SQL for querying, and are embracing full transactionality, is there a danger of the data-document model's hierarchical nature causing a fundamental conflict with relational theory? We asked our relational expert, Hugh Bin-Haad to expound a difficult area for database theorists.… Read more

Also in ASP.NET

Introduction to Bower, Grunt, and Gulp in Visual Studio

Developers are now spending much more time writing and debugging JavaScript, CSS, HTML templates etc. Microsoft have responded by adding better tools for front-end development into their development environment, Visual Studio. In this article Jon Smith uses a sample application to demonstrate how he converted an existing ASP.NET MVC5 application to use these new tools… Read more

Also in BI

T-SQL Window Function Speed Phreakery: The FIFO Stock Inventory Problem

Sometimes, in the quest for raw SQL performance, you are forced to sacrifice legibility and maintainability of your code, unless you then document your code lavishly. Phil Factor's SQL Speed Phreak challenge produced some memorable code, but can SQL features introduced since then help to produce code that performs as well and is also easy to understand? Aunty Kathi investigates.… Read more
  • Anonymous

    Product page
    Hi, followed thru your Part 1 & 2. Some modifications with the new Visual Studio Express 2013 and bootstrap 3 but all going good. Stuck on how to get the Product page up. Can you guide how and where to go about constructing this and perhaps also post a copy. This is excellent project – just what I needed to get R live in enterprise. Many thanks.

  • Anonymous

    to get Products page up Visual Studio express 2013
    Edit the viewshared_layout.cshmtl and add in

    <div class="navbar-collapse collapse">
    <ul class="nav navbar-nav ">
    <li>@Html.ActionLink("KPIs", "Index", "Home")</li>
    You will see Contact and About in same place so emulate these by expanding out the list.
    In viewshome need to have the Product.cshtml and base this on the index.cshtml to start with.
    And in the home controller copy as for the index

    public ActionResult Product()
    { return view() )

    That does it. Product page loads.

  • Anonymous

    to get Products page up Visual Studio express 2013
    Edit the viewshared_layout.cshmtl and add in

    <div class="navbar-collapse collapse">
    <ul class="nav navbar-nav ">
    <li>@Html.ActionLink("KPIs", "Index", "Home")</li>
    You will see Contact and About in same place so emulate these by expanding out the list.
    In viewshome need to have the Product.cshtml and base this on the index.cshtml to start with.
    And in the home controller copy as for the index

    public ActionResult Product()
    { return view() )

    That does it. Product page loads.

  • Sergei Dumnov

    Product Page
    1) Create a new VS 2013 ASP.NET Web App MVC project.
    2) Open Index.cshtml view and comment out <div class="jumbotron">…</div> fragment.
    3) Open _Layout.cshtml shared view and do replace string About with string Products.
    4) Notice structure of rows and columns. Look at the <div class="row">…<div class="col-md-4">…</div></div> construct. Use Bootstrap help if it is unclear.
    If after that you still struggle… Registre on this web site and contact me again. I will be able to reply offline.

  • Sergei Dumnov

    Product Page (clarification)
    Here is how the link tag should look like
    <li>@Html.ActionLink("Products", "About", "Home")</li>
    after the bullet 3 change (see my prev. post).

  • hubertx

    Can R talk to web services?
    Thanks for the great article. a simple question from a R novice.

    Can R call web services instead of talking to database directly? Does R has config files?

  • hubertx

    Can R talk to web services?
    Thanks for the great article. a simple question from a R novice.

    Can R call web services instead of talking to database directly? Does R has config files?

  • Sergei Dumnov

    Can R talk to web services?
    1) Have a look at http://cran.r-project.org/web/packages/RCurl/index.html

    Also you can find some examples using HTTP protocol in my forthcoming article.

    2) There are several config files that can be found in R installation folders as well as in your Win profile folder or/and your working directory.

  • haastnz

    Could you post the products.cshtml. How does that GoogleVis bit work at the posting stage. I have all files sitting correctly and the html checked with this article but running it doesn’t do the GoogleVis plot.

  • Sergei Dumnov

    GoogleVis (Products.cshtml)
    @{ViewBag.Title = "Products";}

    <link href="~/Content/jquery.ui.all.css" rel="stylesheet" />
    <script src="~/scripts/jquery-2.0.3.min.js"></script>
    <script src="~/scripts/jquery.ui.core.js"></script>
    <script src="~/scripts/jquery.ui.widget.js"></script>
    <script src="~/scripts/jquery.ui.position.js"></script>
    <script src="~/scripts/jquery.ui.tooltip.js"></script>
    <script type="text/javascript">
    window.onload = function () {

    function setIFrameSizeFitToIt() {
    window.parent.document.getElementById("gviz").style.height = "420px";
    window.parent.document.getElementById("gviz").style.width = "475px";

    <div class="container-fluid">
    <div class="row-fluid">
    <div class="span6">
    <p class="lead">
    Confections and Seafood Products Demand by Year
    <img class="img-polaroid" src="~/Content/demand.png" style="width: 460px; height: 415px;" />

    <div class="span6">
    <p class="lead">
    Country Categories GP vs AvgPrice and Qty
    <iframe id="gviz" src="~/Content/html/googleVis.html"></iframe>
    <div class="row-fluid">
    <div class="span6">
    <p class="lead">
    YTD Product Categories Performance
    <img class="img-polaroid" src="~/Content/ctgPmf.png" style="width: 460px; height: 340px;" />

    <div class="span6">
    <p id="hmTitle" class="lead">
    Elastic Seafood Products
    <img class="img-polaroid" src="~/Content/elc_products.png" style="width: 460px; height: 340px;" />
    <div class="row-fluid">
    <div class="span12">
    <br />
    <blockquote class="pull-right">
    <p>Above all else show the data.</p>
    <small><cite title="Edward Tufte">The Visual Display of Quantitative Information</cite></small>

  • haastnz

    GoogleVis Loads
    Thanks for the availablity of the Products.cshtml coding. I have GoogleVis presenting now. It is a wonderful series. I have trolled it thoroughly in the break and really appreciate your work making it available.

  • nam123

    need view

    Hi all,

    pls let me know from where i get Ch2_Category_PmfQty_vw view

  • vivane

    Integration issue
    Thank you for a nice tutorial.

    It seems that R procedures and fragments had to be generated (manually) then invoked by a web page automatically.

    How can we automate creation of R procedures and fragments dynamically based on user action on web page views?


  • Sergei Dumnov

    RE: Integration issue
    Thank you for your comments. Although I wrote this article not that long ago, the technology did change dramatically since that time. So I would advise you to look at
    1) https://rstudio.github.io/shinydashboard/
    2) https://powerbi.microsoft.com/en-us/documentation/powerbi-desktop-r-scripts/
    It is much better options to implement that the functionality you are looking for.
    Regards, Sergei

Join Simple Talk

Join over 200,000 Microsoft professionals, and get full, free access to technical articles, our twice-monthly Simple Talk newsletter, and free SQL tools.

Sign up