Click here to monitor SSC

Tony Davis

Simple-Talk Editor
News, views and good brews

Database Trampled by Godzilla View

Published Thursday, August 06, 2009 11:35 AM

For the older generation of SQL developers, discouraging the use of views is tantamount to accusing a respectable aging dowager of immorality. No way! They've always been such benign and conventional database objects.

Views are virtual tables, constructed from base tables and other views. They behave just like tables, and are a very simple means to "pre-assemble" data into an easily-queried format for the end user, thus protecting them from the underlying complexity of the SQL, and providing a layer of protection to your base tables.

Views are doubtless an occasionally-useful tool for the database developer, when designed wisely. Unfortunately, abuse of views has increased proportionately with the widening mismatch between the relational model and the object-oriented world of C# development. The result has been the increasing use of "Godzilla Views" that attempt to deliver entire business objects to LINQ or EF users, hiding dozens of tables, hundreds of columns, and almost always returning far more data than is really necessary. Like Godzilla, such views will inevitably turn on their creators and wreak awful havoc on the databases they infest, through the performance and maintenance problems they cause.

So when should you use views? Under what circumstances is it a bad idea? Much of the SQL literature is curiously silent on the issue. Surely, in the light of advice discouraging their use, it is time to take a reasoned stand and start a debate about what guidance IT departments should give over the use of views. I'd like to start by suggesting a few 'best practices'.

  • DO use views as the interface to grid controls, and so on, in applications that are designed to only work against table structures. You should never expose base tables to data controls
  • DON'T use views as a general-purpose means to deal with ad-hoc, or parameterized queries from applications. Use a properly-designed interface that is based on stored procedures, wherever possible.
  • DO use views to promote flexibility and portability – for example, isolating all query cross-references to a remote database in a single view.
  • DO consider the careful use of Table-Valued UDFs as an alternative to views, where you also need to perform processing, pass parameters and so on.

Help us finish the list, and correct the advice we've given. Better still, if you have view code on which you'd like a second opinion, send it to us and we'll get our authors to review it and potentially offer alternative solutions. We'd also like to hear of real instances of Godzilla views of frightening complexity which have caused difficulties with production systems. The best example will receive a prize.

Cheers,

Tony.

Comments

 

DBA Dave said:

The worst case I have seen is views nested to about 10 to 12 levels. And these were not little lookup type views. These were monsters to begin with. General ledger, transactions, billing and other double entry accounting monsters. Oh the nightmares!

Called from a front end that would have no care for a time filter. Then the end of month or financial year reports would get run. No other users could use the database. The clients thought a P2 with 512MB of RAM was a good machine. I am talking 2007, not 1999. The business would not impress on the clients to upgrade.

"The database will perform, make it happen!" was the call from the deck. "I cannot break the laws of physics, cap'n!", I would reply.

Trying to get an accountant to understand the concept of pagelife expectancy was an excercise in futility. The deep saw-like line graphs showing RAM being emptied and refilled as all other data was pushed out to satisfy a call to one of these views made me seasick.

45 seconds just to compile a call to fill a grid with client details...

After 12 months of getting nowhere and being percieved as a poor DBA because I could not "just make it work", I left for a better life.

August 9, 2009 9:45 PM
 

pjones said:

A fellow developer created a monster system in the belief that views were efficient. I've drummed it into him to use stored procedures for input and updates but he calls them with exec usrproc so they recompile every time. Then every single procedure queries views which are monsters. The query stats for longest running, most logical reads, cpu usage etc all show queries from this sytem using at least six times as many resources as anything else.  Then users at the end of a slow connection complain about the speed of data retrieval.
The whole system needs rewriting with a data acess layer and a databse re-design as it has been hacked about and rejigged to fit a user requirement that has changed considerably from the original.
August 10, 2009 3:42 AM
 

callcopse said:

Agree, excessive view usage is very obscuring to database maintenance.

However, can anyone tell me what this means?
"DO use views to promote flexibility and portability – for example, isolating all query cross-references to a remote database in a single view."

I have not got a SQLServerCentral login handy (shame!).
August 10, 2009 3:58 AM
 

NullPointer said:

@callcpose, I think what was meant there was that a good example of view usage would be replacing this:

select a,b,c,d, remotetable.x, remotetable.y from LocalTable
 inner join RemoteServer.SomeDatabase.dbo.RemoteTable on RemoteTable.ID = LocalTable.ID

with this:
select a,b,c,d,remotetable.x,remotetable.y from LocalTable
  inner join ViewOnRemoteServer V on v.ID = LocalTable.ID

Since the ViewOnRemoteServer is a separate object, if that server ever moves or is renamed or even the database name changes, you only need to update the view, not maybe hundreds of queries that reference a hard-coded remote server resource. (pardon the formatting in the above)
August 10, 2009 5:28 AM
 

NullPointer said:

On the subject of brutal views - we bought a system a couple of years back that added remote capabilities to our ERP system by means of sycnhronised data.
The system uses views to produce the data it sends on the the remote users.

One set of views had about 7 nesting levels (all of which where views) but the worst thing was that the entire hierarchy was triangular - so you had the lower level views actually referencing higher level views. (so for example the 4th view down joined itself on the 2nd down and so on). This meant that performance was exponentially destroyed. For 3000 rows of data this view used to take about 60 seconds. For 30,000 it used to run nearly an hour before being killed off.

I replaced most of the hierarchy with a single view from a single pre-rendered table and the process now runs in about 15 seconds for several million rows of of data.

The problem wasn't that the developers were poor coders; in fact a lot of the SQL in the individual views was well written, portable, and reasonably documented. They just had obviously grown used to programming SQL as though views were some kind of magic "assume data access is instantaneous" entities like they used to conjure up in Computer Science theory lectures!
August 10, 2009 5:37 AM
 

JonRobertson said:

I have a love-hate relationship with views.  The love grew slowly over time.  The hate came from very painful experiences.

Our previous database programmer discovered views and began using them for the new module he was writing.  That module uses a total of 18 views.  After he left, I was tasked with extending the module and decided that I could, ahem, improve the performance significantly if I indexed the views.

All went well during in-house development and testing.  And beta testing went reasonably well.  We saw some performance loss in another area of the application, but that area had seen a significant rewrite to provide some much needed functionality.  We worked to get that performance as high as we could, and wrote off the final performance loss as a cost of the new functionality.

Then we upgraded one of our largest customers that heavily used that module that I added the indexes to.  The customer started seeing deadlocks and deadlock victims all over the place, several times an hour.  Performance of the whole system was just crippling.  Not having a SQL expert on staff, we contracted a firm to help us resolve the deadlocks.  It turned out that the blocking/waiting required while the newly added view indexes were updated was crippling the rest of the database.  :(

In a completely different case, the data source for our OLAP database for 8 years was a set of views that were exclusively used for the OLAP database.  The advantage was that we could "prettify" the data and pull data from several tables into a single view, thus significantly reducing the number of fact tables needed in the OLAP schema.

Over the years, the nightly rebuild of the OLAP database has been a major performance issue.  We've seen OLAP databases that took 12 hours to build and thus were only rebuilt on the weekend.  We've seen several OLAP databases that threw "not enough memory" errors from SQL Server while trying to process the query, and thus we resorted to Enterprise Edition and cube partitions.

About a year ago, we hired a new DBA that had zero OLAP experience.  We put him through the typical in-house training that we've always used.  Then we finally convinced management that it would be worthwhile to send him to some Microsoft training classes.  Boy, did they pay off.  While discussing the issues we have with OLAP during his one-on-one training, the instructor was surprised to see we were using views for the data source.  It turns out that is a very bad idea.  :)  Changing from a view-based data source in the production database to a separate table-based data warehouse database allowed OLAP databases to build in 10 minutes rather than 4 hours.  The data warehouse is populated each night, using the original views.  But that process typically takes a few minutes more than the OLAP rebuild.  So we're now seeing the entire OLAP process taking about 10% of the time it was taking.

We still use views.  But we carefully evaluate each consideration of a new view and whether a view is the best solution.
August 10, 2009 7:33 AM
 

timothyawiseman@gmail.com said:

Like most features in SQL Server, views can be wonderful or terrible, depending on how they are used.  Here are a few I would consider best practices:

* Do use views if they will simplify and help refactor code.  (This can be particularly true if numerous procedures use the same or very similar complex series of joins.  Of course, this may also be a symptom that the table structure needs to be redesigned, but it may be unavoidable especially with fully normalized tables).
* Do NOT use views if they will make tracing code more complex, in particular nesting views especially deeply can be more trouble than it is worth.
* Do use views if an indexed view can greatly increase performance (This must be balanced against the performance reduction it will cause in read and update statments.  Also, there are considerations about when the query optimizer will use that index if you are not using Enterprise edition.  This is discussed at : http://www.sqlservercentral.com/articles/Indexed+Views/63963/ )
* Do use views to enforce certain types of query hints/table hints if they will be used throughout the code (but do this with full knowledge of the trade offs of those query hints.  Two examples are WITH (NOEXPAND) and WITH (NOLOCK), both of which can be of tremendous performance benefit, but come with drawbacks which must be carefully balanced.)
August 10, 2009 11:56 AM
 

Celko said:

Excellent piece and EXCELLENT comments!!   I have to add "Godzilla VIEWs" to my list of SQL design errors.  

"Mother Celko's Heuristics" included one that says a VIEW should make sense as a table  --3NF or better, logically coherent, models either an entity or a relationship (NOT an auxiliary table-- they need to be complete by their nature).  The idea is that you can give a "pseudo-schema" made of VIEWs to a subset of users (sales, shipping, accounting, customers, et al) that makes it look as if the database was built just for them instead of the enterprise as a whole.  It makes their projects easier and can hide enterprise level changes invisible.

My favorite example was a company that changed inventory valuation from average value to last paid in a volatile market.  Change one VIEW and the world is fine.  The same expense and purchase rules worked for any inventory evaluation rules.  
August 16, 2009 3:57 PM
 

r_honey said:

Well, I have my own way (rather peculiar, you may say) of working.

As a freelancer, I specialize in providing highly customized solutions to my clients. Taking an example of let's say, a Clinic, I have nested objects of the form:

invoicedMedicine.invoice.clinic

(nesting of objects can easily go 3-6 levels deep in my case)

Now, I prefer that the top-level object gets hydrated (filled with values) with all its sub-objects in one-go (i.e. one database interaction), rather having on-demand hydration.
Hydration on-demand in my opinion, presents situations where multiple connections are repeatedly opened & closed while processing a single client request, particularly when there are multiple work-flows which a client might follow depending upon data available. This presents serious performance implications in a web-based scenarios.

Now, this presented to me some problems, as to how to get the top-level object hydrated completely without issuing multiple SELECT commands.

My previous experience with Views (I had an application with an approx. 15 table join producing in excess of 30K rows) was not too pleasing.

So, now what I do is to create Inline table-valued UDFs where each UDF joins tables & other nested UDFs thus producing a composite table that can be used to hydrate any top-level object completely.

Further, the application itself simply invokes SPs passing them the parameters . The SPs themselves in turn use the appropriate UDF in the SELECT command to return the ResultSet to the client.

This approach has worked wonders for me. All my Business Objects no matter how deeply are they nested get fully hydrated in a single DB interaction. I find it maintainable as the bulk of the schema complexity is hidden behind the UDFs. However, I must say, I easily have 3-12 levels of nested UDFs.

I always wanted to compare this approach with Views. This article provides a perfect opportunity for that I believe. Anyone has any opinion on the pros/cons of this approach compared to traditional View approach.
August 17, 2009 11:13 AM
 

swimdad said:

We currently have a "view" built which combines every single, Contract, ServiceCode, Rate, Payer, Payer Account ever created in our system, to form the Godzilla of all data sources for a set of 4 drop down boxes where each are populated based on the selection in the one above it.

To top it off, it is only named View... but is actually a real table, rebuilt every night. This, alone with about 10 others just like it, some real view, some tables named view provided a "great" backbone for our billing, and reporting applications.  

THANK GOODNESS we are less then 3 weeks away from rolling out a completely new system. One with no tables named View..
September 2, 2009 10:24 AM
You need to sign in to comment on this blog
<August 2009>
SuMoTuWeThFrSa
2627282930311
2345678
9101112131415
16171819202122
23242526272829
303112345
Migrating from OCS 2007 R2 to Lync: Part 4
 Having migrated the rest of our users and legacy resources across, and start getting ready to... Read more...

Automated Script-generation with Powershell and SMO
 In the first of a series of articles on automating the process of building, modifying and copying SQL... Read more...

Seth Godin: Big in the IT Business
 Seth Godin has transformed our understanding of marketing in IT. He invented the concept of 'permission... Read more...

Using SQL Test Database Unit Testing with TeamCity Continuous Integration
 With database applications, the process of test and integration can be frustratingly slow because so... Read more...

Converting String Data to XML and XML to String Data
 We all appreciate that, in general, XML documents or fragments are held in strings as text markup. In... Read more...