Tony Davis

Simple-Talk Editor
News, views and good brews

That ain't a database, it's a spreadsheet!

Published Tuesday, April 01, 2008 9:21 AM

"That ain't a database, it's a spreadsheet!"

From the Sayings of Phil Factor

 

There is a world of difference between an enterprise-level relational database and a 'repository of persistent data'. Until you've had the experience of dealing with a high-volume, high-transaction database with large amounts of data, the truth of this doesn't really hit home.

 

The constant friction between the relational and the object-oriented model is due to a misunderstanding. Something that works well in the small scale doesn't necessarily cut it at the enterprise level. Programmers tend not to appreciate the importance of indexes, stored procedures, referential integrity, constraints, or even normalisation, until they have experienced an enterprise-scale database. DBAs, in turn, find it hard to repeatedly explain the reasons. The result is often a guerrilla war between the database developers and application developers.

 

By the same token, where a database is unlikely to become large, or have challenging performance requirements, there is a lot to be said for techniques such as Object-Relational Mapping using Entity Framework or Hibernate, and the use of XML. Anything that can reduce the labour of application development should be considered with an open mind.

 

But it has to be tested.

 

From the very first stages of designing a development project, you must test your proposed data-handling architecture against the worst buffeting that a production system can experience, with the data volumes and characteristics you can expect. The interface between the data and application layer has to withstand the stress of real data volumes. It has to withstand attempts at intrusion or malicious damage. There is no substitute for this process, in terms of uncovering any issues with the design of your application.

 

This week, Red Gate launches SQL Data Generator, which we hope will help a lot with this sort of work. The tool can fill a database with enough data to expose any weaknesses in the design of a database, and the application that uses it. The team have worked hard to make sure the data generated is as close to reality as possible, in order to avoid those bugs that come from developers making assumptions.

 

So, if you're working with an enterprise scale database, and are tempted to introduce one of the "latest and greatest" enterprise technologies to your application – step forward Object-Relational Mapping, XML Columns, and Entity-Attribute-Value modelling, to name but three –then we strongly suggest you test your solution thoroughly, against millions of database rows and high transaction volumes. You may find it a sobering experience.

 

Perhaps the great divide between the object and relational cultures is there for a reason. What do you think? Add a comment to my blog, and the best contribution will receive a $50 Amazon voucher.

 

The winner of the voucher for their contribution to my previous "Not the right place" editorial is TadRichard.

 

Cheers,

 

Tony.

Comments

 

digory said:

>> There is a world of difference between an enterprise-level relational
>> database and a 'repository of persistent data'

Yep, 1 requires:
* a technical writer
* a project manager
* 4 developers
* 2 testers
* a DBA
* a team lead

This structure requires 3 months envisioning time and then 3 months to get 'traction' before it finally produces something 6 months later that the business ultimately doesn't need and won't use.

The other requires:
* a business power-user
* Excel

This structure requires about a week to produce something that includes:
* Collaboration
* Graphs
* Document-Centric
* Printable
* Email-friendly

:)
April 2, 2008 5:08 AM
 

paschott said:

I'd say there's definitely a place for both.  I've taken sub-sets of our database and filtered it down into a pivot-table on a spreadsheet for our end-users.  Add some auto-refresh of the data and they were relatively content.

Digory, you left out the next steps:
 Someone sees the spreadsheet and thinks that it's cool so everyone needs to use it.  You "upsize" it to Access and that works for the small group of people using it until the CEO sees it and wants to push it out to the entire organization, with a few "tweaks".  Of course, Access doesn't scale well at that level and there are performance issues so you (the power user) hack it up as best you can until you realize you're over your head and you then have to hire:
* a technical writer
* a project manager
* 4 developers
* 2 testers
* a DBA
* a team lead

:-D  (I've been there on both sides.  Also agree that if you start without a vision, you'll typically get what you plan for.)


As noted, I've seen cases for both. For small data sets, using a non-relational data storage type makes a lot of sense.  However, once you get to a multi-user scenario, you're probably going to need something more robust.


As for data generators, my biggest problem when using something like this in the past was the limited amount of data available.  That resulted in the same 50 last names and 50 first names in all sorts of combinations being used in the DB.  Performance worked one way in our test environment, but going to production was a completely different beast because it used real data.  I remember the SQL 2005 launch demos with the "scrubbed" data and was really hoping that MS could share some of the code they used to do that.  We don't have as much need for data generation at this point as for a good data scrubbing operation to make the data realistic, but not real.
April 2, 2008 8:27 AM
 

ASPInsiders said:

"A Weekend Read" is my weekly link list of interesting blog articles that I've seen throughout
April 6, 2008 12:43 AM
 

Steven said:

An enterprise-level relational database is just a particular type of 'repository of persistent data': one with high performance requirements. The friction between the relational and the object-oriented model is not just a misunderstanding: it is because they are different models.  This is the point of Object-Relational Mapping: to allow you to use an object oriented model in your application without compromising the relational model.  A query on a primary key is just as efficient if it is generated by an ORM tool as if it is hand written. By relieving the developer of the tedious chore of translating between the models, it leaves the developer with more time to work on the relational model, and guards against mistakes and short cuts. So ORM solutions are just as relevant to enterprise-level system as to smaller systems, if not more so.

I would have thought it a good idea to test your solution thoroughly, against millions of database rows and high transaction volumes, whether it uses ORM or is all painstakingly hand-coded
April 10, 2008 5:53 AM
You need to sign in to comment on this blog

















<April 2008>
SuMoTuWeThFrSa
303112345
6789101112
13141516171819
20212223242526
27282930123
45678910
Identity Columns
 When Nigel Rivett takes us on a tour of the apparently innocuous subject of Identity Columns in TSQL,... Read more...

The Why and How of .NET Profiling
 Amirthalingam Prasanna gives a simple and practical guide about why you need to profile your .NET... Read more...

Execution Plan Basics
 Every day, out in the various discussion boards devoted to Microsoft SQL Server, the same types of... Read more...

NET Performance Cribsheet
 Robyn and Phil tackle the topic of how to make .NET applications perform well. As usual, they try to... Read more...

SQL Code Layout and Beautification
 William Brewer takes a look at the whole topic of SQL Code layout and beautification, an important... Read more...