Phil Factor's Phrenetic Phoughts

Simple-Talk columnist
The wilder shores of Transact SQL    Phil on Twitter   Phil on SQL Server Central  Phil on BOS

When the fever is over, and one's work is done.

Published Thursday, May 29, 2008 10:17 AM

There comes a time in the life of anyone working with databases when they are struck with a peculiar mental aberration. For want of a better term, it is usually called ‘One True Lookup syndrome’ or OTLS. It is in the same league as the rather more severe affliction ‘EAV disease’ (Entity Attribute Value). It is like measles or Mumps in that it afflicts the sufferer just once and he, or she, is thereafter immune. The memory of the suffering and delirium is retained, however, and the feelings of embarrassment.

The sufferer is easy to recognise. He shakes hid head in wonder, saying ‘This is like such a cool idea, how come nobody has thought of it before, It is like wow! I mean it is so neat’. A classic symptom is that they will sidle up to you to ask how to patent software ideas. We grey-muzzles will shake our heads sadly, with a pitying smile, and just mutter solemnly ‘One true lookup table’.

OTLT syndrome comes about usually when one is forced to create a number of simple lookup tables which contain a code and a name. One gets the sudden flash of inspiration that one could roll them all into one lookup table, with a unique identifier for each row, and a type field, thereby saving the chore of creating and maintaining a number of tables. EAV disease, like OTLT syndrome both tend to be the result of pressure from programmers to push their OO designs back into the relational database, like a bathyscaphe in an alien environment . There is loose talk of ‘persisting object data in a data store’, as though one was storing apples in a shed. It also comes about when a development team attempts an ‘Agile’ development before they’ve fully understood the data model.

I remember the first symptoms well. You are under stress: in my case it was a systems analyst whose analysis skills were at about the same level as a dead sheep’s. The database was for an internet system for car dealerships, selling second-hand cars. It was supposed to value your stock, and to enable groups of dealers to supply a punter with the exact car he was looking for. The Systems analyst was making very heavy weather of understanding the business processes of the Dealers. He kept changing his mind, yet wanted a system that he could demonstrate to the dealers. It suddenly occurred to me that, instead of having a normalised and rational data model, one could store the ‘soft’ attributes of each object in a single table, the ‘Entity Attribute Value table’. One could then accommodate a changing schema without changing the structure of any table.

It all seems so easy. You have an ‘object’ table that stores the metadata of each object stored. If you are severely stricken, you make this hierarchical, so that you can implement inheritance. A swift lookup will tell you what attributes there are for any particular object and what their data type is. If you are heavily into Dynamic SQL, then you’d add rules, constraints and defaults.

You then have one instance table that is shared by all objects. The value is stored as a string, just as it is in SQLite. When a new object is created, it receives a unique identifier, and its attributes are written into this table

With this system, you can make changes to your data structures ‘on the fly’ without any apparent need for a-priori database design. You then will wish to record the relationships between your objects. I popped in a relationship table. One could then do fairly complex queries with remarkable ease. In my case, it would be ‘how many different models of car were manufactured by Chrysler in 2003’? , what is the current value of all the cars on the various forecourts?’ or ‘What Ford estate cars are currently for sale on any of the forecourts.’

I loaded the system with all the current data from ‘Glass’s guide’, so that I had all the necessary information for all the cars that had been in volume production for the past twenty years. Everything worked fine. I created test data by inventing several subscribing dealerships, and filled their imaginary forecourt with many imaginary cars. I then created an application interface based on stored procedures. I felt really pleased.

The application programmer who’d been assigned the task of creating the Internet site hated the interface. He was used to direct SQL access to tables via Cold Fusion. His first step was to create a scrollable listbox containing the details of all the cars on the forecourts. At first, he claimed that Cold Fusion didn’t support stored procedures: then he decided that my stored procedure wouldn’t work. After I’d tired of trying to explain how to go about it, I created a view for each object, created dynamically from the Object table every time a change was made. He was much happier then but decided he wanted to update the ‘Table’. The view wasn’t updateable. I wrote a stored procedure and showed him how to use it. He began to feel resentful that his programming knowledge had been ‘shown up’ and began to drop hints around the office that the database was no good.

For the Systems Analyst, who was struggling to explain the slow progress he’d made in coming up with the business and process model, the sudden rumour of problems with the database came as a sudden unexpected lifebelt. When pestered by the project manager, he put on his best ‘Mr Sincerity’ face and reported that it would have all been finished had it not been for the confusion caused by Phil Factor’s wild and wacky database design.

Things were getting difficult. A string of anxious managers demanded that I explain the database design to them. As they had no grounding in either object methodology or relational databases, this was always going to be a doomed mission.

It must have been around the time that I devised a meta-language to describe the objects and their relationships, that I had a sneaking suspicion that all I’d done was to write a database system in SQL Server. My tables were beginning to look eerily like the system tables in SQL Server.

Another problem loomed. Checks, constraints and rules provide the bedrock of maintaining the integrity of data. It is really tempting fate to say that, if there is an interface based on stored procedures, then bad data never gets into the system. One law of Relational Databases is that, if you put in all the necessary rules, checks and constraints, you never see bad data, and if you don’t, then it somehow insinuates itself in by some supernatural means. If you have mixed entities in one table, then the business of implementing these checks becomes difficult.

I was just reaching the point of crisis with the database, when I was facing the decision point of replacing my beautiful, elegant, complex model with something more conventional, that my guardian angel caused a buyer to appear who wished to purchase the system. My database had served its purpose, which was to provide a slick demonstration of a system before the hard work of understanding the real business processes had been done. The vision opened the purse-strings, and I bade farewell to my splendid database system. I did not pine for it, since the fever of OTLT and EAV had passed.

Comments

 

marmot4 said:

I love a good "I sucked but then I learned" story.
May 29, 2008 4:02 PM
 

Phil Factor said:

Blimey. Is that what it is?  The moral of the story could be that this sort of database design makes it easier to write 'smoke and mirrors' demos. The demonstration version of the system looked great to anyone who wasn't actually a car dealer, and we found a buyer for the system in direct consequence.
I've since used a similar, but simpler,  EAV device for another system for exactly the same reason- insufficient prior analysis of the system, but purely for development.  I  have always been careful to swap it out just as soon as the requirements firmed-up. I've never been near a OTLT since.
May 30, 2008 4:35 AM
 

marmot4 said:

Ah, I definitely came away with the wrong moral.

What I took away from it was, "I never should have done this and you shouldn't either."  But what you wanted me to take was more, "In this circumstance this was a good solution, but not for the long term."
May 30, 2008 9:59 AM
 

Jerome said:

I reckon the moral is 'there ain't no silver bullet'. I agree with Phil that trying to defeat the natural style of a relational database is fine for occasional use, but you build up trouble in the long term. EAV techniques are the curse of those who have to maintain databases.
May 30, 2008 10:51 AM
 

TomC said:

I don't think there is ever a good use for 'One True Lookup' tables. EAV tables are used to avoid very wide, sparsely populated tables such as you get in the pharmaceutical industry. SQL 2008 has its own solution for this.
I still can't see the advantage of EAV tables. how do you set defaults, check for nullability,   check for valid data? -and then there's relational integrity....
On this site, you have http://www.simple-talk.com/opinion/opinion-pieces/bad-carma/ which is a good moral tale of what happens when you get obsessed by a particular way of doing things with databases. What I like about Phil's story is that there is no clear moral!
May 30, 2008 12:30 PM
 

mcair said:

"EAV tables are used to avoid very wide, sparsely populated tables such as you get in the pharmaceutical industry. SQL 2008 has its own solution for this. "

Please elaborate - I haven't seen anything in SQL2008 that addresses this problem. Thanks
June 3, 2008 7:29 AM
 

thomas_p said:

SQL Server 2008 has 'sparse columns' which are intended to address this problem.
June 3, 2008 8:34 AM
 

Anand said:

I have two entities in the same table (Customers & Vendors), but they have similar attributes and checks. Can I keep them at they are, it looked cool when i modeled it.

Err.. I better change it immediately.
June 3, 2008 9:58 AM
 

Anand said:

Or should I not :P
June 3, 2008 10:04 AM
 

Phil Factor said:

Re: Anand

Customers and vendors both have addresses. This is probably why you.ve put them in the same table. Bad reason- as they are probably different entities. Vendors don't normally have a gender, for a start. All you need to do is to have an address table  and than you can cheefully keep customers and vandors separate. Why bother? They will probably have different rules, checks and constraints. The business relationship is probably different

Three tables instead of one? Worried? just pop a million customers into  both designs (SQL Data Generator?) and measure and measure the difference. No, I couldn't either.
June 4, 2008 7:29 AM
 

callcopse said:

TomC:
I once created a system intended for multiple vendors of a particular product to hawk their wares, including elements of CRM etc etc. Each vendor however used different attributes to describe their (expensive) products, and customer were matched personally to those attibutes through whatever elaborate process.

In this case EAV was a pretty useful pattern, and I think the right one. If you are in control of all interaction with the database then you can put checks into place through a logic tier or similar, and relational integrity is not really an issue. It can be a tad messy, as I discovered, but there was no other way to go and it did all work beautifully.
June 4, 2008 11:13 AM
 

Anand said:

Thanks Master Phil
June 4, 2008 12:00 PM
 

DavidBSQL said:

I think the company that I am working for are the ones who bought this. : )
June 25, 2008 2:08 PM
 

orcus said:

OTLT and EAV can fester if left from demo system into a production volume scenario.  
If the app dev doesn't wear a volume hat they wont be worried until it been collecting data for some time.

The argument of convincing the client to actually PAY for a db design (and codebase refactor) after they THINK that it has all ready been done can be a headbangingly concrete example of the impact.  

July 24, 2008 8:05 PM
 

piers7 said:

" I had a sneaking suspicion that all I’d done was to write a database system in SQL Server"

Been there. Done that. My thoughts exactly. In retrospect it seemed so obvious...
July 28, 2008 7:11 AM
 

patrick.eric said:

EAV and OLTP moral might be more than just assistance with smoke and mirrors.  It allows rapid prototyping, and if used well, enables business owners to refine their process model to better leverage technology.  I find many business owners need to "touch" a system to get an intuitive grasp of what technology might do for their business.

Perhaps they should not need such assistance, but what's the use of pointing that out?
July 28, 2008 8:05 AM
 

kraxmo said:

EAV requires managed (read: enforced control) of the DML interface and data content reviews; otherwise, SQL access code maintenance will become unmanageable, data will bloat more than with the non-EAV design and system performance will tank.

There is no good reason to use an OTLT; it is just a problem waiting to happen. Values become obsolete over time(or worse, used for different purposes than intended), detecting what code uses the generic lookup values is impossible and data redundancy occurs due to either lack of checking for usable existing values, fear of reusage crashing something unknown or the "I want my lookup code to be meaningful" syndrome.
August 20, 2008 1:20 PM
You need to sign in to comment on this blog


















<May 2008>
SuMoTuWeThFrSa
27282930123
45678910
11121314151617
18192021222324
25262728293031
1234567
Finding Stuff in SQL Server Database DDL
 You'd have thought that nothing would be easier than using SQL Server Management Studio (SSMS) for... Read more...

Mission Critical: SQL Server 2008 Performance Tuning Task List
 In which Buck Woody imagines how the US military would have tackled DBA checklists for... Read more...

Simple Query tuning with STATISTICS IO and Execution plans
 A great deal can be gleaned from the use of the STATISTICS IO and the execution plan, when you are... Read more...

Switching rows and columns in SQL
 When they use SQL Server, one the commoner questions that Ms Access programmers ask is 'Where's the... Read more...

Writing Efficient SQL: Set-Based Speed Phreakery
 Phil Factor's SQL Speed Phreak challenge is an event where coders battle to produce the fastest code to... Read more...