Click here to monitor SSC
Av rating:
Total votes: 60
Total comments: 15


Joe Celko
The DIS-Information Principle: A Splitting Headache
17 August 2010

You can easily re-factor bad DML code, but if a database design is wrong, you can do little to rescue the problem, even with expert queries. So what constitutes 'wrong RDBMS design? What are these errors that continually crop up? How can you recognise them and fix them? Joe embarks on a new series of articles by identifying a series of bad practices based on the habit of 'splitting' that which shouldn't be split.

In data modeling, we start with a schema that represents our "Universe of Discourse", which might be a business, for example. Within the schema, we have tables that model sets of entities and relationships. Inside the entity tables, we have columns that model the attributes of the entity. The columns contain scalar values. Tables that model relationships can have attributes, but they must have references to entities in the schema. All the information in the data model has to follow these rules, or an RDBMS does not work. This is known as Dr. Codd's Information Principle and it is the basis for the relational model.

It sounds pretty simple, neat and clean; but apparently it is harder than it looks. Beginners in any field will tend to make the same kind of errors in their data models and eventually the errors get a name. There is “attribute splitting”, “column overloading” and “EAV” (Entity-Attribute-value) to name a few Data Definition Language (DDL) errors. But the one thing that they have in common is confusing entities, attributes and values and modeling them wrongly.

People do not make random errors; they have a flawed pattern or mindset that leads to the errors. When you learn a new language, you tend to use the syntax and semantics of your old language. Let's start with an easy one:

Table Splitting

The classic form of table splitting is to take the values of an attribute and make a table for each value. The value is implied in the name of each table. The information principle dictates that each table should be the entire set of one kind of entity. The example that I pull up all the time in order to demonstrate the problem is taking “Personnel” and splitting it into “MalePersonnel” and “FemalePersonnel” tables. The split attribute is the 'sex code'. It is so extreme that anyone hearing the example can see the problems immediately.

What the symptoms of this problem?

If you are constantly finding that you are forced to do UNIONs in queries or IF-THEN-ELSE logic for Insert, Update and Delete operations, you could have a split table. Especially if the tables that are involved have the same structure.

Split tables evolve to be like amoebas and they keep splitting. Attributes tend to gain, rather than lose, values in their domain. The "MalePersonnel” and “FemalePersonnel” tables seem like the only two choices for a sex code split. But they are not. Corporations and other organizations are "lawful persons" and can show up here. If you hire a temp agency, like Manpower or Kelly Girls, to provide personnel, you will need a third table. If you cannot tell the gender of your new hire "Alex Mack", you need a gender assignment rule or a fourth table. This is so obviously silly, programmers will not do it. Well, most programmers; never underestimate stupidity.

Look at splits on temporal columns. You will often find a table for each year or month within a year. Why does this look okay? Because that is what we did with magnetic tapes! The old IBM magnetic tape labels used “YYDDD” format that led to some Y2K problems. If you get on SQL Newsgroups, you will find tables that have this or some variant as their names. You will also find tables split on geographical values, organizational units, etc. which can be traced back to a file systems of some kind that organized the data this way in the physical storage media (vertical file cabinets, mag tapes, etc).

This is not the same as partitioning a table. Partitioning is a physical storage management method; the table is on logical unit of work. The SQL engine will maintain the partitioned data and not allow duplicated or dropped data. You do not have to write extra code in order  to assemble the proper schema by hand.

Column-Splitting

Tables are not the only parts of a schema that can be split; so too can columns. The most obvious example of column-splitting is the use of BIT flags in SQL. Old COBOL and assembly language programmers love to do this because it looks like what they had in punch card file systems before RDBMS. A database is an integrated whole and we can determine a fact inside it with a predicate. Files do not work that way. Each file and each step in a sequential process was independent. One step in a process could only communicate with the next steps with a simple flag passed in the data. For example, when you deleted a record in a mag tape file, the record was not immediately destroyed. The front of each record would have an "is_deleted" flag that was set to tell the next program to skip this record.

Unfortunately, this model of data processing was carried forward into newer technology. Consider the BIT flag column is_male” in a table. In SQL Server, BIT is a numeric data type and it is therefore NULL-able, but ignore that problem for now. This looks nice, but then you run into your first corporate entity. Is the XYZ Corporation male or female? Well, neither. If you want to get ridiculous, you could try to assign gender to corporate bodies – Playboy Enterprises is male and Mary Kay Cosmetics is female.

The ISO sex codes are {0= unknown, 1= male, 2= female, 9= lawful persons}, they are standard and they make sense in a business environment. The BIT flag guy has to add another flag – “is-corporate” and now he needs a CASE expression to handle the (is_male, is_corporate) pair. There are (2^2) = 4 binary combinations, so what do we do? How about (0, 0) = unknown because all zeros look good; (1, 0) = male, not corporate; (0, 0) = not male (female?), not corporate; (1, 1) = male lawful person and (0, 1) = female lawful person? None of the flag pairs makes sense! Now we need to add CASE logic to test for a lawful person, then look at gender if it is not a corporate unit.

This gets worse as you add more and more flags. As of mid-2010, I am scaling up a very successful auction system that currently has to find the status of an auction by looking at almost a dozen BIT flags. For example, a bid cannot be both rejected and rescinded. The business rules are not in one place. What should have been a state-change model using (prior_status, current_status) pairs to track the status is convoluted logic hidden in tens of programs. [see: http://www.simple-talk.com/content/article.aspx?article=589].

Function Splitting

The strangest splitting disaster I have seen was done with UDFs (user defined functions) as part of a reporting system in SQL Server from data taken off of an SAP system. The raw skeleton code for the report procedures looks like this (pardon the “fn_' prefixes, but it was in the original):

PROCEDURE Total_Income_Report(..)

AS

..

SELECT (fn_Equipment_Income()

     + fn_Supplies_Income()

   ..

     + fnMisc_Income() ) AS total_income

 FROM..

 WHERE..;

END;

The list of UDF calls return one source of income per function, as implied by their names. But these functions are all the same except for one line. That skeleton is:

CREATE FUNCTION fn_<Particular>_Income()

RETURNS DECIMAL (12, 4)

AS

BEGIN

SELECT account_amt

 FROM Accounts

 WHERE acct_nbr = <particular acct nbr>;-- the magic predicate!

END;

The account numbers are all the same data type and fall in a range. The whole report could be written as one SELECT statement, something like this skeleton:

SELECT acct_nbr, SUM(account_amt) AS total_income

 FROM Accounts

 WHERE acct_nbr IN (100, 200, 300, 400, 500,..)

 GROUP BY acct_nbr;

 

It should be no surprise that the original procedure took over 5 minutes to return 48 rows, and the revised version ran almost immediately. The reason given the offender gave for this style of coding was that it made the reports easier to maintain. I am not kidding. I would say that he was trying use OO programming methods on a table and does not understand declarative programming, so he was faking them in a UDF.

Column Overloading

When I have a column in a table, it should be a single attribute measured with a single scale, which explains itself. In English, this means that you do not put your hat size and shoe size in one column and have a second column or rule that tells you which attribute or scale that first number means. You have a hat size column and a separate shoe size column, each with its own scale. This design error is a split in which I have pulled the (value, scale) pair out of a column and put its parts in at least two places.

Please note that “scalar” and “atomic” are not the same things. Scalar values are one-dimensional values on a scale of some kind. I am not going to go into scales and measurements here, but you can read about it in my books Thinking in Sets and Standards in SQL where I have a few chapters on these topics.

The term “atomic” means that the data element cannot be broken up without destroying its meaning. Scalar values are automatically atomic, which is very handy. But the reverse does not hold. The pair (longitude, latitude) is two columns, but it is only one data element (location) by this definition. I could also model a location with a single Hierarchical Triangular Mesh (HTM) code or other vendor extensions for geographical data that will fit into a single column in the table. The idea of a data element is more abstract than just a physical unit of storage or column count, like records in files. I can have a virtual data element that I compute as needed and it has no existence at all.

The most common way to overload a column is make it NULL-able and then allow for the nulls to have multiple meanings. Every data type in SQL has to be NULL-able (This was Dr. Codd's rule 3: Systematic treatment of NULL values)

note: The Hierarchical Triangular Mesh is a method to subdivide the spherical surface of the globe into triangles of nearly equal shape and size. The HTM gives us a very efficient indexing method for objects localized on the sphere.

Conclusion

Splitting parts tables and columns is only one family of RDBMS design errors. There are plenty of other disasters awaiting the beginning data modeler. Please remember this little series of articles in not about bad DML. There is a lot of bad queries in the world, but they can be replaced with good ones, but DDL design errors effect everything and the best query in the world cannot save you.

The topic of EAV (Entity Attribute Value) design both so common and is so awful it needs its own article. With painfully detailed examples.

The use of 1:1 relationships is not well understood. People want to cram all the entities in the relationship into one table.  Each part only appears once, doesn't it?

Let me surprise you with the rest of this series.



This article has been viewed 7527 times.
Joe Celko

Author profile: Joe Celko

Joe Celko is one of the most widely read of all writers about SQL, and was the winner of the DBMS Magazine Reader's Choice Award four consecutive years. He is an independent consultant living in Austin, TX. He has taught SQL in the US, UK, the Nordic countries, South America and Africa.
He served 10 years on ANSI/ISO SQL Standards Committee and contributed to the SQL-89 and SQL-92 Standards.
He has written over 800 columns in the computer trade and academic press, mostly dealing with data and databases. He is the author of eight books on SQL for Morgan-Kaufmann, including the best selling SQL FOR SMARTIES.
Joe is a well-known figure on Newsgroups and Forums, and he is famous for his his dry wit. He is also interested in Science Fiction.

Search for other articles by Joe Celko

Rate this article:   Avg rating: from a total of 60 votes.


Poor

OK

Good

Great

Must read
 
Have Your Say
Do you have an opinion on this article? Then add your comment below:
You must be logged in to post to this forum

Click here to log in.


Subject: You can't be serious
Posted by: BuggyFunBunny (view profile)
Posted on: Sunday, August 22, 2010 at 10:16 AM
Message: -- What should have been a state-change model using (prior_status, current_status) pairs to track the status is convoluted logic hidden in tens of programs.

That's outrageous. People knew enough to write order entry systems (request, order, invoice) intelligently decades ago. Why do I suspect this system was built by 20 something web kiddies?


-- The topic of EAV (Entity Attribute Value) design both so common and is so awful it needs its own article. With painfully detailed examples.

I can't wait. Did you boil the offenders in oil?

Subject: bit flags
Posted by: Anonymous (not signed in)
Posted on: Monday, August 23, 2010 at 12:24 AM
Message: I have to admit, I like the use of nullable bit-flags for sex-determination. An employee is male, female or the information is not known or is not relevant. For a temp-agency, the sex is not relevant and a null-value serves nicely there.

Subject: A splitting headache
Posted by: rosscecil (view profile)
Posted on: Monday, August 23, 2010 at 7:08 AM
Message: It's not that people don't care, it's just that they're under a lot of pressure to "get it done" and don't have the time to either learn to do it right or to do it right!

My experience in over 40 years of writing code for pay is that you need to solve the problem about three times before you realize what the first solution should have been.

I've used bit flags and other sub-optimal storage techniques. I've probably made every mistake possible. Which is why I keep reading, experimenting and learning.

I can wait for the rest of the series!

Subject: There's a reason EAV is so common
Posted by: Phileosophos (not signed in)
Posted on: Monday, August 23, 2010 at 6:39 PM
Message: To wit, because so many systems need to be able to define properties at run-time and not compile time, yet using anything but an EAV approach in the database pretty much ties your hands to compile time. I'll readily agree that EAV schemes are awful, but I've yet to see anybody suggest something that works better for such very common problems. I'll be very interested to read what you have to say about it.

Subject: Our ears is all yours
Posted by: "The DB-pragmat" (not signed in)
Posted on: Tuesday, August 24, 2010 at 11:36 AM
Message: Please elaborate more into: The use of 1:1 relationships is not well understood....
In cases where the number of different kinds of entities are limited, why not use the 1:1 relationship to extend the "base" table ?
When should other approaches come into play ?

Subject: EAV, also 1:1
Posted by: Tony Rogerson (not signed in)
Posted on: Thursday, August 26, 2010 at 12:42 AM
Message: Hi Phileosophos, this is where the NoSQL products are gaining popularity, because a database schema (as defined in SQL rather than the relational model) is fixed, it's not dynamic. Real life isn't fixed, documents have tags added, they have properties added, there are multitude of reasons why new columns (or fields) need to be added. SQL is bad at that. NoSQL products aren't.

In terms of a 1:1 - you can't easily do that in SQL Server, you need deferred contraints in order to properly achieve that in DDL, of the main stream products - Oracle has them, DB2 doesn't, we don't.

It's all well and good saying attribute splitting is a design floor in relational design theory but if that relational design theory cannot keep up with what the world needs then that legacy 25 year theory will eventually go - hence the NoSQL movement. Note, I'm a true advocate of relational theory, but SQL - only as a language that has now been established - the language itself is terrible.

Tony.

Subject: EAV is Evil
Posted by: BuggyFunBunny (view profile)
Posted on: Thursday, August 26, 2010 at 11:34 AM
Message: Tony:

These are the excuses of the weak minded NoSql crowd; I'm not speaking of you, just to be clear.

There is No Free Lunch with the Document Tag Adding paradigm. A database is not a Document, nor a collection of Documents. The NoSql crowd blissfully ignore the distinction. (I've had to deal with xml definitions which *attempt* relational semantics by stuffing the entirety of related data into each and every "document"; phooey.)

Adding Documents or Tags requires re-writing each piece of code which processes a Document which has been modified. IOW, coders have to write an ACID accurate infrastructure around their Documents, if they care about consistent data, many don't. And each group goes off and does that; it's a Permanent Employment Paradigm for them. It's woefully inefficient, at a proper level of observation. NoSql folks either never took a DB course, or weren't smart enough to to earn a B grade; there is some maths to it, after all. The xml crowd owe their heritage to LAWYERS, for crying out loud; verbose and empty of meaning.

Intelligent database design/coding is transparent to schema add modification; if done with brain engaged. Schema deletions are more intrusive, but less so if one isolates data modification to database processes, rather than client code. It's the loss of client code bloat which has motivated the anti-RDBMS crowd.

Relational databases, even sql ones, are perfectly capable of serial modification; the Rails crowd, to name one, built an infrastructure to do that. Others have too.

I've always put this question to the anti-RDBMS crowds: demonstrate that your alternative is superior in data consistency, data parsimony, and code parsimony. Never gotten an answer.

Subject: There is also database spliting...
Posted by: Anonymous (not signed in)
Posted on: Friday, August 27, 2010 at 10:40 AM
Message: What to say about 'database splitting', which we often see in form of OrderEntry_2007, OrderEntry_2008 as databases (not data marts, real transactional databases)?. The reason is not physical partitioning, no, it's because 'not having the same columns in some tables each year'. It results in new set of web applications each year. tell me about job security :-)

Also, a nice example of column overloading I see in Canada every day. Being bilingual country, everything must be in English and French. So many tables have attribute called Lang. One would expect to see two values, 'E' and 'F'. Nope, I have seen tables with 3 values: 'E','F' and 'B'. 'B' stands for 'both E and F'. And people don't understand what is wrong with that....

It's good that I have only few more years before retirement...

Subject: Multirelations and MVA's
Posted by: Tony Rogerson (not signed in)
Posted on: Saturday, August 28, 2010 at 12:18 AM
Message: Hi BuggyFunBunny,

Good points.

The real problem here I see is SQL itself, people design for SQL rather than design for the relational model. If you have a look at what Hugh Darwen and Chris Date are doing with "Multirelations" and "Multivalued Attributes" you will see that the relational model can deal with the design issues a lot of the NoSQL bitch about.

But Codd rules talk about Durability and stuff and that's where it all goes pair shaped, so for me the relational model itself is sound, Codd's rules are for a relational database management system but why not apply relational theory without ACID - best of both worlds then, we are no longer constrained by the legacy model most mainstream databases use - logs; there is a really good paper by Stonebreaker on this.

Tony.

Subject: Stonebraker?
Posted by: BuggyFunBunny (view profile)
Posted on: Saturday, August 28, 2010 at 12:56 PM
Message: May be not widely, but certainly to an extent, is viewed (by me, of course) as apostate; seeking to get rich with whatever Suit of New Clothes he can devise. While Ingress/Postgres were groundbreaking, both his column store and volt are less than convincing.

And, alas, when I read Date's current edition's "Dropping Acid" section, I reached the same conclusion about him. In my mind, the relational model *is* ACID; rules 0 to 12 exist to enforce it in a concrete way. The NoSQL/etc. folks are merely seeking cover for their desire to dispense with a transaction manager external to their application code, and to go back to the COBOL way (pre-CICS, at that) of all data control in siloed applications.

It was that, and the failure of IMS to improve matters, which motivated Codd to devise the RM. What he understood was that the RM, to be implemented, needed not just the structures called relations, but a transaction manager integral to the datastore. IMS and IDMS had the manager, but not a flexible datastore. It is the flexibility of the RM that gets subverted by the NoSql/xml/etc. crowd, since their datastores amount to IMS/hierarchical implementations. Such datastores require more or less extensive reprogramming every time the structure is modified (a boon to coders, but no one else), and are efficient only for the access path embedded in the hierarchy. This last is the rat in the woodpile its proponents seek to keep hidden. And it is the rat Dr. Codd sought to march out of IT.

Subject: A puzzle about Column overloading vs Column Splitting
Posted by: Eric Fung (view profile)
Posted on: Wednesday, September 08, 2010 at 1:19 AM
Message: I confess, my background is a COBOL programmer before starting on RDBMS platform. So it may affect my mindset. However regards the concept "Column Overloading" on the article. I did puzzle for such a long time for below situation.

While working on a project involve inventory, my Analysis found the "inventory movement" (i.e. inventory level increment / decrement) may be cause due to 5 cause, Purchase Order, Sales Order, Return of Purchase, Return of Sales, Inventory Recount. User would like to know from Month to Month What cause the inventory level change. (e.g. 2 times of Sales of an item + 3 times of purchase + 1 return of purchase + 1 inventory recount resulted in inventory level increase by 110 items). My COBOL background will suggest me to design the table of invetory movement, with 2 colums - "Source of change" which store any one of the 5 possible cauases stated above, and another column being *** "Reference no.", which store the source document reference no.

I know that the column "Reference no." did violate Relational model concept, its contents come from more than one domains. The 2 "paired column" is what I think being described on "Column Overloading" section. However, if I change the style to use one colume for each source "Purchase Order No.", "Sales Order No.", "Purcahse Return No." ..... it will end up likely described as "Column-Splitting" in the article.

I would really hope anyone to give me hints on how would they approach such a common situation.

Subject: Flaws
Posted by: Hugo Kornelis (view profile)
Posted on: Wednesday, September 08, 2010 at 2:40 AM
Message: In general, the article is good and the warnings are fair. But there are several major flaws, that distract from the important stuff. And several of them boil down to the problem that data modelling is far more complicated than Joe makes it look here. What is correct or not often depends on the situation; there is no single correct answer.

TABLE SPLITTING: This is indeed a much too common problem. But the example Joe uses (MalePersonnel / FemalePersonnel) is flawed. In many situations, attributes need to be stored that are applicable to males or females only. Like beard length (male) of number of pregnancies (females). They can be stored as nullable columns in the generic Personnel table, but they can also be stored in subtables for male and female personnel. (And for the anti-NULL crowd, this is in fact the only option).
The examples with regions or dates in the table name are much better - though these are often not a result of bad design, but of missing support for table partitioning in previous versions of SQL Server.

COLUMN SPLITTING: Joe makes it sound as if column splitting and using bt flags are synonyms. This is not true. I've seen designs where the date was stored as three integers (day, month, year) instead of a single datetime column - column splitting wihtout bit flags. I have also seen perfectly designed systems that used bit columns in an appropriate manner - for unlike Joe, who is on a Holy Crusade against bit columns, I do not see this data type as an error in itself.

That becomes painfully clear when Joe explicitly ignores the nullability of the bit column is_male. Of course he does - he has to, for the example he uses to show the alleged absurdity of this column falls flat on its face when you don't exclude the NULL option. The is_male column is 1 for males, 0 for females, and NULL in all cases where gender is not applicable, not known, or for whatever other reason not available for being stored in the database. (But note that I do not endorse this design; the ISO sex code is indeed a much better way to store gender).

The "is_deleted" column is another example that could be used without problems, for instance when deleted data has to be kept for legal purposes. It is not the only option, of course. One could also move the deleted data to a seperate "Archive" data (though that reeks of table splitting!). And when it is important to know WHEN the data was deleted, then using a nullable "deleted_date" column is sufficient and no "is_deleted" bit column is required. But in other cases, "is_deleted" is a fair solution. (And unlike the is_male column, the is_deleted should be declared NOT NULL).

FUNCTION SPLITTING: Joe should have left this for another article, on bad DML. If there is bad DDL involved at all (which I doubt), then surely the query rewrite Joe implemented doesn't solve it.

COLUMN OVERLOADING: The example here, again, depends on the business requirement. An online clothes store has different requirements than a tailor. An online store wants to display an article and a list of available sizes. When I order, I pick one of the available sizes. This will be a hat size for a hat, a shoe size for a shoe, a shirt size for a shirt, etc. If I were to design a database for this online store, I would never dream of using seperate columns for hat size, shoe size, and shirt size. A design like that would require schema changes whenever a new line of articles is added to the catalogue. Socks and bras can not be sold, until a developer changes the schema and the frontend code to add sock size and bra size. The name of this design error is "Attribute splitting".
For Joe's tailor, I would indeed design a table with seperate columns for shoe size, waist size, length, chest size, and head (not hat!) size.

And finally (whew!) a comment on one of the last sentences in the article: "The most common way to overload a column is make it NULL-able and then allow for the nulls to have multiple meanings".
Why did you not write: "The most common way to overload a column is allow for some value to have multiple meanings"? This problem is not limited to nulls; I've also seen designs where non-null values have multiple meanings. Like a single char(1) column that is used to store gender for natural persons, and orgnaization form for legal persons.

Subject: A puzzle about Column overloading vs Column Splitting
Posted by: Hugo Kornelis (view profile)
Posted on: Wednesday, September 08, 2010 at 2:45 AM
Message: Hi Eric,

If I understand your requirements correctly, each inventory movement can be caused by one of five possible causes, and each inventory movement is related to one of five possible document types.

If these document types are completely distinct, I'd do without the "source of change" column, but include 5 columns for the referenced document. Add a CHECK constraint that exactly one of them is not NULL and all others are NULL. The source of change can be deducted by checking which of the five documents references is populated.

If the document types are partly similar, I'd consider using a single table for the generic information of all documents, with subtables for the attributes that are specific to a type of document. Now the inventory movement can reference the "document" supertable. That table should also include an attribute for the type of document, so again there is no need for the source of change in the inventory movement table.

I hope this helps!

Subject: Less obvious examples
Posted by: terrysycamore (view profile)
Posted on: Wednesday, September 08, 2010 at 8:52 AM
Message: Yes, the article is good and points out some real flaws in database design. It uses examples that to me are obvious. But, looking at some of the comments, maybe they are not quite so obvious.

But the decision to split or combine can get more subtle. A slightly less obvious example than MalePersonnel and FemalePersonnel is Invoices and Payments.

These are frequently represented in separate tables, but there are significant advantages in having a single table: FinancialTransactions. You can then include Adjustments and Credits and calculate a balance from querying a single table.

Of course, there are attributes that are unique to one or the other of these transaction types. These can be handled with nullables columns, but if there are very many of them, a separate table with a 1:1 relationship to the main table does the job. Perhaps this is where you were going...

Subject: Thanks Hugo !
Posted by: Eric Fung (view profile)
Posted on: Friday, September 10, 2010 at 8:23 PM
Message: Hi Hugo Kornelis,

Would like to thank you for willing to share your point of view with me !

 










Phil Factor
Automated Script-generation with Powershell and SMO
 In the first of a series of articles on automating the process of building, modifying and copying SQL Server... Read more...



 View the blog
Converting String Data to XML and XML to String Data
 We all appreciate that, in general, XML documents or fragments are held in strings as text markup. In... Read more...

SQL Source Control: The Development Story
 Often, there is a huge difference between software being easy to use, and easy to develop. When your... Read more...

How to Import Data from HTML pages
 It turns out that there are plenty of ways to get data into SQL Server from websites, whether the data... Read more...

SQL Scripts Manager: An Appreciation
 SQL Scripts Manager is Simple-Talk's present to its readers. William Brewer was an enthusiastic... Read more...

Hosted Team Foundation Server 2010 Review
 Team Foundation Server (TFS) has expanded its remit to support the whole software development process,... Read more...

Beginning SQL Server 2005 Reporting Services Part 1
 Steve Joubert begins an in-depth tour of SQL Server 2005 Reporting Services with a step-by-step guide... Read more...

Ten Common Database Design Mistakes
 Database design and implementation is the cornerstone of any data centric project (read 99.9% of... Read more...

Reading and Writing Files in SQL Server using T-SQL
 SQL Server provides several "standard" techniques by which to read and write to files but, just... Read more...

Beginning SQL Server 2005 Reporting Services Part 2
 Continuing his in-depth tour of SQL Server 2005 Reporting Services, Steve Joubert demonstrates the most... Read more...

Creating CSV Files Using BCP and Stored Procedures
 Nigel Rivett demonstrates some core techniques for extracting SQL Server data into CSV files, focussing... Read more...

Over 400,000 Microsoft professionals subscribe to the Simple-Talk technical journal. Join today, it's fast, simple, free and secure.

Join Simple Talk