Click here to monitor SSC

John Magnabosco

SQL Server Development and Data Security

To Bit or Not To Bit

Published Saturday, April 24, 2010 9:07 PM

'Twas a long day of troubleshooting and firefighting and now, with most of the office vacant, you face a blank scripting window to create a new table in his database. Many questions circle your mind like dirty water gurgling down the bathtub drain: "How normalized should this table be?", "Should I use an identity column?", "NVarchar or Varchar?", "Should this column be NULLABLE?", "I wonder what apple blue cheese bacon cheesecake tastes like?" Well, there are times when the mind goes it's own direction.

A Bit About Bit
At some point during your table creation efforts you will encounter the decision of whether to use the bit data type for a column. The bit data type is an integer data type that recognizes only the values of 1, 0 and NULL as valid. This data type is often utilized to store yes/no or true/false values. An example of its use would be a column called [IsGasoline] which would be intended to contain the value of 1 if the row's subject (a car) had a gasoline engine and a 0 if the subject did not have a gasoline engine.

The bit data type can even be found in some of the system tables of SQL Server. For example, the sysssispackages table in the msdb database which contains SQL Server Integration Services Package information for the packages stored in SQL Server. This table contains a column called [IsEncrypted]. A value of 1 indicates that the package has been encrypted while the value of 0 indicates that it is not.

I have learned that the most effective way to disperse the crowd that surrounds the office coffee machine is to engage into SQL Server debates. The bit data type has been one of the most reoccurring, as well as the most enjoyable, of these topics. It contains a practical side and a philosophical side.

Practical Consideration
This data type certainly has its place and is a valuable option for database design; but it is often used in situations where the answer is really not a pure true/false response. In addition, true/false values are not very informative or scalable.

Let's use the previously noted [IsGasoline] column for illustration. While on the surface it appears to be a rather simple question when evaluating a car: "Does the car have a gasoline engine?" If the person entering data is entering a row for a Jeep Liberty, the response would be a 1 since it has a gasoline engine. If the person is entering data is entering a row for a Chevrolet Volt, the response would be a 0 since it is an electric engine. What happens when a person is entering a row for the gasoline/electric hybrid Toyota Prius? Would one person's conclusion be consistent with another person's conclusion?

The argument could be made that the current intent for the database is to be used only for pure gasoline and pure electric engines; but this is where the scalability issue comes into play. With the use of a bit data type a database modification and data conversion would be required if the business decided to take on hybrid engines. Whereas, alternatively, if the int data type were used as a foreign key to a reference table containing the engine type options, the change to include the hybrid option would only require an entry into the reference table.

Philosophical Consideration
Since the bit data type is often used for true/false or yes/no data (also called Boolean) it presents a philosophical conundrum of what to do about the allowance of the NULL value. The inclusion of NULL in a true/false or yes/no response simply violates the logical principle of bivalence which states that "every proposition is either true or false". If NULL is not true, then it must be false. The mathematical laws of Boolean logic support this concept by stating that the only valid values of this scenario are 1 and 0.

There is another way to look at this conundrum: NULL is also considered to be the absence of a response. In other words, it is the equivalent to "undecided". Anyone who watches the news can tell you that polls always include an "undecided" option. This could be considered a valid option in the world of yes/no/dunno.

Through out all of these considerations I have discovered one absolute certainty: When you have found a person, or group of persons, who are willing to entertain a philosophical debate of the bit data type, you have found some true friends.

by Johnm

Comments

 

Jason Haley said:

Interesting Finds: April 25, 2010
April 25, 2010 10:13 AM
 

Federico said:

About NULL, it can have two meanings:
1. not (yet) known
2. not applicable (e.g. hair color for someone bald)

The former corresponds to the grey unchecked checkbox color in a GUI, the latter to a disabled (or invisible) unchecked checkbox. However, current SQL does not discriminate between the two meanings. If you're going to use boolean logic for a value where both NULLs may be applicable, then you'll have to use a 4-valued int, and there goes the bit type...

Most of the time, for applications NULL is not so useful, so it's silently converted into zero or "" or whatever the default data type value. Then why not define the column as not null with a default value? That would be saving lot of checks for NULL (and the not so rare crash because of converting NULL to any non-string type).

As to bit or not: in my opinion, a bit column should be the exception and not the common case. If you need both NULLs, a bit is not enough; or when multiple bits define a state, then they are better tested together, so an int is easier to use and conveys more information (say what you mean).
April 28, 2010 4:19 PM
 

Cheatsheet: 2010 04.26 ~ 04.30 | OOP - Object Oriented Programing said:

May 4, 2010 11:30 PM
You need to sign in to comment on this blog

About Johnm

John Magnabosco manages the Data Services Group at one of the fastest growing companies in the United States. He is also a Co-Founder of the Indianapolis Professional Association for SQL Server (IndyPASS), Co-Founder of IndyTechFest, the author of the book titled "Protecting SQL Server Data" and contributing author of "SQL Server MVP Deep Dives Volume 2".
Latest articles
Checking Out SQL Backup Pro 7’s New Automatic Backup Verification
 Wouldn't it be great to offload the daily chore of checking the integrity of your production... Read more...

Chuck Lathrope: DBA of the Day
 Chuck Lathrope was a finalist for the Exceptional DBA of the Year award in 2009. We contacted him to... Read more...

Backups, What Are They Good For?
 Pixar recently confessed, in an engaging video, that Toy Story 2 was almost lost due to a bad backup,... Read more...

C# Async: What is it, and how does it work?
 The biggest new feature in C#5 is Async, and its associated Await (contextual) keyword. Anybody who is... Read more...

SQL Server 2012 AlwaysOn
 SQL Server AlwaysOn provides a high-availability and Disaster-recovery solution for SQL Server 2012. It... Read more...