Click here to monitor SSC

John Magnabosco

SQL Server Development and Data Security

Exploring HashBytes

Published Monday, December 01, 2008 6:47 AM

The word "encryption" is often used to describe the process in which plain text is converted into cipher text and later transformed back into plain text for disclosing the data. For example: Cell-Level Encryption is the term used to describe the process in which a column of data within a table is protected by hashing plain text into cipher text and then returned to plain text through the use of a key or a series of keys.

Another example is Transparent Data Encryption (TDE) is the name of a feature in SQL Server 2008 in which the plain text that is stored within a data file and transaction logs are hashed into cipher text and then reverted to plain text through a series of keys prior to its use.

For the ultra-purist, the global use of the word "encryption" in this fashion is not wholly accurate. The definition of "encryption" is the process in which the plain text is converted into cipher text. The process in which a key, or series of keys are used to convert the cipher text into plain text is "decryption".

Why does this splitting of hairs in regard to the use of the word encryption matter? The real area in which this matters is when a decision in regard to the method of data security that is to be applied to your database. The bi-directional approach, which is using encryption and decryption processes, is commonly the one that comes to mind when considering encryption. There are times when encryption in general is discarded due to the key management requirements... an unfortunate situation indeed.

Consider the mono-directional approach to encryption. This approach is the hashing of plain text into cipher text without the intention of reverting the data back to plain text. In this approach key management is not required since it is not intended to be decrypted. Searching and comparison of values are accomplished by encrypting the input with the same algorithm and then comparing the input cipher text with the stored cipher text.

In SQL Server 2005 and 2008 the HashBytes function provides us with the ability to perform the mono-directional approach to encryption. The syntax in which plain text is converted into cipher text is:

HASHBYTES('SHA1','My Plain Text')

In this example above, I chose to hash my plain text with the SHA1 algorithm. There are other algorithm options available such as: MD2, MD4, MD5, SHA and SHA1. The maximum number of bytes that accepted in the input argument and that are returned when converted into cipher text is 8,000.

Much like implementing the Cell-Level Encryption methods the data type of the field in which the cipher text is stored must be varbinary. For example, the cipher text of "My Plain Text"  would be stored as 0x6D99DDF6FE7A32547B6766E0BF88B1F50835F0FF.

There are vulnerabilities in all security efforts and by nature mono-directional encryption methods are weaker than bi-directional encryption methods. The strength of bi-directional encryption is not always necessary and their key management requirements are not always desired. The consideration of utilizing the HashBytes function in your data security efforts is something to not overlook.

by Johnm

Comments

 

Jason Haley said:

December 1, 2008 9:45 AM
 

hongjun said:

There are 2 points of caution to keep in mind:

1) Remember these hashed values are subject to a dictionary attack, so this technique is useful for obscuring data, but not sufficient for securing data.

2) Using a bigint built from a truncated hash may create an unacceptable risk of a hash collision if you are working with millions of SSN values.

Using Hashing to Obscure Sensitive Data
http://blogs.msdn.com/sqlcat/archive/2005/09/16/469257.aspx

Cheers
hongjun
December 5, 2008 9:21 PM
 

John Magnabosco said:

In nature the mineral called salt is a preservative. Prior to the age of refrigeration, mankind utilized...
December 8, 2008 5:58 AM
 

John Magnabosco said:

The touch of a human finger to a keyboard begins a journey. A journey in which the data that is born...
January 1, 2009 10:44 AM
You need to sign in to comment on this blog

About Johnm

John Magnabosco manages the Data Services Group at one of the fastest growing companies in the United States. He is also a Co-Founder of the Indianapolis Professional Association for SQL Server (IndyPASS), Co-Founder of IndyTechFest, the author of the book titled "Protecting SQL Server Data" and contributing author of "SQL Server MVP Deep Dives Volume 2".
<December 2008>
SuMoTuWeThFrSa
30123456
78910111213
14151617181920
21222324252627
28293031123
45678910
Migrating from OCS 2007 R2 to Lync: Part 4
 Having migrated the rest of our users and legacy resources across, and start getting ready to... Read more...

Automated Script-generation with Powershell and SMO
 In the first of a series of articles on automating the process of building, modifying and copying SQL... Read more...

Seth Godin: Big in the IT Business
 Seth Godin has transformed our understanding of marketing in IT. He invented the concept of 'permission... Read more...

Using SQL Test Database Unit Testing with TeamCity Continuous Integration
 With database applications, the process of test and integration can be frustratingly slow because so... Read more...

Converting String Data to XML and XML to String Data
 We all appreciate that, in general, XML documents or fragments are held in strings as text markup. In... Read more...