Click here to monitor SSC

SQL Server Development and Data Security

Encryption Alternatives

Published 10 October 2008 8:24 am

Databases often contain sensitive information and cell-level encryption is a very effective method to protect this information from those who should not have access to it. Although, encryption is not without its challenges. There is a performance cost. There also is additional administrative tasks required with its introduction into the database. That is why there are some instances where encryption is restricted, by the powers that be, from being utilized.

When the DBA or Developer finds themselves faced with the requirement of protecting sensitive data while not being able to utilize encryption they must seek alternatives to this feature. There are many alternatives available and some, when appropriately implemented, can be as effective as encryption itself. Below are a few examples of these alternatives:

Store Only The Data You Need
In this day and age storage media is relatively inexpensive. This is a good thing, but it does enable lax storage strategies. When determining the schema of your table consider the use of the columns that are being added, especially ones that contain sensitive data. Try to avoid the "We may need it someday" mind set.

For example: The Social Security Number is an identifying number provided by the Federal Government for individuals in the United States. This piece of data is often found in tables that contain other information about an individual. If this piece of data is not being used functionally by the applications that access the database its inclusion in the schema should be reconsidered and challenged.

If it is determined that the sensitive data must be stored in the table, consider only storing a portion of the data rather than the entire data. For example: If you must store the Social Security Number, storing the last four digits may suffice for its use in your database.

Masking Plain Text 
We are all familiar with the standard method of displaying the text of a password as it is being typed into a text box; all letters are replaced with an asterisk (*). Most are also familiar with the masking methods used on a credit card receipt: the last four digits of the credit card number are displayed in plain text while the remainder of the number are replaced with an asterisk.

These two examples demonstrate how sensitive data within the database is hidden at the point of disclosure from the eyes of those who should not see it. While this approach is very effective it does not necessarily suggest that the sensitive data that is stored in the database is similarly masked. According to PCI Compliance Standards, Requirement 3.4, credit card numbers (PAN) cannot be stored in a database in its entirety. Masking this information within your database is a valid means of meeting these requirements.

When implementing masking, please remember that the storage of data that is entirely masked is rather useless since it cannot be read or searched by anyone regardless of the security level. Determining the extent of masking that is to be applied to the sensitive data is a fine balance between maintaining the value of the data for authorized users while rendering it useless to those who are unauthorized for its use.

Coding Data
Utilizing numeric or alphanumeric values to represent the real values of data is often a practice that can be found in normalization efforts. Placing a value of "16" to represent the State of Indiana or "01" for the State of Delaware is a very efficient means of storing this data. It also allows for the represented text to change without having to modify all of the rows that are associated with that value.

This same practice can be utilized to protect sensitive data. Consider a financial planner who maintains records of their client’s financial history in a database. Knowing that certain aspects of their information is sensitive the financial planner might store the value of "1250" to indicate that his client has declared bankruptcy or "3427" to indicate the beneficiary of the client’s assets. Without access to the data that provides the interpretation of this code, the sensitive data remains a mystery.

Hashing
SQL Server offers the HashBytes method. This method uses a specified algorithm to turn plain text into a binary value that then can be stored in the database masking the sensitive data it represents. The specific syntax of using the HashBytes method can be found at:
http://msdn.microsoft.com/en-us/library/ms174415.aspx

There is a fine line between hashing data and encrypting data. Encrypting data involves the process turning plain text into cipher text as well as decrypting the data back into plain text when needed. Hashing converts the plain text into a binary value and is only compared with other hashed values to produce validation of plain text; thus removing the key management aspect of encryption as well as a portion of the performance affects.

Here is a great blog entry that talks about using the HashBytes method to secure sensitive data:
http://blogs.msdn.com/sqlcat/archive/2005/09/16/469257.aspx

While these options have their own benefits and vulnerabilities they do offer some creative approaches to protecting the sensitive data that has been entrusted to your organization.

6 Responses to “Encryption Alternatives”

  1. RDW2 says:

    Might it not also serve, in many senses, to segregate the sensitive data from the rest of, for instance, the Employee data so that the “public Knowledge” Employee data would be in one table and the “Secure Knowledge” Employee data would be in another. The second table cold also use the hash/encryption process in order to add more security to the data.

    Unfortunately, it is also often true that the very data that needs to be encrypted is the data that needs to be used for selecting the data from the table. For instance, when one gets right down to it, if you have 3 John S. Johnsons working for the company and the payroll clerk is trying to make sure that the data for a specific one of them is being applied to that Employee record, the SSN is very likely going to be the tool of choice for identifying the employee. However, if the SSN is encrypted proparly, then there is no point in having an index on that column unless the query to select the data is also using an encrypted value for the SSN in its seek (and, even then, there may be no point). Likewise, when someone is trying to determine whether a transaction was applied to a credit card or who the credit card is assigned to, using the credit card number may be problematic if it is encrypted.

  2. Johnm says:

    Yes, placing sensitive data in a separate database is definitely an option to consider.
    There are many options beyond the ones that were offered and worth considering depending on your unique situation and needs.

    In regard to using SSN to identify which “John S. Johnson” you want: There is really no need to store the entire SSN only for the purpose stated. It is very unlikely that there will be a “John S. Johnson” with the same last four digits of the SSN. The risk and cost of storing such data in its complete format simply for this purpose should be seriously reconsidered.

    In regard to verifying transactions for credit card numbers: The HashBytes option is perfect for this purpose. This process never reveals the actual value; but hashes the input value and compares the hashed values. It is ideal for data that does not change very often.

    Thanks for the great comment!

  3. DSquared says:

    One thing that I have struggled with in terms of DB encryption and application architecture is the time at which data should be encrypted or hashed. The SQL Server team has changed the specifics and range of options for encryption from 2000 to 2005 and now in 2008. These are all great options if you want to protect data from users that have access to the DB, or from someone gaining access to a backup they shouldn’t (we’ve all heard of a SysAdmin leaving tapes on the train or something to that affect).

    My issue is that these measures only protect data once it is inside the database. What happens to the data on the way to the database? If you are designing a truly secure environment you need to consider that while the data may be “secure” once in the DB, somebody could be watching on the way to the database. Encryption or Hashing and other security measures, in my opinion, belong as much at the client as they do in the database.

  4. Johnm says:

    Often the argument between protecting data at the UI vs. the database advocates an exclusive selection. This is unfortunate in the data security dialog since the encouragement of selecting one over the other results in a rather thin layer of protection. Utilizing a hybrid of methods in protecting sensitive data in travel (via the UI) as well as in rest (via the database) offers a robust, flexible and effective solution.

    In addition to the UI and database level of protection there is the sensitive data handling policies that should be implemented beyond the application itself. For example: Printing of reports, exporting to Excel and other programs, etc.

    The methods offered in this blog entry are intended to be just a highlight of one aspect of a much broader strategy.

    This is the type of dialog that should happen more often! :D
    Thank you for your input!

  5. SomebodyfromFLINT says:

    I think we need to take computing all the way back to the beginning and fix where we went wrong…..

    My thinking is that Assembly language is fundamentally wrong, encryption as it stands always leaves corporate with an advantage over the citizen–probably by design I would add

    My thinking is that more communications should require many concurrent affirmations rather than just having machines march to orders without a second thought…. I think that encryption is no longer a viable way to protect data, we should find a way to secure data in plain sight with multiple twists of encryption and better yet zero encryption ….It’s a difficult challenge but nobody is trying…. I think that Multiple factors of authentication can be very effective if the foundation is better built to leverage it

  6. SomebodyfromFLINT says:

    My main point is that somehow we have totally missed the true nature of the problem… There is not going to be a long term encryption solution because a massive computer will be built to crack whatever you produce…. The real problem is how to limit an all knowing albatross computer from abusing a dumb baby pc online…. Solving this problem will make the Internet a better place…. One of my ideas is that command structures could require multiple, concurrent authorizations instead of one and should supplant literal command structures that make attacks so easy….

Leave a Reply