|
|
SQL Server Development and Data Security
-
Posted Wednesday, July 01, 2009 10:24 AM |
In the efforts to protect sensitive data there are some considerations that expand beyond its storage within database; more specifically, the backup files that are generated through the database backup process. Backup files are often kept on devices or media that are removed from the database server and stored in a separate location. If not properly secured these devices can be stealthily snatched away, exposing the backup files contained within. If you were to open a backup file in a program like Notepad, your first impression is that the file contains a bunch of garbley-gunk resulting in the false perception that any plain text sensitive data in the backup file cannot be disclosed by simply scanning the raw backup file. In reality, the data in the database that is stored in plain text is not modified and appears in the backup files as plain text. To see this for yourself, obtain a plain text value that has not been encrypted or hashed in your database, such as a credit card number or Social Security Number. Open your most recent backup file in an editor like Notepad, Wordpad or Textpad and perform a search for that value. You will discover that the sensitive data that is in plain text within the database also is in plain text in the backup file. Microsoft SQL Server 2008, Enterprise Edition, introduced Transparent Data Encryption (TDE) as a new feature that was designed to address this issue. This feature encrypts the physical files of the database. This includes any backup files that are generated through the native backup functionality. The Database Encryption Key (DEK) is a key that was introduced to the encryption key hierarchy in support of the TDE feature. The DEK encrypts the physical files using one of the following algorithms: - Advanced Encryption Standard (AES) using key length options of 128, 192 or 256 bit.
- Triple DES 3 Key which is an expanded key length version of Triple DES using a 192 bit key.
For a more in depth exploration of TDE and backup/recovery, check out my blog entry with the title "TDE: Under The Hood With Backup". For those who do not have Microsoft SQL Server 2008, Enterprise Edition, or do not wish to implement TDE, the need to protect your database backup files remains. There are many third-party backup products available on the market that provide encryption to the backup files that they generate. Three popular options are: LiteSpeed for SQL Server (Quest Software: Price not published) This product encrypts backup files using: - Advanced Encryption Standard (AES) which uses a 256 bit key.
The encryption feature of this product has been available for many versions. The current version of this product is 5.1 which added features such as the ability to define when to perform a differential or full backup, support for SQL Server 2000/2005 Standard Edition, native log shipping conversion and more. For more details on the latest release check out Quest Software's release information page. SQL Safe Backup (Idera: $1,195 - estimated) This product offers the user a choice of the algorithm used to protect the backup files. The available options are: - Data Encryption Standard (DES) which uses a 64 bit key. (not recommended)
- Triple DES which uses a 168 bit key.
- RC2 which uses a 128 bit key.
- Advanced Encryption Standard (AES) which uses a 256 bit key.
The AES 256 bit key encryption option is new with its current version, 5.0, which became available on June 29, 2009. This version also provides log shipping as new functionality. For more details on the latest version of this product, check out Idera's "What's New" page. SQL Backup Pro (Red-Gate Software: $795 - estimated) This product encrypts backup files using: - Advanced Encryption Standard (AES) algorithm with an option of 128 or 256 bit key length.
The encryption features of this product have existed since its initial release. Version 4 introduced the 256 bit key option. At the time of this blog entry the current version is 5.4. Version 6.0 is slated for release this Summer. This new release will include features such as handling of network outages during backup, "self-healing" log shipping, advanced compression algorithm and much more. For more details check out Red-Gate's fact sheet for the pre-release of version 6.0.
|
-
Posted Sunday, June 21, 2009 8:41 PM |
Creating a symmetric key for the use of protecting other keys or directly encrypting sensitive data is accomplished through the execution of the CREATE SYMMETRIC KEY command. One of the arguments to this command is WITH ALGORITHM which provides the means to select the algorithm used to generate the key. There are many options available for symmetric keys. The strength of these algorithms is directly affected by the resulting key length. The strongest of the options are the AES (Advanced Encryption Standard) algorithms which return 128, 192 or 256 bit keys. The RC4 algorithm returns a variable key length between 40 - 256 bits and the RC4_128 algorithm returns a key length of 128 bits. Despite these algorithms returning key lengths that might be considered strong their processing contains issues that result in vulnerable keys. According to Microsoft the specifics to the weaknesses of RC4 and RC4_128 are: "Repeated use of the same RC4 or RC4_128 KEY_GUID on different blocks of data will result in the same RC4 key because SQL Server does not provide a salt automatically. Using the same RC4 key repeatedly is a well-known error that will result in very weak encryption." (Choosing An Encryption Algorithm: SQL Server Books Online) In addition, the DESX option which returns a 192 bit key, has a simple problem of being ".incorrectly named." (Choosing An Encryption Algorithm: SQL Server Books Online) The incorrect name comes from the fact that when you select the DESX option, the creation of the key is actually using the TRIPLE_DES_3KEY algorithm, which is another available option. Based upon these weaknesses in the RC4 and RC4_128 algorithm options and the misnomer of DESX these will not be available in future releases of Microsoft SQL Server. Please be aware of the depreciation of these options when creating new or replacing existing symmetric keys.
|
-
Posted Thursday, June 11, 2009 5:51 AM |
I was recently strolling through some of my old blog entries from my pre-Simple-Talk days for inspiration. I ran across an entry on the ISNUMERIC function that I thought that worth re-posting. Enjoy! SQL Server 2005 Books Online describes the ISNUMERIC function as: "ISNUMERIC returns 1 when the input expression evaluates to a valid integer, floating point number, money or decimal type; otherwise it returns 0. A return value of 1 indicates that expression can be converted to at least one of the numeric types." To illustrate this, the following statement will return the value of 1 (true): SELECT ISNUMERIC('486') While the following statement will return the value of 0 (false): SELECT ISNUMERIC('ABC') There are some additional characters that return a positive response to the function that are not regularly considered numeric such as the dollar sign ("$"), the comma (",") and the period ("."). This occurs since they can be converted into other numeric data types. For example, the following statement will return the value of 1 (true) because it can be converted into a money data type: SELECT ISNUMERIC('$4,860.00') A strange thing happens when you introduce combinations of numbers and letters. Majority of the letters in the alphabet will return a 0 (false) when examined with the ISNUMERIC function; but when the letter "D" and "E" are introduced a 1 (true) is returned. On the surface, this occurrence may appear to be a bug, or inaccuracy of the function; but it is not. Let's begin to explain this odd behavior by introducing scientific notation. Scientific notation is a method often used by Engineers, Mathematicians and Scientists to represent very large numbers in a more convenient and easier to read format. Without going into the detailed mathematician-speak regarding this notation please allow the following example to illustrate how scientific notation works: The number of 4,342,000,000,000,000,000,000 when scribed using scientific notation would appear as 4.342 x1021. Working with scientific notation can be challenging for us Developers who must use a standard QWERTY keyboard to code our applications. Not to mention those who must document such numeric values in spreadsheets and scientific articles; thus, exponential notation was introduced to save the day. A very simplified description of exponential notation is the representation of x10n with the letter of "E" and followed by the exponent and a character to designate positive and negative numbers. For example: 4.342 x1021 would be represented as 4.342E+21. When programming those mega-numbers in SQL Server, and many other languages, using exponential notation returns a 32 bit signed single precision floating-point data type (also known simply as "single"). This means that if you exceed the range of -3.40E +38 through 3.40E +38, either truncation or error in conversion will occur. Try the following to illustrate the limitations of the single data type: SELECT CONVERT(FLOAT(24),'3.40E+39') The conversion to FLOAT(24), which is also known as a "real" number as well as a single data type, is too small to contain -3.40E +39 and returns the following error: Arithmetic overflow error converting expression to data type real. Replacing the letter "E" with "D" will force the scientific notation to return a 64 bit signed double-precision floating-point data type (also known affectionately as "double"). The use of "D" will extend the range returned from -1.79E +308 through 1.79E +308. This is very handy when calculating the expanse of the universe, counting atoms or keeping track of the National debt. Try the following to illustrate the expanse of the double data type: SELECT CONVERT(FLOAT(53),'1.79D+308') The conversion to FLOAT(53), which is a double data type, returns the exponential notation that is placed in the argument without error illustrating that the data type is large enough to store the data. Based upon the definition of scientific notation and exponential notation you might expect the following statement to return 0 (false) because it does not follow the aspect that states that the coefficient (the number that precedes the "E") should be less than 10: SELECT ISNUMERIC('1001D1') Since the value can still be converted into a floating point number, the ISNUMERIC function will return 1 (true). This presents a challenge if the value "1001D1" is a serial number instead of scientific notation. If the intent is to evaluate this string to determine whether it includes a nonnumeric character, the consideration of the PATINDEX function would be recommended. With the introduction of the regular expression "[^0-9]" as an argument to PATINDEX, the location of the non-numeric character is returned. If the value of 0 is returned, the string does not include any non-numeric values. An example of this function's syntax would be: PATINDEX('%[^0-9]%', '1001D1') Keeping this information in mind in regard to the tricky ISNUMERIC function and how to utilize the PATINDEX function as an alternate will save you so much time that you might just need to use scientific notation to represent the minutes available in your day for other activities such as sleep.
|
-
Posted Thursday, May 28, 2009 4:03 PM |
I have been subscribing to the OSF Data Loss Feed for well over a year. This feed provides me with brief blasts of information about data loss events that are reported around the world. These data loss events consist of sensitive data that is stolen, lost or carelessly disclosed. This has been a very interesting and eye-opening feed to receive. One of the ways that sensitive data has a habit of being lost is through stolen laptops. The Open Security Foundation's Data Loss DB site reveals that stolen laptops are 22% of the reported data loss events since OSF has been tracking them. Below are the reported data loss events directly related to stolen laptops in the months of April and May of 2009: April 2, 2009: Stolen laptop with personal data of 33,000 children. April 8, 2009: Stolen laptop with 1,892 Social Security Numbers and other personal data. April 10, 2009: Stolen laptop containing financial account numbers. April 13, 2009: Stolen laptop with personal data of potentially 14,380 patients. April 23, 2009: Stolen laptop with personal data of 1,392 patients. April 23, 2009: Stolen laptop with 1,000,000 Social Security Numbers. April 30, 2009: Stolen laptop with 225,000 Social Security Numbers and other personal data. May 5, 2009: Stolen laptop with 1,000 Social Security Numbers. May 7, 2009: Stolen laptop with personal data of 2,000 patients. May 13, 2009: Stolen laptop with 47,000 Social Security Numbers and other personal data. May 28, 2009: Stolen laptop with personal data of 109,000 members. It is interesting that majority of these incidents relied solely on the laptops being password protected to secure the data or made the assumption that the thief was unaware of the data that was contained within them. It certainly is not a comforting thought for the 1,434,664 + people that were affected by these incidents. A study by the Ponemon Institute, a research organization that focuses on privacy and information security, indicated that the average cost of a lost laptop is $49,246, with only $1,582 of that figure being the replacement cost of the hardware. These costs include: Detection, investigation, intellectual property loss, productivity loss, legal costs and regulatory costs. An average of $39,297 is directly related to the costs of the data breach itself. The portable nature of the laptop is its appeal and vulnerability. It is quite tempting to save client data on a laptop so that it may be accessed when you are away from the office. Password protecting files is better than leaving them wide-open; but also consider implementing additional methods of protection. The aforementioned study noted that the use of encryption reduces the cost of a data breach by an average of $20,000. Food for thought.
|
-
Posted Saturday, May 16, 2009 1:37 PM |
There are many good reasons to protect the sensitive data that is in your database: government regulation, corporate policy, managing legal liability and simply being a good steward of the information that has been entrusted to you. There are also industry standards defined and enforced by non-governmental agencies that dictate the management of sensitive data. The Payment Card Industry Security Standards Council (PCI) is an example of such an organization. PCI is an organization that is comprised of representatives from American Express, Discover Financial Services, Master Card, Visa and JCB International for the purpose of establishment and enforcement of data security standards in regard to payment card data. It is through their data security standard (DSS) that they provide guidelines that aid in the consistent protection of this type of sensitive data. In the PCI DSS there are twelve requirements that must be adhered to for any business that handles or stores data that is in relation to payment card data. Details of each of these twelve requirements can be found in their document titled: "Navigating PCI DSS". For our interest as a DBA, the third requirement titled "Protect stored cardholder data" addresses the concerns of how the sensitive data that is involved in a transaction is stored. The PCI DSS specifically defines the following pieces of information as sensitive: - Cardholder's name: This is located on the face of the card and identifies the owner of the credit account.
- Primary account number (PAN): This is located on the face of the card and identifies the account in which a payment is processed.
- Expiration date: This is located on the face of the card. This value indicates when the card expires.
- Service code: Found within the magnetic strip. Defines acceptance requirements for the card. This is a three or four digit number.
- Authentication data: This includes the full contents of the magnetic strip and the verification code which is located on the back of the card. The personal identification number (PIN) which is keyed by the card holder and contained within the magnetic strip is also considered in this category.
The development of a system in which payment cards are used requires careful consideration of the data that is to be stored within various means of storage be it a database, cookies, or cache. For PCI DSS compliance the items defined as "authentication data" cannot not be stored beyond the duration of a payment card transaction. This includes the full content of the magnetic strip, the verification code and the personal identification number (PIN). The primary account number (PAN) can be stored after the payment card transaction; but it must be obfuscated through the use of encryption, truncation or other hashing methods. PCI DSS presents their requirements in a product agnostic manner, as they should. As Microsoft SQL Server DBAs one approach might be to encrypt the PAN by implementing the Transparent Data Encryption (TDE) feature of SQL Server 2008 rather than implementing encryption at the cell-level. If TDE is used then the access to the PAN must be through an account other than a local user account. If the cardholder's name, service code and expiration date is stored with the PAN additional protection efforts are required for this data. Storage of these pieces of data independently from the PAN require no additional protection for PCI DSS purposes. Therefore, careful consideration of the storage schema of these elements can reduce the complexity of the storage of this information. With the implementation of encryption comes the responsibility of key management. The importance of limiting the access to the keys that are used to decrypt the cipher text maintains the effectiveness of the encryption effort. Limiting the locations in which the key backups are stored not only ensures that when the need arises to restore a key that you have the most recent backup, it also is a way to limit who can obtain and restore the backup files. Encryption keys are not a "set it and forget it" feature. PCI DSS requires that the keys used for encryption be changed at least once a year. SQL Server does not offer native functionality of key lifecycle management; although, in SQL Server 2008 the introduction of Extensible Key Management (EKM) does provide a means to integrate a third party product that provides this function. On the surface an industry standard may not have as much teeth behind them as a government regulation. In some cases this may be true depending upon the goals of establishing the standard. In the case of PCI DSS the consequences to non-compliance is considerable. In addition to a $500K fine per incident of non-compliance, the business could be denied the ability to process payment card transactions which is a competitive vulnerability. This was a cursory overview of requirement 3 of PCI DSS. To get more specifics on what can be done through SQL Server in regard to all of the PCI DSS requirements check out the one hour "SQL Server 2008 Capabilities for Meeting PCI Compliance Needs" webcast on TechNet.
|
-
Posted Sunday, April 26, 2009 5:09 PM |
The content of blogs certainly do not fall into the category of sensitive data. They are a collection of thoughts, ideas and bits of shared knowledge that are placed in a very public location. Despite not being considered sensitive the information provided in a blog remains subject to being stolen and its integrity compromised. Recently I posted a blog that was immediately scraped by a splogger, injected with less than positive words and published on a blog at WordPress. While many of the adjectives were modified they maintained the association of my name as well as the other names that I noted within that entry. This act was very upsetting. Not only did they glean my original content and infringed on my, as well as Simple-Talk's, copyright; they also compromised my reputation by modifying positive statements about the people and organizations that I noted in my blog to negative ones. Granted most of the modifications were nonsensical; but for those who are not familiar with me may encounter this version and derive an erroneous conclusion from the blog entry. A few months ago I encountered another scenario where there were many blog entries from Simple-Talk copied verbatim on another site. While this occurrence did not modify the content, it did scramble the credit of the blog entries. I was credited for an entry on C# and XML which I did not author. Both of these occurrences were identified through a Google Alert that I setup. This alert sends me an e-mail whenever my name is mentioned on the Internet. It also provides me with the URL in which the mentioning occurred. It is indeed a very handy tool! The effort to protect this data and respond to these occurrences of theft can be quite perplexing. If you find yourself subject to being scraped, here are some suggestions: In my specific case, I blog at Simple-Talk.com along with other bloggers. It is in their interest that their content is protected. The first step that I took was to notify Simple-Talk.com that this was occurring so that they can take some actions from their side to protect their content as a whole. Another step is to seek the contact information for the offending sites. A whois search can often reveal the contact information as well as host information. There is a possibility that the scraping is from being ignorant to the nature of offending site's actions rather than being malicious. If that is the case it does present an opportunity for the site to correct their ways. If the offending splog is hosted on a shared blogging site, such as WordPress, a complaint could be submitted to the hosting company noting that the site is violating their terms of use policies. This may get the site shut down... at least until they move it to somewhere else. Many search engines offer a means to submit a Digital Millennium Copyright Act infringement notification. An example of what is involved with this approach would be the following link from Google: http://www.google.com/dmca.html. This requires some preparation of evidence as well as potentially engaging an attorney; but it provides a way of pursuing a violation through civil action. I recently ran across an article that discussed one option in the fight against scraping called cloaking. In a nutshell the concept is to insert some PHP into the content that will present alternate content if it is displayed from a specific IP Address. While this approach may not work on all sites, it does provide an interesting approach. I am interested in hearing from other bloggers who have experienced their blog being scraped and actions taken to address the issue.
|
-
Posted Monday, April 20, 2009 6:03 AM |
Many of us play a specific role within a project. We may architect a schema based upon requirements that have been gathered by another person. Once the schema is complete, the specifications of that schema is passed on to another group who designs the user interface and reporting. Yet another group performs the final testing, implementation and maintenance of the solution that was developed. While we may be somewhat familiar with the tasks that proceed and follow our specific task there are always dimensions of those roles that we do not experience. A participant in a project who has an appreciation and understanding of the context of their role will be a more effective one. Therefore; I approached Perpetual Technologies, Inc. (PTI) to see if they would be interested in co-hosting an event with IndyPASS where we would present the major aspects of a project and walk the audience through its paces giving them a holistic experience that they would not get otherwise. On Saturday, April 18th, in Indianapolis the vision became a reality. With business intelligence becoming a very popular topic we agreed that our project of focus would be a business intelligence project. Our event consisted of five sessions: - Gathering Requirements: Presented by Kristin Sheibley
- Building the Database Model: Presented by Ray Lucas
- Building ETL Processes: Presented by Arie Jones
- Building OLAP Cubes Using SSAS: Presented by Jung Choi
- Reporting Services: Presented by Arie Jones
Our approach to this event was to assemble a great team of experienced professionals to organize and develop the content of these presentations. The aforementioned presenters worked together amazingly well to provide the continuity between their presentations, provide technical review for each other's topics and develop the supporting storyline in which the technical aspects were presented. This approach to the presentation development was unique compared to most technical events; but was very effective with the team that we had assembled. My partner in arranging the logistics of the event was Caroline Bailey from PTI. With her experience of organizing the various events that PTI offers throughout the year and her keen ability to keep all of the members of the team focused, the planning for this specific event was a smooth as butter. We prepared a notepad-booklet in which we gave to each attendee that contained the schedule of the sessions, presenter biographies and plenty of space for taking notes. We utilized Lulu.com for the printing of these booklets. The quality of the printing was excellent and the price was equally impressive. I would certainly recommend them for anyone looking to produce booklets for their own events. Our target was to host 100 attendees. Our no show rate was fairly low resulting in a total attendance of 95 in which majority arrived early and stayed late for the entire event. This was especially impressive considering that we had our first nearly Summer-like weather day in what seems to be ages. Wrox and O'Reilly provided business intelligence related books for SQL Server 2005 and SQL Server 2008 as give-aways at the end of each session which are always coveted by door prize seekers. At the beginning of the event I encouraged all bloggers and tweeters in the audience to give a shout out to the event and share their experience. This is a great way to spread the word about our community's activities. To search on Twitter, go to: http://search.twitter.com Thank you to all that participated in the planning and attending this event.
|
-
Posted Tuesday, April 07, 2009 6:03 AM |
Ah yes, it is that time of year again when the cold grip of Winter loosens and the boys of summer take the field. Baseball season has begun. I am hardly one that could be labeled a sports nut. I don't follow any team so closely that I can tell you the names of their first string players or their stats. It is rare that I catch a full regular season game on television; although when the World Series is on in October I try to catch majority of those games. Despite my status of being an absent fan, baseball has always been something special to me. As a child playing a pick-up game at a neighbor's house was the standard activity during the Spring and Summer. Going to the Indianapolis Indians with my Grandfather and the whole family was a highly anticipated outing and is the source of many great memories. I remember watching the Chicago Cubs and the Cincinnati Reds on television and standing in the living room imitating Pete Rose's or Johnny Bench's swing at the plate. As an adult the opportunity to take in an Indians game has become a rare treat; but one that I plan for at least once or twice a year. The whole experience of attending a baseball game with your family and friends has a Zen quality to it. Shouting out the song "Take Me Out To The Ball Game" with over a thousand other fans at the end of the seventh inning is great fun. The thrill of that ball being knocked deep into the outfield at the bottom of the ninth inning is energizing. Mostly its about taking a few hours out of a busy schedule and getting back in touch with that inner-child and creating new memories with your family and friends. Play ball!
|
-
Posted Tuesday, March 31, 2009 6:55 PM |
In the energy circles there is a common reference to what is called the "carbon footprint". This is an individual's or business's direct contribution to climate change resulting in greenhouse gases. In the environmentalist circles there is the "ecological footprint". This measures the amount of land and sea that is used by an activity or population. Finally there is another measurement that references the amount of water used by an individual called the "water footprint". The computing world, not to be left behind in the footprint trend, has what is called "digital footprint". This is the measurement of the data that is generated as the result of an individual's activity within a network, cell phone or on the Internet. This includes information or images that are directly posted when posting on your favorite social networking site. Your digital footprint even expands to the surveillance cameras that capture your image in public location. The more data that is captured from your digital activities, the larger your footprint. The larger your footprint the higher your risk of that information being observed by unintended parties. EMC offers a digital footprint calculator that walks you through a series of questions to reveal a score. Ironically, you must download this item and install it before use. Another means to get an idea of your online digital footprint is to simply search for your name on your favorite search engine and see what comes up. When I search my name on Google, there are 1,570 pages returned. Not all of these pages pertain directly to me; but with a little reviewing and a little knowledge of me it is not difficult to identify the ones that do. The effort of maintaining a small digital footprint does not imply that you must become a digital or social hermit. There are many benefits in utilizing the digital tools that are available to us. Pay close attention to the content of the digital information that you actively output. Brad McGehee has posted several blog entries in the past on developing your online brand. Seeing the data that is available online as a brand building process is an excellent way to manage your digital footprint. Here are some ways you can reduce or improve the quality of your digital footprint: - Avoid posting pictures on the Internet that you wouldn't show your parents or employer.
- Be cautious about the wording of critical statements or complaints.
- Avoid revealing information about your job or employer that is not public.
- Be cautious about the personal information that you share about yourself.
- Be cautious about the personal information that you share about your friends and family.
- Be cautious about sharing too much of your daily routines and travel plans.
- Utilize privacy settings on your browser or various web sites that offer them.
Remember that your personal information and reputation is an asset. Once damaged or exposed it is very difficult to control its use. Besides, it was George Bernard Shaw who once said: "The things most people want to know about are usually none of their business."
|
-
Posted Monday, March 23, 2009 7:38 PM |
Recently my video titled "Create an Asymmetric Key in SQL Server 2005 / 2008" was featured on JumpstartTV. A superb feature of JumpstartTV is that the viewers of the videos can leave feedback and questions in which the author of the video can utilize to further the learning. Among the feedback of this specific video was a reoccurring question about handling lost or forgotten passwords for the asymmetric key. As a quick refresher: When an asymmetric key is created you have the option to protect the key with a password. The password is required to open the key for decryption as well as modification of the key itself. Once the key is used to encrypt data its decryption is dependent upon that key. We are all very familiar with login passwords. These are keyed in by end users to gain access to a system, network or database. Any network administrator would tell you that having to reset forgotten or lost passwords is a regular task. Therefore a natural question that arises when discussing key passwords is how to handle the occasional forgotten or lost password. Unlike a login password, the key passwords are not typically keyed in by the end user. These passwords are passed to the decryption method programmatically. A developer may incorporate the key password in the user interface code and pass it to the database. The DBA may maintain a user defined function or a key vault that returns the key password when called from a stored procedure that performs the decryption. The end user is not aware of the password requirement. If the key password is lost or forgotten, it is a very dire situation. There are no reset functionality associated with the password for encryption keys. With the lost password so goes the data that the key secured. A question that may arise at this point might be: "If this unfortunate situation occurs can the data be decrypted without the key in which it was encrypted?" The intent of encryption is to protect data so that the possession and use of the key that encrypted it is required. If a keyless decryption attempt is successful, you may want to seek another method of encrypting your data. No doubt, this is a scary prospect. For some the potential for this results in the rejection of utilizing encryption; but there are a couple of items to be aware of that does reduce or prevent this situation from occurring: Database Master Key Consider protecting your key with the Database Master Key (DMK) instead of a password. A database master key a key that is created at the database level and is used to protect various keys within the database. For more information about SQL Server Encryption Key Hierarchy, check out the following link: http://technet.microsoft.com/en-us/library/ms189586.aspx Once a asymmetric key is created you can change the key protection from password to DMK by executing the ALTER command. Below is an example for altering an asymmetric key to use the DMK: ALTER ASYMMETRIC KEY MyAsymmetricKey WITH PRIVATE KEY ( DECRYPTION BY PASSWORD = [Current Password] ) The inclusion of the "Decryption By Password" opens the asymmetric key with the current key password. By not including the "Encryption By..." option the key will be protected by the DMK. Key Management The "set it and forget it" approach to managing your keys increases the potential loss in the event that the key password is lost or forgotten. Rotating the keys through a key lifecycle, retiring keys and generating new ones, reduces this risk. SQL Server 2008 offers Extensible Key Management which provide the ability to utilize third-party tools that are specifically designed for key management tasks. This specific blog entry focused on asymmetric keys; but the same considerations apply when using symmetric keys or certificates.
|
-
Posted Thursday, March 12, 2009 5:45 AM |
Federal identification numbers and credit card numbers are regulars when it comes to encrypting data. The US plain text version of the prior is eleven characters. The latter is typically sixteen characters. When these are encrypted using a symmetric key the size of the cipher text will range from thirty-nine to sixty eight bytes depending upon the selected algorithm. These lengths hardly raise the question of whether the cipher text will result in a breach of the size limit for a row. In a nutshell, the row size limit in SQL Server can be explained as follows: - The limitation of the row size is a page.
- A page comprises of 8 kilobytes (8,060 bytes).
- 8 kilobytes is approximately 8,000 characters.
There may be times when the data that is requiring encryption is very large. Book manuscripts, official documents, schematics, blue prints, images, and archived e-mail messages are just a few examples of data that are stored in a database and can easily be larger than the maximum row size. Varbinary The Maximus Encrypted data is stored in the database as a varbinary data type. When defining a column of this data type you have the option to define the size explicitly from 1 to 8,000. You can also define the size as "max" which is 2,147,483,647 bytes (2^31-1) which is approximately 266,437 pages. The behavior of data that exceeds the row size limit when the "max" size option is utilized depends upon how the sp_tableoption system stored procedure is set for the table in question. If the "large value types out of row" option is set to the value of "1", the varbinary value will automatically be stored on a separate page whether the initial row has reached the row limit or not. If the option is set to "0" the data will be stored on a separate page only if it exceeds the row size limit. The subsequent pages that are used are linked to the initial page with a 16 byte pointer. Using War and Peace Leo Tolstoy's famed book "War and Peace" often is used as an example of a very long book. The English translation of this book contains approximately 3.1 million characters. This characteristic made it's text an excellent candidate for illustration of encrypting large text. Consider the first chapter of this book which, in plain text, is 6,566 bytes in length. Using a symmetric key with the AES_256 algorithm, the results is a cipher text that is 6,612 bytes in length. This certainly fits within the row size limitations; but does not allow for much additional data to be stored within that same row. Gotcha! Based upon the definition and behavior of the varbinary(max) data type, the entire text of "War and Peace" should fit within approximately 388 pages. This is certainly within the boundaries of the "max" size limitations. Although, when attempting to encrypt the entire manuscript of this legendary book, using the very same symmetric key, we receive the following error message: Msg 8152, Level 16, State 10, Line 13 String or binary data would be truncated. The devil in the details is that the output of the ENCRYPTBYKEY method is a maximum size of 8,000 bytes. Therefore despite our column being set to accept cipher text larger than the row limits the encryption process in SQL Server does not allow us to create cipher text of that size. This is true for the other encryption methods including the HashBytes method. What, Me Worry? One possible solution to this challenge would be to break up the large text into segments of less than 8,000 bytes prior to encryption. Book manuscripts have logical break points. Each chapter can be encrypted and stored separately. Implementing a normalized table structure will provide an flexible and organized method of storage. Another approach might be to seek encryption options externally from the database. Encrypting the large text before it is introduced to the database could yield a broader list of options. A final consideration is to challenge why the large text requires encryption. The intent to obfuscate data does not necessarily call for encryption. As an alternative, consider converting the plain text directly into a varbinary data type by utilizing the CONVERT method. For example: INSERT INTO MyTable ([MyVarbinaryColumn]) VALUES (CONVERT(Varbinary(max),@YourPlainText)) This statement will return a varbinary version of the plain text that is equal to the full length of the string. It's resulting value is not discernable to the naked eye. To return this value to plain text requires only the conversion of the varbinary(max) data to a varchar(max) data type using the same CONVERT method. This approach does not offer the security level that encryption offers; but it may suffice for the intended obfuscation while overcoming the encryption method limitations.
|
-
Posted Friday, March 06, 2009 7:45 AM |
I saw their smiling faces on the latter pages of the comic books that I enjoyed in my youth. The whole family of strange looking, and yet extremely happy, creatures called Sea-Monkeys. Next to the extraordinarily intriguing x-ray glasses it was the most coveted oddity available in kid-dom. It was my extreme pleasure one day to discover that a packet of Sea-Monkeys were in my possession. I eagerly opened the paper envelope and dumped its dust-like contents into a glass of water. I studied carefully each day on how this mysterious cryogenic material would turn into the beings that grinned so fervently from my comic books. Excitement overcame me when I noticed that there were very tiny shrimp tossing about. I had to squint tightly; but they were there. It is through carefully studying the nearly microscopic materials that floated within that glass to realize that they were living beings rather than random particulate. Taking the opportunity to contemplate some of the minute details of an item can increase its enjoyment. It also can reveal secrets to how something can be used in a more creative manner. The variance of hashes that are generated from SQL Server encryption methods are among these minute details. HashBytes Consider the output of encrypting the plain text value of "My Sensitive Value" by using the HashBytes method. The algorithm options that are available are MD2, MD4, MD5, SHA and SHA1. With the execution of the following statement we can view the hash values for each algorithm: SELECT HASHBYTES('MD2','My Sensitive Value') as MD2, HASHBYTES('MD4','My Sensitive Value') as MD4, HASHBYTES('MD5','My Sensitive Value') as MD5, HASHBYTES('SHA','My Sensitive Value') as SHA, HASHBYTES('SHA1','My Sensitive Value') as SHA1 The following are the results of this statement: MD2: 0x0EF80F89B3ADF5C828DDFB89E3A48415 MD4: 0x84CFCB9C2D937BF521FFB3BC8E1C48E8 MD5: 0x3C2A7B3FA3DE5215A7F8991798C0641C SHA: 0xA6E9747FA2698611D6D5FC7F07A02984924C9A70 SHA1: 0xA6E9747FA2698611D6D5FC7F07A02984924C9A70 The uniqueness of these hash values are the result of the algorithm that is used. If the HashBytes method were to be executed a second time the method would return the identical hash values. This is valuable since the HashBytes method is a one-way encryption method. To reveal the contents of this hash a plain text value would have to be hashed and then compared to a previously hashed value. If consistently returning the identical hash was not the behavior of this method one-way encryption would not be possible. This behavior is also a vulnerability to this method since patterns can be identified in the body of data that contains these hash values. Appending your plain text with a unique value, such as a primary key value, before producing a hashed value is called salting. This method can be used to simulate a unique hash value for each individual occurrence of the plain text. Symmetric Key In the study of the EncryptByKey method, for the plain text value of "My Sensitive Value", we will use a symmetric key. The algorithm options of symmetric keys are DES, TRIPLE_DES, TRIPLE_DES_3KEY, RC2, RC4, RC4_128, DESX, AES_128, AES_192 and AES_256. In the statement below we will select a symmetric key that uses the AES_128 algorithm: OPEN SYMMETRIC KEY SymKey DECRYPTION BY PASSWORD = 'MyStr0ngP@ssw0rd' GO SELECT ENCRYPTBYKEY(KEY_GUID('SymKey'),'My Sensitive Value') GO CLOSE SYMMETRIC KEY SymKey GO The following is the results of this statement: (Due to the length of the hash I have truncated them for demonstration purposes) 1st execution: 0x0082145786AF1C7DEC691ED...8A5EBF6A25523079AFAB46D4 2nd execution: 0x0082145786AF1C7DEC691ED...EB8AB300B87B81378635AA9E Unlike the hash values produced by the HashBytes method, a different hash value is returned for the same algorithm with each execution. The specifics on how these hash values are determined and why they are different after each execution varies depending upon the algorithm selected. For this example the AES_128 algorithm performs multiple rounds of transformation before presenting its final hashed value. The hash values produced by asymmetric keys will produce similar variances in hashed values as symmetric keys. The algorithms available for asymmetric keys are: RSA_512, RSA_1024 and RSA_2048. Understanding the hashing behaviors of these algorithms aid tremendously in the decision of which one to utilize in your encryption efforts. While details of encryption, such as the variance of hashed values, may not be as entertaining as a glass full of Sea-Monkeys it does reveal a world otherwise unseen by the casual observer. This world, when revealed, can hold the key to more effectively understanding and utilizing the features of SQL Server.
|
-
Posted Monday, March 02, 2009 5:58 AM |
Over the past month I have working on a collection of training videos. These videos cover various "how-to" topics related to encryption and data security. These videos are available through my profile on JumpstartTV. This morning, March 2, my "Honeycombing in SQL Server 2005/2008" was featured. The title is a bit of a trick since implementing honeycombing in SQL Server 2005 is very difficult, if not impossible. With the introduction of the audit feature of SQL Server 2008 honeycombing is not only possible but very easy. The experience of creating videos is a new one for me. I did enjoy creating them despite a few short periods of frustration. A lesson that was learned through the process was that scripting the dialog was very helpful. My early attempts were to speak ad hoc as if I was in a conversation; but this did not prove to be effective. The scripting allowed me to organize my thoughts and significantly reduce the Shatner-like pauses, periods of dead-air and the occurrences of "uh...". Another lesson was to work on a couple and then step away. I found that after a couple of hours of speaking into the microphone that I began to stumble and transpose words at an increasing rate. I attribute this to fatigue; but it may be that I just need more practice. The first day was the hardest in this regard since it was a full-day sprint in the creation of these videos. A light-hearted lesson that was learned was not to drink Ginger Ale (or any other carbonated beverage) before recording. This action lead to a few re-starts in the recording process; although it did provide some interesting versions of the videos that did bring on a bit of hilarity. My thanks to Andy Warren who extended an invitation to this opportunity as well as his eternal patience and helpful coaching. Also, a thanks to Chris Rock who polished up my videos and made them look slick with his editing skills. I do plan on preparing more videos in the near future. I would encourage anyone who may be thinking of creating a few of these tid-bits of knowledge to consider JumpstartTV.
|
-
Posted Wednesday, February 25, 2009 8:18 PM |
In my hometown of Indianapolis the Winter weather can get very cold. Earlier in this season we had a day where the temperatures dropped to -12 degrees Fahrenheit. The strategy that is employed when voyaging out into the frozen tundra is to dress in layers. By dressing in layers you are adding protection between your delicate flesh and the harsh frigid air. If the layers are insufficient the protection is breached and you suffer the wrath of frostbite. The strategic approach to protecting your sensitive data is very similar to managing your Winter outings. Any single security effort, be it encryption, hashing, encoding, role-based permissions, or management policies, will fall short in the protection of sensitive data. The dawning of multiple security measures reduce the occurrence of an unauthorized disclosure attempt. When selecting the security features that are to be applied it is important to understand the intended role of each feature. All features have their strengths, vulnerabilities, and specific role to fill in the security strategy. None provide an all encompassing security solution. I recently gave a presentation on the topic to Transparent Data Encryption (TDE). A large part of the Q/A portion of the evening was discussing the intended role of TDE. At a base level, TDE is intended to protect the physical files (data file, transaction logs, and backup files) of the database. When a TDE protected database is active on a server it is beyond the scope of that feature. When a TDE protected database is placed upon a backup device or media the power of this feature shines. The combination of TDE and other features, such as cell-level encryption or one-way encryption, is a greater solution than either option in solitude. Sensitive data protection at the database as well as at the user interface is greater still.
|
-
Posted Wednesday, February 18, 2009 8:37 PM |
I was recently tagged in a blog entry from Tim Mitchell titled "Things I Wish I Had Known". The concept appears to have started with an entry of a similar, but lengthier, title by Mike Walsh and passed along to other SQL bloggers. The idea is to bestow some wisdom that would benefit someone who is starting out in our profession. Here are a few morsels from my experience that I offer: Ask, Ask and then Ask Again The development process at times fells like trying to translate hieroglyphics without the Rosetta stone. This is true whether you are developing a database or the user interface of an application. Customers will present their needs in a way that they understand it. The developer will attempt to interpret the request into a system that meets their needs. You can be sure that the customer's request is never as simple as they present it and it is never as simple as the developer interprets it. The key to success in this task: Ask questions as if you were a layer. Improve Processes Before Automating Them I have been the recipient of many projects where the requested system is pursued as the savior of a severely broken process. Rather than go through the process of reviewing the process and identify the opportunities to improve it the thought is to throw code at it to speed up the excruciating and time consuming manual processes. This often results in a system that offers very little improvement in the situation and large masses of missing hair on the part of the developer and the requestor. He Who Is Not Busy Being Born Is Busy Dying This line, pulled directly from a Bob Dylan song, is a something that I live by on a daily basis. The opportunities to learn from another person is in abundance. The opportunities to teach another person flourishes equally. The willingness to accept that failure will occur leads to the elevation of experience that draws nearer to expertise. Identifying opportunities to improve existing skill sets is critical for growth. It is when the belief that there is nothing more the learn is when complacency begins to rear its ugly face. Support Your Local User Group and Events User groups and community organized events such as Tech Fests, SQL Saturdays, and code camps offer great learning opportunities for free, or at nominal cost. These events also provide the opportunity to network with your peers which is priceless. In addition, the opportunity to explore your speaking skills are provided through these venues. At The Beginning of the Day Make A Top 5 List At one of my previous employers the practice of making a daily top five list of tasks, in order of priority, was implemented by the CIO. This list is intended be the focus of the day. At first the idea was met with some resistance by the entire department; but I found it very valuable in organizing my day. Limiting the list to five makes it very manageable. I knew that if I go home with all five items crossed off my list it was a productive one. My offering is rather short compared to others that have participated in this chain-of-blogs. As the trend continues providing something unique becomes more challenging. There have been many great pieces of advice passed along. Hopefully they fall into the hands of a very fortunate SQL Server beginner. I will do my part and entrust (tag) Brad McGehee, Jimmy May, and Arie Jones to be the carriers of this vessel of good advice. Blog on my friends!
|
|
|