
Damon Armstrong

Caffeine Induced Tirades about .NET and Life

Writing a Byte Array to a Hexadecimal String

Published Saturday, July 17, 2010 9:29 AM

I was finishing up work on a hashing library and started testing my hash values against other sources to ensure I was doing everything right.  Unfortunately, my hashes were off.  Long story short, I was converting the hashed byte array into a string using a Base64 string encoder, but what you are supposed to do is convert it into a hexadecimal string, at least if you want your output to match the convention that most other tools follow.  I searched around and found out how you are supposed to do this from a post called "How do you convert Byte Array to Hexadecimal String, and vice versa, in C#?" on Stack Overflow, as well as a post from MSDN entitled "byte[] Array to Hex String".  The first post presents two options but gives no information about which is faster or why you would choose one over the other.  So I decided to find out myself because, given the option, I'd prefer my code to be faster.

Option 1: Building the hex string using a for loop
In my meanderings looking for a way to format an MD5 hash, this is the algorithm that I ran into the most.  The method simply runs through each byte in the byte array and appends its two-digit hexadecimal form to a StringBuilder using standard string formatting.

public static string CreateHexString(byte[] data)
{
  StringBuilder hex = new StringBuilder(data.Length * 2);
  foreach (byte b in data)
  {
    hex.AppendFormat("{0:x2}", b);
  }
  return hex.ToString();
}

Option 2: Using the bit converter and replacing the delimiters
The second option uses the BitConverter class to create a string - but the string that gets created is delimited by a series of dashes that must subsequently be removed.  This is some very succinct code.

public static string CreateHexString(byte[] data)
{
  return BitConverter.ToString(data).Replace("-", "");
}
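To see what the Replace call is stripping, here is a small sketch that prints the raw BitConverter output next to the results of Options 1 and 2. The class name HexCaseDemo, the method names, and the sample bytes are illustrative assumptions, not part of the original code. One thing it makes visible: the "x2" format in Option 1 produces lowercase digits, while BitConverter produces uppercase ones.

```csharp
using System;
using System.Text;

public static class HexCaseDemo
{
    // Option 1: the "x2" format specifier yields two lowercase hex digits per byte.
    public static string ViaStringBuilder(byte[] data)
    {
        StringBuilder hex = new StringBuilder(data.Length * 2);
        foreach (byte b in data)
        {
            hex.AppendFormat("{0:x2}", b);
        }
        return hex.ToString();
    }

    // Option 2: BitConverter yields uppercase digits separated by dashes.
    public static string ViaBitConverter(byte[] data)
    {
        return BitConverter.ToString(data).Replace("-", "");
    }

    public static void Main()
    {
        byte[] sample = { 0xDE, 0xAD, 0xBE, 0xEF }; // arbitrary sample bytes

        Console.WriteLine(BitConverter.ToString(sample)); // DE-AD-BE-EF
        Console.WriteLine(ViaBitConverter(sample));       // DEADBEEF
        Console.WriteLine(ViaStringBuilder(sample));      // deadbeef
    }
}
```

If you compare the result against a hash from another source, either normalize the case first or compare with StringComparison.OrdinalIgnoreCase.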

Option 3: Dark Magic
This piece of code from PZahra from the MSDN link uses right shifting and dark magic that culminates in a hexadecimal string.  I modified it slightly from the original code so it would match the strings produced by the other hexadecimal methods.

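public static string CreateHexString(byte[] data)
{
  char[] c = new char[data.Length * 2];
  byte b;
  // y indexes the input bytes; x indexes the output chars (two per byte)
  for (int y = 0, x = 0; y < data.Length; ++y, ++x)
  {
    // high nibble: 0-9 become '0'-'9' (+0x30), 10-15 become 'A'-'F' (+0x37)
    b = ((byte)(data[y] >> 4));
    c[x] = (char)(b > 9 ? b + 0x37 : b + 0x30);
    // low nibble, same mapping
    b = ((byte)(data[y] & 0xF));
    c[++x] = (char)(b > 9 ? b + 0x37 : b + 0x30);
  }
  return new string(c);
}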
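The arithmetic in Option 3 is easier to follow when applied to a single byte. The sketch below uses a hypothetical helper, NibbleToHexChar, which is not part of PZahra's code; it applies the same two offsets (0x30 and 0x37) to one 4-bit value at a time.

```csharp
using System;

public static class NibbleDemo
{
    // Hypothetical helper: maps a value 0-15 to its uppercase hex digit.
    public static char NibbleToHexChar(int nibble)
    {
        // '0' is 0x30, so 0-9 map to '0'-'9' by adding 0x30.
        // 'A' is 0x41 and stands for 10, so 10-15 map by adding 0x41 - 10 = 0x37.
        return (char)(nibble > 9 ? nibble + 0x37 : nibble + 0x30);
    }

    public static void Main()
    {
        byte value = 0x3F;
        int high = value >> 4;  // upper four bits: 0x3
        int low = value & 0xF;  // lower four bits: 0xF

        Console.WriteLine(NibbleToHexChar(high)); // 3
        Console.WriteLine(NibbleToHexChar(low));  // F
    }
}
```

Swapping 0x37 for 0x57 would give lowercase 'a'-'f' instead, since 'a' is 0x61.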

Using my performance timer from a previous post (though it's been slightly modified since then), I ran all three options through a series of tests and measured the following:

Option 1 runs approximately 40,000 ops/sec
Option 2 runs approximately 282,000 ops/sec
Option 3 runs approximately 304,000 ops/sec

So the for loop is the slowest of the three approaches, which I find interesting because it seems to be widely used for this purpose.  Opting for the BitConverter is roughly 7 times faster than the for loop (at least on my system) and it only takes a single line of code to write.  Perhaps the BitConverter is a bit obscure (pun definitely intended) so nobody uses it.  And if you don't mind losing a lot of readability, then you can eke out even more performance using PZahra's code.
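If you want to check these numbers on your own machine, a rough measurement can be sketched with System.Diagnostics.Stopwatch. This is not the performance timer mentioned above; the class name HexBenchmark, the OpsPerSecond helper, the 16-byte input, and the iteration count are all assumptions made for illustration, and the results will vary with hardware and runtime version.

```csharp
using System;
using System.Diagnostics;

public static class HexBenchmark
{
    // Returns how many calls of 'convert' complete per second over 'iterations' runs.
    public static double OpsPerSecond(Func<byte[], string> convert, byte[] data, int iterations)
    {
        convert(data); // warm-up call so JIT compilation is not included in the timing

        Stopwatch timer = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            convert(data);
        }
        timer.Stop();

        return iterations / timer.Elapsed.TotalSeconds;
    }

    public static void Main()
    {
        byte[] data = new byte[16];     // assumed input size: one MD5-sized hash
        new Random(42).NextBytes(data); // fixed seed so every run uses the same bytes

        // Option 2 is shown here; pass any of the other methods the same way.
        double ops = OpsPerSecond(
            d => BitConverter.ToString(d).Replace("-", ""), data, 1000000);

        Console.WriteLine("{0:N0} ops/sec", ops);
    }
}
```

Run it in a release build without a debugger attached, and repeat each measurement a few times before drawing conclusions.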

by Damon

Comments

 

Federico said:

Converting a byte (array) to hex in C or assembly is usually done in the third way (black magic), and relies on the ASCII character order.
Knowing that '0' - '9' have code 48 - 57 (0x30 - 0x39) means you just need to add 0x30 if the nibble is lower than 10 (nibbles are the two 4-bits parts of a byte); 'A' - 'F' have code 65 - 70 (0x41 - 0x46), in that case you add 55 (0x37, 65 - 10).
If you have a bit more memory to spare, a lookup table simplifies things a lot: either use a 16 entries array ['0', ..., '9', 'A', ... 'F'] and do the bit shift magic, or use a 256 entries array with all the 2-chars combinations ['00', ..., 'FF'].
The lookup tables should be the fastest possible, but profile first to verify it ;-)
August 16, 2010 10:46 AM
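The 16-entry lookup table that Federico describes could be sketched roughly as follows. The class name HexLookup and the Digits field are assumptions made for illustration; this variant was not part of the timings in the post, so whether it is actually faster is something to profile.

```csharp
using System;

public static class HexLookup
{
    // 16-entry table: the index is the nibble value, the entry is its hex digit.
    private static readonly char[] Digits = "0123456789ABCDEF".ToCharArray();

    public static string CreateHexString(byte[] data)
    {
        char[] c = new char[data.Length * 2];
        for (int i = 0; i < data.Length; i++)
        {
            c[2 * i] = Digits[data[i] >> 4];      // high nibble
            c[2 * i + 1] = Digits[data[i] & 0xF]; // low nibble
        }
        return new string(c);
    }

    public static void Main()
    {
        byte[] sample = { 0x00, 0x7F, 0xFF }; // arbitrary sample bytes
        Console.WriteLine(CreateHexString(sample)); // 007FFF
    }
}
```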
 

Nisus said:

Damon, please fix the code in third sample. It seems that the for-loop is incomplete :(
August 17, 2010 7:46 AM