Damon Armstrong

Caffeine Induced Tirades about .NET and Life

Writing a Byte Array to a Hexadecimal String

Published Saturday, July 17, 2010 9:29 AM

I was finishing up work on a hashing library and started testing my hash values against other sources to ensure I was doing everything right.  Unfortunately, my hashes were off.  Long story short, I was converting the hashed byte array into a string using a Base64 encoder, but what you are supposed to do, if you want to follow the usual convention, is convert it into a hexadecimal string.  I searched around and found out how to do this from a Stack Overflow post called How do you convert Byte Array to Hexadecimal String, and vice versa, in C#?, as well as an MSDN post entitled byte[] Array to Hex String.  The first post presents two options, but gives no information about which one is faster or why you would choose one over the other.  So I decided to find out for myself because, given the option, I'd prefer my code to be faster.
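To make the mismatch concrete, here is a minimal sketch that prints the same MD5 hash both ways. The class name HashEncodingDemo, the helper Md5, and the input "hello" are assumptions chosen for illustration; they are not taken from the hashing library described above.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class HashEncodingDemo
{
    // Hashes the UTF-8 bytes of the input with MD5.
    public static byte[] Md5(string input)
    {
        using (MD5 md5 = MD5.Create())
        {
            return md5.ComputeHash(Encoding.UTF8.GetBytes(input));
        }
    }

    public static void Main()
    {
        byte[] hash = Md5("hello");

        // Base64: a valid encoding of the bytes, but not what most tools print for a hash.
        Console.WriteLine(Convert.ToBase64String(hash));
        // XUFAKrxLKna5cZ2REBfFkg==

        // Hexadecimal: the representation most hash utilities display.
        Console.WriteLine(BitConverter.ToString(hash).Replace("-", "").ToLowerInvariant());
        // 5d41402abc4b2a76b9719d911017c592
    }
}
```

Both strings encode the same 16 bytes, so neither hash is wrong; they simply will not compare equal as text.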

Option 1: Building the hex string using a for loop
In my meanderings looking for a way to format an MD5 hash, this is the algorithm that I ran into the most.  The method simply runs through each byte in the byte array and appends its two-digit hexadecimal text to a StringBuilder using standard string formatting.

public static string CreateHexString(byte[] data)
{
  StringBuilder hex = new StringBuilder(data.Length * 2);
  foreach (byte b in data)
  {
    hex.AppendFormat("{0:x2}", b);
  }
  return hex.ToString();
}

Option 2: Using the bit converter and replacing the delimiters
The second option uses the BitConverter class to create a string - but the string that gets created is delimited by a series of dashes that must subsequently be removed.  This is some very succinct code.

public static string CreateHexString(byte[] data)
{
  return BitConverter.ToString(data).Replace("-","");
}
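As a quick illustration of the intermediate string, the sketch below prints the dashed form before stripping it. The class name BitConverterDemo and the sample bytes are made up for this example.

```csharp
using System;

public static class BitConverterDemo
{
    // Same approach as Option 2: format with dashes, then remove them.
    public static string ToHex(byte[] data)
    {
        return BitConverter.ToString(data).Replace("-", "");
    }

    public static void Main()
    {
        byte[] data = { 0xDE, 0xAD, 0xBE, 0xEF };

        Console.WriteLine(BitConverter.ToString(data)); // DE-AD-BE-EF
        Console.WriteLine(ToHex(data));                 // DEADBEEF
    }
}
```

Note that BitConverter emits uppercase digits, while the "{0:x2}" format in Option 1 emits lowercase ones. If the strings are going to be compared, one of them needs a ToLowerInvariant() call (or Option 1 needs "X2").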

Option 3: Dark Magic
This piece of code by PZahra, from the MSDN link, uses right shifting and dark magic that culminates in a hexadecimal string.  I modified it slightly from the original code so it would match the strings produced by the other hexadecimal methods.

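public static string CreateHexString(byte[] data)
{
  char[] c = new char[data.Length * 2];
  byte b;
  // y indexes the input bytes; x indexes the output chars (two per byte)
  for (int y = 0, x = 0; y < data.Length; ++y, ++x)
  {
    // high nibble: 0-9 become '0'-'9' (+0x30), 10-15 become 'A'-'F' (+0x37)
    b = ((byte)(data[y] >> 4));
    c[x] = (char)(b > 9 ? b + 0x37 : b + 0x30);
    // low nibble, same mapping
    b = ((byte)(data[y] & 0xF));
    c[++x] = (char)(b > 9 ? b + 0x37 : b + 0x30);
  }
  return new string(c);
}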
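The dark magic is less mysterious when it is applied to a single byte. The following sketch walks one value through the same shift, mask, and offset steps; NibbleDemo and NibbleToHexChar are names invented here for illustration and do not appear in PZahra's code.

```csharp
using System;

public static class NibbleDemo
{
    // Maps a value 0-15 to '0'-'9' or 'A'-'F' using ASCII offsets.
    public static char NibbleToHexChar(int nibble)
    {
        // 0x30 is '0'. 0x37 is 'A' minus 10, so a nibble of 10 lands on 'A' (0x41).
        return (char)(nibble > 9 ? nibble + 0x37 : nibble + 0x30);
    }

    public static void Main()
    {
        byte value = 0x3F;
        int high = value >> 4;  // upper four bits: 0x3
        int low = value & 0xF;  // lower four bits: 0xF

        Console.WriteLine("{0}{1}", NibbleToHexChar(high), NibbleToHexChar(low)); // 3F
    }
}
```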

Using my performance timer from a previous post (though it's been slightly modified since then), I ran all three options through a series of tests and determined the following:

Option 1 runs approximately 40,000 ops/sec
Option 2 runs approximately 282,000 ops/sec
Option 3 runs approximately 304,000 ops/sec

So, using the for-loop is the slowest of the three approaches, which I find interesting because it seems to be widely used for this purpose.  Opting for the BitConverter is roughly 7 times faster than the for loop (at least on my system) and it only takes a single line of code to write.  Perhaps the BitConverter is a bit obscure (pun definitely intended) so nobody uses it.  And if you don't mind losing a lot of readability, then you can eke out even more performance using PZahra's code.
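For anyone who wants to reproduce numbers like these, a rough Stopwatch-based harness might look like the following. This is not the performance timer mentioned above, just a minimal sketch; MeasureOpsPerSecond, the 16-byte input, and the iteration count are all assumptions made for illustration, and results will vary by machine and runtime.

```csharp
using System;
using System.Diagnostics;

public static class HexBenchmark
{
    // Returns approximate operations per second for the given converter.
    public static double MeasureOpsPerSecond(Func<byte[], string> convert, byte[] data, int iterations)
    {
        convert(data); // warm-up call so JIT compilation is not timed

        Stopwatch timer = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            convert(data);
        }
        timer.Stop();

        return iterations / timer.Elapsed.TotalSeconds;
    }

    public static void Main()
    {
        // Hypothetical input: 16 bytes, the size of an MD5 hash.
        byte[] data = new byte[16];
        new Random(42).NextBytes(data);

        double ops = MeasureOpsPerSecond(
            d => BitConverter.ToString(d).Replace("-", ""), data, 1000000);

        Console.WriteLine("BitConverter: {0:N0} ops/sec", ops);
    }
}
```

Passing each CreateHexString variant as the convert argument gives a like-for-like comparison on the same input.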

by Damon

Comments

 

Federico said:

Converting a byte (array) to hex in C or assembly is usually done in the third way (black magic), and relies on the ASCII character order.
Knowing that '0' - '9' have code 48 - 57 (0x30 - 0x39) means you just need to add 0x30 if the nibble is lower than 10 (nibbles are the two 4-bits parts of a byte); 'A' - 'F' have code 65 - 70 (0x41 - 0x46), in that case you add 55 (0x37, 65 - 10).
If you have a bit more memory to spare, a lookup table simplifies things a lot: either use a 16-entry array ['0', ..., '9', 'A', ..., 'F'] and do the bit shift magic, or use a 256-entry array with all the 2-char combinations ['00', ..., 'FF'].
The lookup tables should be the fastest possible, but profile first to verify it ;-)
August 16, 2010 10:46 AM
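Federico's 16-entry lookup table suggestion could look something like the sketch below. It is an illustration only, not code from the comment or the post: the class name HexLookup and the HexDigits field are assumptions, and it has not been benchmarked against the three options above.

```csharp
using System;

public static class HexLookup
{
    // 16-entry table: the index is the nibble value, the entry is its hex digit.
    private static readonly char[] HexDigits = "0123456789ABCDEF".ToCharArray();

    public static string CreateHexString(byte[] data)
    {
        char[] c = new char[data.Length * 2];
        for (int i = 0; i < data.Length; i++)
        {
            c[2 * i] = HexDigits[data[i] >> 4];      // high nibble
            c[2 * i + 1] = HexDigits[data[i] & 0xF]; // low nibble
        }
        return new string(c);
    }

    public static void Main()
    {
        Console.WriteLine(CreateHexString(new byte[] { 0x00, 0x1A, 0xFF })); // 001AFF
    }
}
```

The table replaces the conditional offset arithmetic with an array index, which is the simplification the comment describes.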
 

Nisus said:

Damon, please fix the code in third sample. It seems that the for-loop is incomplete :(
August 17, 2010 7:46 AM