In this guest editorial, our very own Chris Allen brings us the first of hopefully many Thoughts from the Help Desk. As Halloween has just gone by, he decided it would be appropriate to discuss black magic, a frightening thought, and how to bring your source code back from the dead.
Attention all developers! Unless you're using a military-grade code-protection environment or a seriously sophisticated code-protection tool, your code is vulnerable to decompilation and disassembly; in short: all your source are belong to us. But everyone knows this, right? I didn't think I'd need to give anyone a wake-up call but, as I found out recently on the help desk, there's still a smattering of people who think compilation equals obfuscation. I have some sympathy with such n00bs; I can still remember the awe in which I held the first decompiler I ever used, one for the Psion's EPOC language. I wondered how on earth it was possible to reverse the process of compilation; Type and Data Flow Analysis sounded like so much black magic. But, as in law, ignorance is no defense. It will always be the case that dynamic code is vulnerable to decompilation and discovery (with the honorable exceptions above), and this is especially true with interpreted languages.
So, I write this so that we can all draw a line under any illusions we've clung to, deal with this fact of developer life and then embrace it! Soon enough, you'll be very glad of it. For example, my colleague recently gave me a small application that did a great job of customizing our SQL Compare engine. He only had the runtime assembly and had lost the source code, but I really needed to understand what he had done. One quick flick of the wrist later, Reflector had not only recovered the source but had also created the Visual Studio project for it (I *really* love that feature)! And I've heard many other stories about why Reflector is so useful; the one comment that sticks in my mind is from the developer who said, "If it wasn't for Reflector, I'd be doing a different job".
So how does this 'Dark Art' work? As I say, code has always been vulnerable and "reversible" but, until the invention of the technology behind intermediate languages (such as the one the .NET languages compile to), the job of reversing the code was akin to decryption (think WWII "Enigma" code-breaking; better still, don't think, just watch the film :-) ). Compilers for these languages don't directly generate machine code (which is the really hard bit to reverse-engineer); they generate "IL", each instruction of which has a reasonably clear derivation (often a one-to-one correspondence with the source code, in fact). This intermediate level of code is generally relatively easy to pull apart. If you find this technology a little frightening, it may be comforting to know that decompilation is not an exact science. There isn't a 100% one-to-one correspondence and, in the general case, perfect decompilation is as hard as the classic Halting problem. It's still hard to do very well, but we think we're on top of the game. Welcome to Reflector Pro.
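If you'd like to see for yourself how much of the source survives in intermediate code, here is a minimal sketch. It uses Python bytecode and the standard `dis` module as a stand-in for .NET IL, so treat it as an analogy and not as a picture of how Reflector works. The function `add_tax` and the helpers `recovered_names` and `listing` are made up for this illustration.

```python
import dis


def add_tax(price, rate):
    """A small function whose compiled form we will inspect."""
    total = price * (1 + rate)
    return round(total, 2)


def recovered_names(func):
    """Collect the identifiers that survive compilation in the code object."""
    code = func.__code__
    return {
        "arguments": code.co_varnames[:code.co_argcount],
        "locals": code.co_varnames[code.co_argcount:],
        "globals_used": code.co_names,
    }


def listing(func):
    """Return the bytecode as (opcode name, argument description) pairs."""
    return [(ins.opname, ins.argrepr) for ins in dis.get_instructions(func)]


if __name__ == "__main__":
    # Names of arguments, locals and referenced globals are all still there.
    for kind, names in recovered_names(add_tax).items():
        print(f"{kind}: {names}")

    # Each instruction maps back to a small, recognisable piece of the source.
    # The exact opcode names vary between Python versions.
    for opname, argrepr in listing(add_tax):
        print(f"{opname:<24} {argrepr}")
```

Run it and you should see the argument and local variable names printed straight out of the compiled function, followed by a short list of instructions that read almost like the original two lines of source. That retained structure is what gives a decompiler so much to work with.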
-Guest post by Chris Allen