16.2 Decompilation

Traditionally compilation was a one-way affair[32], and indeed most paid programming revolves, or perhaps revolved, around this concept. As computer science research continues and people use ever higher level languages, decompilation becomes ever more viable. Decompilation is the act of turning a binary file back into source code, though probably and sadly source code lacking comments. This is not to be confused with reverse engineering as a whole, which has always been possible (indeed most of this guide aims to teach methods to do it), although it often takes a large amount of time.

Interpreted languages By and large, languages not closely related to C or assembly are often scripting/interpreted languages rather than truly compiled ones, though this is a rule of thumb rather than a law and the lines can get very blurred (as C# will probably demonstrate). Rather than leaving the program as human readable code, such languages usually convert it to something known as bytecode. It is still faster to work through tidy sequences of numbers of known lengths than to parse human readable text that could be any length, although there are also tools that help shorten the text. The bytecode is either run by an interpreter or eventually turned into machine instructions, sometimes at the start of running and sometimes just before a given piece is needed (a technique known as Just In Time compilation). Either way, said bytecode can frequently be turned back into source code. There are countless interpreted languages, but if you search for "decompiler" along with the name of the language in question you will usually get something. Naturally there are ways to intentionally and unintentionally obfuscate a program, and indeed some of the language runtimes offer methods to do this at various levels.
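The bytecode step described above can be seen directly in Python, which is used here purely as an illustration. This is a minimal sketch, and the function `add_tax` is a hypothetical example rather than anything from a real program. It compiles a small function, then lists the raw bytecode and the names that survive inside it. The exact instruction names vary between Python versions, so no particular output is shown.

```python
import dis


def add_tax(price, rate):
    """Hypothetical example function, used only for illustration."""
    return price + price * rate


# Every Python function carries a code object holding its compiled bytecode.
code = add_tax.__code__

# The raw bytecode: a compact sequence of bytes rather than readable text.
raw_bytecode = code.co_code

# The same bytecode decoded into named instructions.
opnames = [ins.opname for ins in dis.get_instructions(add_tax)]

print(len(raw_bytecode), "bytes of bytecode")
print("names kept:", code.co_varnames)
print("instructions:", opnames)
```

Note that the argument names are still present in the compiled form while any comments are gone, which matches the earlier point that decompiled source tends to be readable but lacks comments.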

C# Although C# is a compiled language and a member of the C family like its “predecessors” C and C++, it is not compiled straight to machine code in the usual case. Instead it is compiled to an intermediate bytecode (CIL) which the .NET runtime turns into machine code, typically just in time. It also comes with a very large collection of libraries and runtimes that programs can call upon; one of the main reasons it was created was in fact to provide a standard collection of them, to stop programs having to carry many and varied versions all over a system. Because the compiled files keep a great deal of information, such as the names of classes and methods and which library functions are called, several tools can separate out the library calls and reconstruct the custom code that was written in the first place. Some of the more popular ones are ILSpy (open source), dotPeek (freeware) and .NET Reflector (paid).

C The decompilation of C++ is not that far advanced at this point in time, but the decompilation of C is somewhat more advanced than it has been in the past. Tools like REC, used in conjunction with the debugging methods above, can do a lot towards getting away from raw assembly.


  1. There is a problem known as the halting problem: no general algorithm can decide, for every possible program and input, whether that program will eventually finish or run forever. In practice this means you cannot fully analyse a program for every possible input, and matters are made worse when the program relies on human input, which is hard to mimic or account for. You can however approximate solutions by running or analysing a program over a typical or constrained set of inputs. (Inputs that cause an error do exist, but much of modern programming is designed to prevent them, such errors being both a cause of crashes and a means by which hackers do what they do.) Decompilation research has put a lot of effort into this approach, as have modern x86 processors (and the compilers and coding techniques for them), which try to predict the most likely path and execute it before being asked to.↩︎