SOTM 33 Answers

The binary uses several means to thwart reverse engineering efforts. Throughout, it uses the Windows exception mechanism to throw and catch exceptions. In the initial stages of the binary, the x86 instruction RDTSC is executed prior to each exception so that eax contains the clock value at the moment the exception is triggered. In the exception handler, all x86 debug registers are reset to zero by directly manipulating the Windows CONTEXT structure whose pointer is passed on the stack. This foils hardware-assisted debuggers. Additionally, the RDTSC instruction is used again in the exception handler to see whether too much time (0xE0000 ticks) has passed (for example, because a debugger has paused the program). If too much time has passed, the eip value saved in the CONTEXT record is adjusted in such a way as to cause the program to crash.

After a number of these exception handling cycles, the program enters a series of decoding loops that in effect peel away the layers of an onion (about 175 in all). At this point what remains is effectively a custom virtual machine (I dubbed it NVM for Nico Virtual Machine), which takes over for the remainder of the program. The first thing the NVM does is decode the innermost portion of the program, which turns out to be a Nico program for the NVM to execute. To unravel the program this far, the reverse engineer must gain an understanding of how NVM instructions get executed. To frustrate this, the NVM also makes extensive use of Windows exceptions and attempts to hide the actual state of the CPU registers by inserting, on average, 170 useless instructions between each instruction that actually does something for the NVM.
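The handler logic described above can be modelled in a few lines. This is only a sketch in Python, not code recovered from the binary: the dictionary stands in for the x86 CONTEXT record, the comparison is assumed to be a strict "greater than", and the way eip is corrupted is a placeholder, since the exact adjustment is not spelled out here.

```python
THRESHOLD_TICKS = 0xE0000  # tick budget quoted in the write-up
DEBUG_REGISTERS = ("Dr0", "Dr1", "Dr2", "Dr3", "Dr6", "Dr7")


def handle_exception(context, start_tick, now_tick):
    """Model of the anti-debug exception handler.

    context    -- dict standing in for the x86 CONTEXT record
    start_tick -- eax value from the RDTSC executed before the exception
    now_tick   -- eax value from the RDTSC executed inside the handler
    """
    # Wipe the hardware breakpoint registers in the saved context.
    for reg in DEBUG_REGISTERS:
        context[reg] = 0

    # eax only holds the low 32 bits of the counter, so subtract modulo 2**32.
    elapsed = (now_tick - start_tick) & 0xFFFFFFFF

    if elapsed > THRESHOLD_TICKS:  # assumption: strict comparison
        # Placeholder corruption: any change that sends execution somewhere
        # invalid has the same effect of crashing the program on resume.
        context["Eip"] = (context["Eip"] + elapsed) & 0xFFFFFFFF
    return context
```

A fast run leaves Eip alone and clears the breakpoints, while a run that was paused in a debugger resumes at a bogus address.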

The code that is actually being protected is a virtual machine program (I dubbed this programming language Nico). The binary includes the virtual machine (NVM) to execute this code. In order to understand exactly what the protected program does, it is first necessary to gain an understanding of how the NVM virtual machine works. To reverse engineer this program, it is not enough to understand what all of the x86 instructions are doing; it is also necessary to understand what the data (the compiled Nico program) will cause the program to do. For more information see the analysis page.

Once it is recognized as a virtual machine, the quickest way to analyze it is to devote time to determining the instruction set of the virtual machine. Once the instruction set has been determined, the embedded byte code can be disassembled and analyzed directly without having to deal with any anti-reverse engineering techniques incorporated into the virtual machine interpreter.
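To illustrate what disassembling the byte code directly means, here is a minimal table-driven disassembler. The opcode values, mnemonics, and operand sizes are invented for the example and are NOT the real NVM instruction set; only the overall shape (look up the opcode, read its operand, advance) is the point.

```python
# HYPOTHETICAL instruction set: opcode -> (mnemonic, operand size in bytes).
# The real NVM opcodes must be recovered from the interpreter itself.
OPCODES = {
    0x01: ("PUSH", 4),
    0x02: ("ADD", 0),
    0x03: ("XOR", 0),
    0x04: ("JMP", 4),
    0xFF: ("HALT", 0),
}


def disassemble(code):
    """Return one text line per decoded instruction in `code` (bytes)."""
    lines = []
    pc = 0
    while pc < len(code):
        opcode = code[pc]
        entry = OPCODES.get(opcode)
        # Unknown opcode or truncated operand: emit a raw byte and move on.
        if entry is None or pc + 1 + entry[1] > len(code):
            lines.append(f"{pc:04x}: .byte 0x{opcode:02x}")
            pc += 1
            continue
        name, size = entry
        if size:
            # Operands are assumed to be little-endian, as on x86.
            value = int.from_bytes(code[pc + 1:pc + 1 + size], "little")
            lines.append(f"{pc:04x}: {name} 0x{value:x}")
        else:
            lines.append(f"{pc:04x}: {name}")
        pc += 1 + size
    return lines
```

Because a tool like this works on the byte code alone, none of the exception or timing tricks in the interpreter come into play.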

Static analysis tools such as disassemblers are best suited to this kind of analysis. Those with scripting capabilities are a plus, as scripts can be used to handle the most repetitive tasks such as decoding loops. Anti-debugging tricks such as zeroing debug registers and measuring elapsed time are ineffective against static disassembly. Emulators are also of great value, as they are generally immune to these same tricks.
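As an example of the kind of script meant here, the sketch below peels a stack of decoding layers without ever running the binary. The per-layer transform (XOR with a one-byte key, then subtract the byte's index) and the key schedule are assumptions made up for the illustration; the binary's real decoders have to be read out of the disassembly one layer at a time.

```python
def decode_layer(data, key):
    """Undo one HYPOTHETICAL layer: XOR with key, then subtract the index."""
    return bytes(((b ^ key) - i) & 0xFF for i, b in enumerate(data))


def encode_layer(data, key):
    """Inverse of decode_layer, used here only to build test input."""
    return bytes(((b + i) & 0xFF) ^ key for i, b in enumerate(data))


def peel(data, keys):
    """Strip the layers in order; keys[0] belongs to the outermost layer."""
    for key in keys:
        data = decode_layer(data, key)
    return data


def wrap(data, keys):
    """Apply the layers so that keys[0] ends up outermost."""
    for key in reversed(keys):
        data = encode_layer(data, key)
    return data
```

With a list of 175 keys, `peel(wrap(payload, keys), keys)` returns the original payload, which is the static equivalent of letting all the decoding loops run.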

The binary expects a password to be supplied on the command line. I recovered the password by reversing the virtual machine program. By observing how the Nico program handles its command line input, it is possible to understand the following:

Portions of the password are used in four separate ways:
The first 4 characters of the password are treated as a 32-bit integer and, after a series of additions and subtractions, are compared against an internally generated value. The first four characters of the password must be "1D3N" or "1D3n".
The 5th and 6th characters are treated as a 16-bit integer and XORed against 2 bytes of virtual machine program information required to cleanly terminate the program later in its execution.
The 5th and 7th characters must sum to 0xA8.
The 6th and 8th characters must sum to 0x8F.

One password that works is "13DNEKcD". For more information see the analysis page.

Bonus Question:

Advanced techniques seen in the wild include some of the following. The name(s) of the obfuscator(s) that implement each technique appear in parentheses.

"Locking" protected binaries to specific hosts by generating keys based on information gathered from the infected host at runtime. (burneye)
On-demand decryption. The entire binary is never fully decrypted at any one time. In a manner similar to virtual memory paging systems, the binary is subdivided into small blocks and only a few blocks are decrypted at any given time. When eip runs off the end of a block, a "page fault" occurs and new blocks are decrypted and paged in to run while old blocks are cleared out of memory. (armadillo, shiva)
Strong encryption without embedded keys. The binary asks for a password after it starts. The password cannot be guessed or reverse engineered without resorting to a brute force attack against the crypto algorithm employed. (shiva, burneye)
Instruction replacement. Instructions are replaced with int3. Internal handlers deal with the resulting exception: the handler notes the location of the interrupt and emulates the replaced instruction before returning from the exception. With this technique, capturing the decrypted binary fails to capture every instruction because some instructions have been overwritten by int3, and a significant reverse engineering effort is required to determine what instruction has been replaced. (armadillo, shiva)
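The on-demand decryption idea can be sketched as a small paging model. This is a toy, not how armadillo or shiva are actually implemented: single-byte XOR stands in for the real cipher, the block size and resident limit are arbitrary, and eviction is assumed to be oldest-first.

```python
from collections import OrderedDict


class PagedImage:
    """Toy model of a binary that is only ever partially decrypted."""

    def __init__(self, encrypted, key, block_size=16, max_resident=2):
        self.encrypted = encrypted
        self.key = key                  # placeholder for a real cipher key
        self.block_size = block_size
        self.max_resident = max_resident
        self.resident = OrderedDict()   # block index -> decrypted bytes

    def _decrypt(self, index):
        start = index * self.block_size
        chunk = self.encrypted[start:start + self.block_size]
        return bytes(b ^ self.key for b in chunk)

    def fetch(self, address):
        """Return the plaintext byte at `address`, paging its block in."""
        index = address // self.block_size
        if index not in self.resident:           # the "page fault"
            if len(self.resident) >= self.max_resident:
                self.resident.popitem(last=False)  # clear the oldest block
            self.resident[index] = self._decrypt(index)
        return self.resident[index][address % self.block_size]
```

Walking every address in order yields the full plaintext, yet at no point do more than `max_resident` decrypted blocks exist, which is why a single memory dump of such a binary is incomplete.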

Honeynet Scan Of the Month 33 Questions and Answers

"Evil Has No Boundaries !"
