Scan of the Month 33 - Peter Ferrie Start: 4-November-2004, 8:30am End: 4-November-2004, 12:16pm 1. Identify and explain any techniques in the binary that protect it from being analyzed or reverse engineered. Loading: - "malformed" PE structure crashes many analysis tools and debuggers Within the file is a section called "NicolasB" which has a negative physical size. Many tools do not support this. Behaviours include an out-of-memory error (Interactive Disassembler Professional), and a BSOD (Soft-Ice). IDA can be coerced into loading the file by using the "manual load" option, and not loading the "NicolasB" section, but it's a waste of time (see below). User-mode debuggers such as Turbo Debug 32, will load the file without difficulty, since the smaller of virtual size and physical size is used to fill the section, and since the virtual size is the smaller of the two (and valid), no error occurs. The NumberOfRvasAndSizes is also a very large value, which causes PEDump to crash, but the Windows loader treats it as an unsigned value and checks if its value is at least [index of directory + 1] when checking for the existence of a data directory, so no error occurs there. Disassembling: - lots of garbage instructions make it hard to see the real instructions The garbage is easy to see, though, since it is a mostly prefixes on instructions that don't support them (eg rep mov bl, 98h). - lots of overlapping instructions make it hard for some disasssemblers This is generally not a problem for IDA Pro, though, but some disassemblers will not separate two instructions if a label appears within them. ie 00DE2203 F2 EB 01 repne jmp short near ptr loc_DE2206+1 00DE2206 22 F2 and dh, dl ; CODE XREF: start+203?j 00DE2208 64 F3 C6 C3 98 rep mov bl, 98h will be displayed, instead of 00DE2203 F2 EB 01 repne jmp short loc_DE2207 00DE2206 22 db 22h 00DE2207 loc_DE2207: ; CODE XREF: start+203?j 00DE2207 F2 64 F3 C6 C3 98 rep mov bl, 98h Tracing: - some less-common instructions (CPUID, RDTSC) are used which can interfere with some CPU emulators - Structured exception handling is used to transfer control, which interferes with many debuggers This is a problem particularly for user-mode debuggers, which tend to break on illegal actions. There is also a calculation of the time taken between the exception occurring and the execution of the SEH handler. This will interfere with debuggers such as Soft-Ice, and users who change the instruction flow to force an exception to occur that their debugger allows (eg TD32 traps illegal memory access, but allows illegal instruction exceptions to occur). However, due to a what appears to be a bug, if the maximum allowed time is exceeded, another exception occurs and the handler is called in an infinite loop, consuming 100% of the CPU time. This behaviour was visible occasionally in a Virtual PC session. This occurs because the CPUID instruction destroys a register that is used by the exception handler code. Additionally, on Windows NT and Windows 2000, the program is terminated silently after the first exception occurs. This occurs because the CPUID instruction destroys a register that the Windows NT/2000 SEH handler itself uses. Windows XP explicitly preserves the needed registers across SEH calls. - the APIs that are used (msvcrt!printf, kernel32!GetTickCount, kernel32!GetCommandLineA, kernel32!ExitProcess) have their first four bytes scanned for the presence of an INT 3 breakpoint instruction. The INT 3 opcode (CC) is disguised as the value 660h, which is then shifted right three times. If one is found, the program will use the value returned by RDTSC as an address from which to continue execution, leading to an exception which causes the program to be terminated silently. - many (175) layers of encryption hide the strings that are displayed and the rest of the code - a structure of function pointers is used instead of calling the routines directly to perform the decryption of each layer The first structure member is the RVA of the handler that assigns the address of the encrypted buffer. The second structure member is the obfuscated RVA of the handler that moves to the next state. The third structure member is the RVA of the handler that simply returns. The fourth structure member is the obfuscated RVA of the handler that moves to the next state. The fifth structure member is the RVA of the handler that assigns the size of the encrypted buffer. The sixth structure member is the obfuscated RVA of the handler that moves to the next state. The seventh structure member is the RVA of the handler that decrypts the encrypted buffer. The eighth structure member is the obfuscated RVA of the handler that moves to the next state (moves to next layer). - transfer of control is also done using add-relative values This appears as CALL -> ADD [ESP], value -> RET, with branching in some cases to a location more than a hundred kilobytes away. - after each layer is decrypted, there is a check for a breakpoint on the instruction that transfers control to the newly-decrypted code - a very cool p-code language that checks the command-line for the correct code (see below) 2. Something uncommon has been used to protect the code from beeing reverse engineered, can you identificate what it is and how it works? - what Nico calls a Virtual CPU, a byte-code language that is used to perform the checks of the command-line, decrypt the code, etc It consists of a sequence of opcodes - a group number, followed by a function number - followed by a list of parameters. group function parameters description 00 00 dword PUSH dword 01 dword PUSH dword (VA) 02 byte PUSH register[byte] 03 byte POP register[byte] 04 byte POP stack by [byte] 01 00 JMP SPECIAL and pop stack 01 EXCPT 02 JMP SPECIAL and pop stack 03 byte1, 00, byte2 XOR byte memory[register[byte2]], register[byte1] byte1, 01, byte2 XOR word memory[register[byte2]], register[byte1] byte1, 02, byte2 XOR dword memory[register[byte2]], register[byte1] 04 byte, dword ADD register[byte], dword 05 byte, dword SUB register[byte], dword 06 byte, dword AND register[byte], dword 07 byte, dword OR register[byte], dword 08 byte, dword XOR register[byte], dword 09 byte1, byte2 ADD register[byte2], register[byte1] 0A byte1, byte2 CMP register[byte2], register[byte1] 02 00 EXIT (calls ExitProcess()) 01 dword CALL [dword], store result at register[0] (import) 02 dword, byte MOV register[byte], dword 03 byte DEC register[byte] 04 byte INC register[byte] 05 byte MOV register[byte], 0 (actually, AND is used, but the result is the same) 06 byte, word REPNE SCASB (EAX=byte, ECX=word, EDI=register[0], store result at register[0]) 07 BREAK 08 byte1, byte2 MOV register[byte2], dword memory[register[byte1]] 09 byte BSWAP register[byte] (bug: result is discarded) 0a byte1, byte2 MOV register[byte2], byte memory[register[byte1]] 0b byte1, byte2 MOV register[byte2], word memory[register[byte1]] 03 00 CMP stack[0], stack[1] 04 00 dword JMP dword 01 dword JNE dword 02 dword JE dword 05 00 dword CALL dword (local proc) 01 RET 3. Provide a means to "quickly" analyse this uncommon feature. - in this case, a debugger is probably the best way to analyse this feature, since there are more layers of encryption here It is not necessary to understand how it works, given that the most important instructions are easy to see: ADD/SUB/INC/DEC and the CMP. Everything else can be ignored. By tracing through the code that transforms the command-line values into their encrypted form for comparison, it is easy to see what the required values are. - another method would be to disassemble the p-code engine, if the decrypted body can be dumped from memory In that case, it is necessary to understand how to skip the garbage code to find the real instructions. - in the case of a binary using only this Virtual CPU, a disassembler for it is really possible, given the documentation above That would be the best solution in that case. However, once I got used to how the engine worked, I was able to step through the code very quickly. 4. Which tools are the most suited for analysing such binaries, and why? - for binaries such as the challenge itself, CPU emulators are the best for that. They can be customised to avoid the uninteresting instructions. They can break on memory accesses. They can handle the SEH tricks without breaking. - for binaries that use only this Virtual CPU, a disassembler is the best for that (see attached disassembler) 5. Identify the purpose (fictitious or not) of the binary. - To display this message, if the correct command-line is used: Welcome... Exploit for it doesn't matter 1.x Courtesy of Nicolas Brulez 6. What is the binary waiting from the user? Please detail how you found it. - the command line which is any one of: "1D3NGBaM", "1D3nGBaM", "1D3NGCaL", "1D3nGCaL", "1D3NCCeL", "1D3nCCeL" I found it using my private CPU/OS emulator (Atlantis, see http://pferrie.tripod.com for a description). The obfuscation code has a weakness that the garbage sequences are wrapped in PUSHAD+POPAD instructions, so I customised my emulator to skip them. It always handles SEH without any trouble, and I already reverse-engineered the Windows PE loader so my emulator could also load the files without any problem. I let the program run until I saw an API called. I traced from there and saw it store the address of the command-line at a certain address. I set a breakpoint on any access to this new address. Running to that point bypassed entirely the 175 encryption layers, and landed close to the start of the p-code. Stepping through the p-code was easy enough because it is very short, and I had only to see what values were compared, then to adjust my values appropriately to match. Eventually, I went back and studied the code in detail. Finally, I wrote a disassembler and used that. Here is the commented disassembly: Bruit - Disassembler for the Brulez VCPU, by Peter Ferrie 2004 // Decrypt command-line parameter verification code 00e1ba3d MOV R3, 0000020e 00e1ba44 PUSH 00e1ba65 00e1ba4a POP R0 00e1ba4d MOV R2, 00000053 00e1ba54 XOR b[R0], R2 00e1ba59 INC R0 00e1ba5c DEC R3 00e1ba5f JNE 00e1ba54 // Initialise key register 00e1ba65 MOV R1, 5cc80e31 // (10) // Call GetCommandLineA() 00e1ba6c CALL [00de100d] // Search command-line for space character 00e1ba72 SCAN [R0], 20, 0255 // Exit if no space found, else virtual address is returned in R0 00e1ba77 JE 00e1bb6f // Get first four characters of command-line and begin verification 00e1ba7d MOV R2, d[R0] // (20) 00e1ba81 ADD R2, 1d9bdc45 // (21) 00e1ba88 ADD R1, 74519745 // (11) 00e1ba8f SUB R2, ad45dfe2 // (22) 00e1ba96 ADD R1, deadbeef // (12) 00e1ba9d ADD R2, 68656c6c // (23) 00e1baa4 SUB R1, 17854165 // (13) 00e1baab SUB R2, 41776169 // (24) 00e1bab2 ADD R1, 73686f77 // (14) 00e1bab9 ADD R2, 69747320 // (25) 00e1bac0 SUB R1, 206e6f20 // (15) 00e1bac7 ADD R2, 64726976 // (26) 00e1bace ADD R1, 6d657263 // (16) 00e1bad5 SUB R2, 6e757473 // (27) 00e1badc SUB R1, 79212121 // (17) 00e1bae3 SUB R2, 65683f21 // (28) 00e1baea AND R2, dfffffff // (29) // Compare the two values and continue only if they are equal 00e1baf1 PUSH R2 00e1baf4 PUSH R1 00e1baf7 CMP STK[0], STK[1] 00e1baf9 JE 00e1bb06 // Otherwise, set the Z flag to force the branch to be taken (equivalent to JMP, but obfuscated a bit) 00e1bafd AND R3, 0 00e1bb00 JE 00e1bb6f /* Now we combine the elements to yield the simultaneous equations to solve: (10) + (11) + (12) - (13) + (14) - (15) + (16) - (17) = X and ((20) + (21) - (22) + (23) - (24) + (25) + (26) - (27) - (28)) & (29) = X we fill in the values: 5CC80E31 + 74519745 + DEADBEEF - 17854165 + 73686F77 - 206E6F20 + 6D657263 - 79212121 = DF807499 and (C + 1D9BDC45 - AD45DFE2 + 68656C6C - 41776169 + 69747320 + 64726976 - 6E757473 - 65683F21) & DFFFFFFF = DF807499 There are two values for C, because of the & DFFFFFFF, and they are "1D3N" (4E314431) and "1D3n" (6E334431) (Actually, there are more, but it requires the use of 8-bit characters) */ // Skip the first four characters 00e1bb06 INC R0 00e1bb09 ADD R0, 00000002 00e1bb10 INC R0 // Get next two characters of command-line and continue verification 00e1bb13 MOV R1, w[R0] 00e1bb17 PUSH R1 00e1bb1a POP R2 // Save current position in command-line 00e1bb1d PUSH R0 // Point to particular p-code location 00e1bb20 PUSH 00d8360d 00e1bb26 POP R0 00e1bb29 ADD R0, 00098548 00e1bb30 DEC R0 // -> e1bb54 /* Decrypt opcode at 00e1bb54 using the two command-line characters as the key The initial values at that location are "GB" (47 42) , followed by 03 00 02 44 By checking the p-code documentation, we can see what values are possible replacements, which can only be something that takes a single dword as parameter Therefore, they are: 00 00 (PUSH dword), 00 01 (PUSH dword), or 04 01 (JNE) This last one works because the virtual Z flag happens to be clear, otherwise 04 02 (JE) would be the alternative This leaves us with the values "GB" (47 42), "GC" (47 43), "CC" (43 43) So far, we have a command-line of any of: "1D3NGB", "1D3nGB", "1D3NGC", "1D3nGC", "1D3NCC", "1D3nCC" */ 00e1bb33 XOR w[R0], R2 // Restore current position in command-line 00e1bb38 POP R0 // Combine bytes 4 and 6 of the command-line 00e1bb3b MOV R2, b[R0] // (20) 00e1bb3f ADD R0, 00000002 00e1bb46 MOV R1, b[R0] 00e1bb4a ADD R1, R2 // (21) // Call subroutine to verify the value (does not return if mismatch) 00e1bb4e CALL 00e1bb86 // Call printf() to display "Welcome...\r\nExploit for it doesn't matter 1.x Courtesy of Nicolas Brulez" 00e1bb54 PUSH 731b5412 00e1bb5a PUSH 00e1bc2a 00e1bb60 CALL [00de1001] 00e1bb66 POP 8 00e1bb69 JMP 00e1bb84 // Call printf() to display "Please Authenticate!" 00e1bb6f PUSH 00000000 00e1bb75 PUSH 00e1bc13 00e1bb7b CALL [00de1001] 00e1bb81 POP 8 // Exit program 00e1bb84 EXIT // Perform obfuscated assignment of value to compare 00e1bb86 MOV R1, 0000004c // (10) 00e1bb8d INC R1 // (11) 00e1bb90 INC R1 // (12) 00e1bb93 ADD R1, 00000005 // (13) 00e1bb9a DEC R1 // (14) 00e1bb9d SUB R1, 00000004 // (15) 00e1bba4 SUB R2, 0000005a // (22) // Compare the two values and continue only if they are equal 00e1bbab CMP R1, R2 00e1bbaf JNE 00e1bb6f /* Now we combine the elements to yield the simultaneous equations to solve: (10) + (11) + (12) + (13) - (14) - (15) = X and (20) + (21) - (22) = X we fill in the values: 4C + 01 + 01 + 05 - 01 - 04 = 4E and C1 + C2 - 5A = 4E We know from above that C1 can be only "G" (47) or "C" (43), so we C2 can be only "a" (61) or "e" (65), respectively Now we have a command-line of any of: "1D3NGBa", "1D3nGBa", "1D3NGCa", "1D3nGCa", "1D3NCCe", "1D3nCCe" */ // Combine bytes 5 and 7 of the command-line 00e1bbb5 DEC R0 00e1bbb8 MOV R2, b[R0] // (20) 00e1bbbc ADD R0, 00000002 00e1bbc3 MOV R1, b[R0] 00e1bbc7 ADD R2, R1 // (21) // Perform obfuscated assignment of value to compare 00e1bbcb INC R2 // (22) 00e1bbce SUB R2, 0000004e // (23) // Point to particular p-code location 00e1bbd5 PUSH 00e0dd64 00e1bbdb POP R0 00e1bbde ADD R0, 0000deac 00e1bbe5 INC R0 // -> e1bc11 /* Decrypt opcode at 00e1bc11 using the command-line character as the key The initial value at that location is "G" (47) , followed by 01 By checking the p-code documentation, we can see what values are possible replacements, which can be only something that takes no parameters Therefore, it is: 05 (RET) This leaves us with the value "B" (42) Now we combine the elements to yield the key: (20) + (21) + (22) - (23) = X we fill in the values: C1 + C2 + 01 - 4E = 42 We know from above that C1 can be only "B" (42) or "C" (43), so C2 can be only "M" (4D) or "L" (4C), respectively Now we have a command-line of any of: "1D3NGBaM", "1D3nGBaM", "1D3NGCaL", "1D3nGCaL", "1D3NCCeL", "1D3nCCeL" */ 00e1bbe8 XOR b[R0], R2 // Also decrypt "Welcome" text using command-line character as the key 00e1bbed MOV R3, 00000049 00e1bbf4 PUSH 00e1bc2a 00e1bbfa POP R0 00e1bbfd INC R2 00e1bc00 XOR b[R0], R2 00e1bc05 INC R0 00e1bc08 DEC R3 00e1bc0b JNE 00e1bc00 00e1bc11 RET 00e1bc13 db "Please Authenticate!\r\n" 00e1bc2a db "Welcome...\r\nExploit for it doesn't matter 1.x Courtesy of Nicolas Brulez" 00e1bc73 END // Done! Our full command-line is therefore any one of: "1D3NGBaM", "1D3nGBaM", "1D3NGCaL", "1D3nGCaL", "1D3NCCeL", "1D3nCCeL" -- Funny strings that I found in the code: "HAX0" "Evil Has No Boundaries !" 31337 "hellAwaiits " ("hell Awaits") "show no mercy!!!" "drivnutseh?!" ("drive nuts, eh?!")