Analysis
Introduction
Rough information on the file
Going into the details
Automation of the tasks
The "p-code" engine
Breaking the password protection
Conclusion
Answers
-- $ md5sum 0x90.exe 7daba3c46a14107fc59e865d654fefe9 *0x90.exe --A file format analyzer will inform us on the the nature of the file :
-- $ file 0x90.exe 0x90.exe: MS-DOS executable (EXE), OS/2 or MS Windows --Thus we are now sure that the file is indeed a Windows executable (Portable Executable [1]), and should gather some more information about its structure and attributes. To fulfill this, I will use a PE editor [2] :
DATA
) leads us to think that the
program was modified in some way. E0000020
is
IMAGE_SCN_CNT_CODE | IMAGE_SCN_MEM_EXECUTE | IMAGE_SCN_MEM_READ |
IMAGE_SCN_MEM_WRITE
, meaning the section contains code, can be read,
written and executed. C0000040
is
IMAGE_SCN_CNT_INITIALIZED_DATA | IMAGE_SCN_MEM_READ |
IMAGE_SCN_MEM_WRITE
meaning the section contains initialized data, can be
read and written : it is the typical layout of a packed executable.-- $ strings 0x90.exe MZP This program must be run under Win32 DarthPE CODE DATA NicolasB .idata [...] msvcrt.dll KERNEL32.dll printf GetTickCount GetCommandLineA ExitProcess You really thought you would find strings eh? ;-) Scan of the month coded by Nicolas Brulez / Digital River --We won't learn much more with that, except that it indeed looks crippled and has only a few visible linked DLLs and imports (described in details on figure 2 thanks to [4]).
-- $ ./0x90 Please Authenticate! --Trying several parameters wont make any changes.
-- 00DE22E3 33DB XOR EBX,EBX 00DE22E5 8F03 POP DWORD PTR DS:[EBX] --Since the address 0x00000000 isn't in a writeable address space, the write error was likely to happen and was done on purpose. A few commands before, the author has taken care of installing his own exception handler thanks to those lines :
-- 00DE22D7 33C0 XOR EAX,EAX 00DE22D9 64:FF30 PUSH DWORD PTR FS:[EAX] 00DE22DC 64:8920 MOV DWORD PTR FS:[EAX],ESP --When a debugger is attached to a process, it will catch an occuring exception, thus preventing the execution flow to follow the course envisaged by the author. It is a very common anti-debugging trick that can be dealt with quickly in OllyDbg : check the box 'Memory access violation' (see figure 6) in the dialog described earlier and all these kind of exceptions will now be passed directly to the program.
-- 00DE22DF 0FA2 CPUID 00DE22E1 0F31 RDTSC --The Intel manual says about
RDTSC
, "This instruction loads the
current value of the processor’s time-stamp counter into the EDX:EAX registers.
The time-stamp counter is contained in a 64-bit MSR. The high-order 32 bits of
the MSR are loaded into the EDX register, and the low-order 32 bits are loaded
into the EAX register. The processor increments the time-stamp counter MSR every
clock cycle and resets it to 0 whenever the processor is reset." The
RDTSC
instruction is often used in a benchmarking context, but can
also be used in software protection to check for the time elapsed between two
portions of code. If it is too high, the program is likely being debugged. Let's
dig into that. In the exception handling function for the current context, we
find :
-- 00DE22B9 0FA2 CPUID 00DE22BB 0F31 RDTSC 00DE22BD 2B0424 SUB EAX,DWORD PTR SS:[ESP] 00DE22C0 83C4 04 ADD ESP,4 00DE22C3 3D 00000E00 CMP EAX,0E0000 00DE22C8 77 03 JA SHORT 0x90.00DE22CD --So if there is more than 0xE0000 clock cycles between the first
RDTSC
and the second one in the exception handler, something
happens : just try, the program crashes. In order to avoid this, set a condition
to stop trace (shortcut ctrl-T), check 'Command is one of' and add CMP
EAX,0E0000
in the edit box (see figure 7).
EAX
to 0 and run a trace again. On a
convenient purpose, we will add the breakpoint on the entry point of the
program :
-- 0x90.osc -- bp 0DE2000 // break at EP run bc eip // clear the breakpoint loop: ti // trace into - I am not sure how to use ticnd for my purpose mov eax, 0 // reset EAX jmp loop // and loop --After running that script, OllyDbg does what it is supposed to do, but then traces a lot without breaking. How come ? Let's pause the trace (shortcut F12), and check in the trace log what takes so long. It takes a second to see that we landed into a XORing loop with a pretty high number of iterations, that corresponds to a layer of decryption of the binary.
-- 00E2635B 3008 XOR BYTE PTR DS:[EAX],CL 00E2635D 40 INC EAX 00E2635E 49 DEC ECX 00E2635F 85C9 TEST ECX,ECX 00E26361 ^75 F8 JNZ SHORT 0x90.00E2635B --The number of commands that OllyDbg can process when running a trace is quite low [7], which explains why it takes so long. Nonetheless, we can quickly get the address of the last comparison before the
XOR
:
0xDE4BDC.POPAD
and
PUSHAD
. Thanks to a small perl
script, we can extract the core
instructions of this part. Basically, what is done here is a check on
the 4 first bytes of 3 addresses for the presence of the byte
0xCC
. This byte - opcode for the INT3
instruction - is used by debuggers to set a breakpoint on a command. These
addresses (0x7C812C8D
, 0x77C1186A
,
0x7C81CAA2
) are respectively the ones of the APIs
GetCommandLineA
, printf
and ExitProcess
.
Once again, this is a common anti-debugging trick : one can't put a breakpoint
on the API or the program will crash. An easy way to counter that is to put the
breakpoint more than 4 bytes away from the beginning of the function, what we
will do from now :) (including the GetTickCount
one)JNZ
, run, and trace
after you reached the breakpoint and you get directly to the next CMP
EAX,0E0000
at 0xE263B2, but our automation scheme falls appart. Moreover,
looking at the XOR
mechanism, we see that it decrypts the code to
be executed in a few cycles, another anti-reverse technique preventing from
doing whatever we want with the process.CMP EAX,0E0000
, the next one has
already been decrypted by the XOR loop. So our new method will imply searching
the memory for the next sequence 3D00000E00
(being CMP
EAX,0E0000
), setting a breakpoint on the address, running, set
EAX
to 0 when we reach the breakpoint, clear the breakpoint and
loop. Since we can't search backward, we will start 400 bytes before actual
EIP
and search forward. We will also adapt this method to the first
part of the code :
-- 0x90.osc -- bp 0DE2000 run bc eip var count mov count, 0 part1: inc count findop eip, #3D00000E00# bp $RESULT run bc eip mov eax, 0 cmp eip, 0DE4BDC jb part1 log count bp 0E263B2 run bc eip mov eax, 0 var address mov count, 0 part2: inc count mov address, eip sub address, 400 findop address, #3D00000E00# cmp $RESULT, eip je end2 bp $RESULT run bc eip mov eax, 0 jmp part2 end2: log count --The last address matching our criterias is 0xE1C4B7. In addition we now know that there are 8 blocks for the first part, and 175 for the second one (meaning at least 175 XOR decryption layers). Tracing from there (with the usual condition), we find two more comparisons at 0xDE8692 and 0xDE86FF for which with have to clear
EAX
again.
0x02 0x02 0x0A 0x15 0x02 0x51 0x03
:
; First index is 0x02 00DE9C9E Main MOVZX EAX,BYTE PTR DS:[ESI] ; EAX=00000002 [...] ; Table is 0xE1B9F5 00DE9F5D Main MOV EDI,DWORD PTR DS:[EAX*4+E1B991] ; EDI=00E1B9F5 [...] ; Second index is 0x02 00DEA213 Main MOVZX EAX,BYTE PTR DS:[ESI+1] ; EAX=00000002 [...] ; Task address is 0xDFDCA3 - this code is located in an exception handler 00DEA4D2 Main MOV EAX,DWORD PTR DS:[ECX+B0] ; EAX=00000002 00DEA4D8 Main MOV EDI,DWORD PTR DS:[ECX+9C] ; EDI=00E1B9F5 00DEA4DE Main MOV EAX,DWORD PTR DS:[EDI+EAX*4] ; EAX=00DFDCA3 [...] ; First argument is 0x5102150A - 4 bytes - this code is located in the task 00DFDCAD Main MOV EAX,DWORD PTR DS:[ESI+2] ; EAX=5102150A [...] ; Some calculation is done here 00DFDF5E Main ADD EAX,AEFDED04 ; EAX=0000020E [...] ; Second argument is 0x03 - 1 byte 00DFE20E Main MOVZX EDI,BYTE PTR DS:[ESI+6] ; EDI=00000003 [...] ; Result is saved in a table 00DFE4DC Main MOV DWORD PTR DS:[EDI*4+DE8736],EAX ; Command is 7 byte long, go to next command in sequence 00DFE4E3 Main ADD ESI,7 ; ESI=00E1BA44The aim of the first part of the sequence of bytes is to decrypt the last part of that same sequence, which will be executed later.
Index | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | |
00E1B9A9 | 0 | 00DEA4FA | 00DEC36A | 00E173B7 | 00E1B0C1 | 00E14CC1 | |||||||||
00E1B9BD | 1 | 00DF1A7B | 00DF442F | 00DF59D2 | 00DF74EF | 00E181A6 | 00E1A05B | 00E1226D | 00E14128 | 00E016ED | 00E05353 | 00DFB948 | 00DF7FCB | 00DF9048 | 00DFA319 |
00E1B9F5 | 2 | 00DEE16F | 00DEE947 | 00DFDCA3 | 00E09AB5 | 00E0D4BC | 00E11477 | 00E15B11 | 00E1B92B | 00E0E816 | 00E0C6C9 | 00E0B5E7 | 00E03D6A | ||
00E1BA25 | 3 | 00DFEFDC | |||||||||||||
00E1BA29 | 4 | 00E034FD | 00E07989 | 00E0F8A7 | |||||||||||
00E1BA35 | 5 | 00E13604 | 00E15554 |
GetCommandLineA
which happen to be called from 0xDEF94D.
GetCommandLineA
). Let's detail the checks.REPNE SCAS BYTE PTR ES:[EDI]
at offset
0xE1658B. The part of the command line located after the space will be refered
to hereafter as the password. Then the four first bytes of the password are
loaded as a 32 bit number, I will call it X, that undergoes several arithmetical
operations to give a number Y:
X + 0x1D9BDC45 - 0xAD45DFE2 + 0x68656C6C - 0x41776169 + 0x69747320 + 0x64726976 - 0x6E757473 - 0x65683F21 = Ythat can be summerize to :
X + 0x914D3068 = YConcurrently, another number Z is calculated :
0x5CC80E31 + 0x74519745 + 0xDEADBEEF - 0x17854165 + 0x73686F77 - 0x206E6F20 + 0x6D657263 - 0x79212121 = 0xDF807499 = ZAt offset 0xDFF51B (this script will lead you there), a check verifies that Y is equal to Z meaning that
X +
0x914D3068 = 0xDF807499
, and so X = 0x4E334431
. The first
four characters of the password are then '1D3N'
.C + E - 0x5A = 0x4Eso the first condition on the second part of the password is
E = 0xA8 -
C
. We can update our password accordingly and continue tracing. Then
a value X is generated according to the following formula :
(D + F + 1 - 0x4E) ^ 0x47 = XThis byte X is in fact the first byte of a p-code instruction, the second byte being 0x01. According to our table, this byte is such as
0 <= X <=
5
. Furthermore, we know that the command will not have any parameter
since it is located at the end of the sequence before the 'Please Authenticate!'
string, and that it must change ESI
for execution flow to continue.
After a few tries, we find that for X = 5
, the task executes a
POP ESI
(at 0xE1555E) without taking any parameter and then
perfectly fits our needs :) The second condition that has to be matched is
F = (0x47 ^ 0x05) + 0x4E - 1 - D
, in short F = 0x8F -
D
.ESI
) now points to a memory location (OxE1BB54) looking
like [Y Z 0x03 0x00 0x02 0x44 ...]
with Y = C ^
0x47
and Z = D ^ 0x42
. We now have to find C and D so that
this part of the sequence will contain valid p-code opcodes. This file should be compiled and run to generate the
passwords for all the couples (X, Y)
. One solution would be to try
them all, another one would be to look a bit closer at the p-code sequence and
see that the length of the opcode is apparently 3 with a 1 byte parameter
0x03
, and look into the trace for an opcode looking like that. We
observe that the couples (2, 3)
and (2, 4)
verify this
(as well as the previously unused one (2, 9)
).-- $ ./0x90 1D3NEKcD Welcome... Exploit for it doesn't matter 1.x Courtesy of Nicolas Brulez --When all checks are passed, the program does an
ExitProcess
with 0
as error code, usually meaning everything went well. We are done :)
JMP
and CALL
instructions, inclusion of unexecuted bytes, as well as long portions of code
that will no effect on the overall execution flow. On can find the "core"
instructions between commands POPAD
and PUSHAD
or
equivalent. XOR EBX,EBX
POP DWORD PTR DS:[EBX]
RDTSC
instruction twice (in the "regular" code, then in the
exception handler), the program calculates the number of CPU clock cycles
elapsed between the two instructions. If that number is too high, the program is
likely being debugged and must be terminated - usually in a dirty way in order
to confuse the analyst.RDTSC
checks have been added in each
layer).
strcmp()
, but uses more advanced schemes
including code generation from parts of the given string. It will be detailed
hereafter.GetCommandLineA
and undergoes a series of reversible
checks in 2 parts. The first part deals with the first 4 characters of the
password treated as a 32-bit number (dword), the second part deals with the next
4 characters treated as 4 8-bit numbers (byte). The first half of the password
can be determined uniquely, whereas some p-code opcodes are generated from the
second half as explained in the analysis, introducing several possibilities.
Finally, a working password (I insist on the fact that it is not unique) can be
calculated and used to launch the exploit :CMP EAX,0E0000
I used to
circumvent the clock cycles count check.