Analysis

Tools used

For this analysis I used the following tools, all of which are free (as in beer), at least for trial purposes:
- IDA 4.2.1 evaluation version (wouldn't it be nice to own a fully working version *hint* ;)).
- VMware Workstation, to create a safe sandbox system in which I could observe the binary in the wild.
- GDB. Debuggers are always among the most powerful tools in reverse engineering; someone go and port SoftICE to Unix!
- Biew, to do some hex editing and make changes to the-binary that make it easier to debug.
- Strace, a very powerful reverse engineering tool that lets you view system calls as they occur.
- Ethereal, my favorite packet dumper.
- Sendip, a utility to send raw IP packets from the command line.

Please note that the tool Fenris also looks very promising and might have made my work a lot easier, but I did not know of its existence when I did my analysis.

Getting started

The webpage reads:
This is an un-trusted tool developed and used by the blackhat community, do not use a production system to analyze it, nor any system with a connection to a production network.
I took this warning seriously, and because I did not have a spare computer lying around to do my analysis on, I used VMware to solve this problem. After getting the trial version and trial key I installed a clean Debian system on it, fetched the-binary, checked its MD5, and installed the tools I wanted to run on the system (gdb, strace, biew etc.) using apt-get.
After the installation was done I shut down the system and configured it to use a 'host-only' network connection, so that it could not reach the internet or my local network through my system. I could now run ethereal on the host system to capture packets sent out by the infected OS, and use ssh on the host system to log into the infected system.

Starting the analysis

Now we have a nice sandbox where we can play with the-binary all we want without it doing any harm to our system or network. Let's use strace to find out what the-binary really does. I fired up the-binary using strace -f -i (follow forks and print eip). We see that it calls geteuid (to see if it's running as root), then forks twice in rapid succession (probably to fool some debuggers), then chdirs to / and closes all its file handles. So far nothing really interesting. The interesting part starts when it creates a RAW socket for IP protocol 11. According to the protocol lists this is NVP, the Network Voice Protocol, a protocol that is ignored by default by most IDS systems. After that the program hangs on a recv on this socket. In other words, it is waiting for traffic using IP protocol 11. So let's feed it some: I used the tool sendip to send it a string of 100 A's. This simply puts it back at the recv call to receive more data.
It's now time to use the most powerful tool in reverse engineering history: IDA. Of course our hacker applied some tricks so that IDA no longer recognizes the names of the libc functions. But an experienced reverse engineer can easily re-identify most of them (when I was done I saw that a signature file had even been released for this version of libc that can identify all libc functions in the-binary). When we apply this method to all other functions called so far we get a better view of the binary. Let's look at the interesting part:
.text:080482B0                 push    0
.text:080482B2                 push    800h
.text:080482B7                 lea     eax, [ebp+var_800]
.text:080482BD                 push    eax
.text:080482BE                 mov     ecx, [ebp+var_44C8]
.text:080482C4                 push    ecx
.text:080482C5                 call    recv            ; receive command
.text:080482CA                 mov     esi, eax
.text:080482CC                 add     esp, 10h
.text:080482CF                 mov     edx, [ebp+var_44D0]
.text:080482D5                 cmp     byte ptr [edx+9], 0Bh ; check if it is protocol 0xb
.text:080482D9                 jnz     loc_8048EB8     ; if not receive again
.text:080482DF                 mov     ecx, [ebp+var_44D4]
.text:080482E5                 cmp     byte ptr [ecx], 2 ; check if first byte is 0x02
.text:080482E8                 jnz     loc_8048EB8     ; receive again
.text:080482EE                 cmp     esi, 0C8h       ; check if length is more than 0xC8 (200) bytes
.text:080482F4                 jle     loc_8048EB8     ; receive again
We can conclude that the message should start with 0x02 and has to be more than 200 bytes long (the jle rejects a length of exactly 200); otherwise it is simply ignored. So let's fire up sendip again and send it a message of over 200 bytes starting with 02. I sent it 02 repeated 300 times. Still no luck: strace shows it simply going back into the receive loop again. Let's see what the-binary does with our received command after the first three checks.
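The three checks in the listing can be restated as a small Python sketch. The function name passes_checks is made up for illustration, and the sketch assumes the message starts at offset 20 of the received buffer (an IP header without options); the listing itself only shows pointers that were set up earlier. The length test comes first here purely to avoid indexing past the end of a short buffer; the binary performs it last.

```python
IP_PROTO_OFFSET = 9     # offset of the protocol field in the IP header
PROTO_NVP = 0x0B        # IP protocol 11
PAYLOAD_OFFSET = 20     # assumption: IP header without options
MAGIC = 0x02            # first byte of the message
MIN_LEN = 0xC8          # lengths of 200 or less are rejected (jle)


def passes_checks(packet: bytes) -> bool:
    """Return True if a raw packet would get past the three checks."""
    if len(packet) <= MIN_LEN:
        return False
    if packet[IP_PROTO_OFFSET] != PROTO_NVP:
        return False
    return packet[PAYLOAD_OFFSET] == MAGIC


if __name__ == "__main__":
    header = bytearray(20)
    header[0] = 0x45                    # IPv4, header length 5 words
    header[IP_PROTO_OFFSET] = PROTO_NVP
    print(passes_checks(bytes(header) + b"\x02" * 300))
```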

The decryption routine

We see it feeds our message to a sub at 804A1E8. This turns out to be a decryption function that is in essence very simple, yet programmed in a very clumsy way. If you are interested in the analysis of decryption routines, read one of the hundreds of essays on key generators, which are very similar to encryption/decryption routines. The algorithm used can be expressed using the following piece of perl:
for($x=0;$x<length($enc)-1;$x++) {
  $dec .= chr((ord(substr($enc,$x+1,1))-ord(substr($enc,$x,1))-0x17) % 256);
}
In other words: d[x] = (e[x+1]-e[x]-0x17) mod 256, where e is the encrypted string and d is the decrypted string. After the decryption it checks the second byte of the decrypted string and executes a different piece of code depending on its value. In other words, there are 12 different types of packets, which we will analyze one by one below.
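For experimenting with your own packets it helps to have the inverse as well. The following Python sketch ports the formula above and adds a matching encoder; the names decode and encode and the seed parameter (the arbitrary first encrypted byte) are my own and do not come from the-binary.

```python
KEY = 0x17


def decode(enc: bytes) -> bytes:
    """d[x] = (e[x+1] - e[x] - 0x17) mod 256, as in the perl above."""
    return bytes((enc[x + 1] - enc[x] - KEY) % 256 for x in range(len(enc) - 1))


def encode(dec: bytes, seed: int = 0) -> bytes:
    """Inverse of decode: e[x+1] = (d[x] + e[x] + 0x17) mod 256.

    The seed is the first encrypted byte; any value decodes correctly
    under this formula, so it is a free choice in this sketch.
    """
    enc = bytearray([seed % 256])
    for d in dec:
        enc.append((d + enc[-1] + KEY) % 256)
    return bytes(enc)


if __name__ == "__main__":
    message = b"\x00\x02hello"
    assert decode(encode(message)) == message
    print(encode(message).hex())
```

The encoded string is always one byte longer than the plain text, because each decoded byte needs its predecessor.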

Packet type 0

When we send a packet of type 0 to the-binary and look at the strace output, we can see that it sends some data over the network to 0.0.0.0 and then goes back into the receive loop to wait for its next command. We can have a look at the data it sends by using strace -xx -s 50000 -f, which causes strace to output the data the-binary sends in hexadecimal form. When we decode this data using the same algorithm the program uses to decode its commands, we see that the decoded packet is the entire encrypted packet we sent, with some 0's and garbage appended. So packet type 0 appears to be a ping packet that lets the blackhat see whether the-binary is running. The destination address 0.0.0.0 can be changed, as we will see later on (when we analyze packet type 1).
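Turning the strace output back into bytes by hand gets tedious, so here is a minimal Python sketch for it. It assumes the buffer appears in the log as a run of \xNN escapes, which is how strace -xx prints strings; the helper name strace_hex_to_bytes is made up, and decode repeats the formula from the decryption section.

```python
import re

KEY = 0x17


def strace_hex_to_bytes(text: str) -> bytes:
    """Collect every \\xNN escape in a piece of strace -xx output."""
    return bytes(int(h, 16) for h in re.findall(r"\\x([0-9a-fA-F]{2})", text))


def decode(enc: bytes) -> bytes:
    """d[x] = (e[x+1] - e[x] - 0x17) mod 256."""
    return bytes((enc[x + 1] - enc[x] - KEY) % 256 for x in range(len(enc) - 1))


if __name__ == "__main__":
    # Hypothetical fragment of a logged buffer, not taken from a real trace.
    logged = r'"\x00\x58\xb0"'
    print(decode(strace_hex_to_bytes(logged)))
```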

Packet type 1

In IDA we can see that packet type 1 reads one byte and then 4 bytes from the beginning of the message and writes these values to internal variables in the program. The first thing that came to my mind was that these could be the 4 bytes of an IP address. A simple test proves this: send ABCDEFGHIJKM followed by a lot of 0's in a packet of type 1 and then send a type 0 packet. In strace we now see it send the packet to 10 different hosts, one of them being 66.67.68.69 (BCDE). When we review the rest of the sub in IDA we derive the following pseudo code (calling the first received byte decoymode): if decoymode == 2, add 10 addresses from the message; otherwise add 10 random IP addresses and put the first IP from the message at a random position within these 10. These decoy IPs are used to hide the real source IP of the attacker among other IP addresses.
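The decoy logic can be sketched in Python as follows. This is a restatement of the pseudo code, not a translation of the disassembly: the function name build_targets, the list-of-strings representation and the use of Python's random module are all assumptions made for the example.

```python
import ipaddress
import random


def build_targets(decoymode, addrs, rng=random):
    """Return the 10 addresses replies would be sent to.

    decoymode == 2: use the 10 addresses carried in the message.
    otherwise:      10 random addresses, with the first address from
                    the message placed at a random position among them.
    """
    if decoymode == 2:
        return list(addrs[:10])
    targets = [str(ipaddress.IPv4Address(rng.getrandbits(32))) for _ in range(10)]
    targets[rng.randrange(10)] = addrs[0]
    return targets


if __name__ == "__main__":
    # 'A' (0x41) as decoymode and BCDE as the address, as in the test above.
    print(build_targets(0x41, ["66.67.68.69"]))
```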

Packet type 2

Using the string references that IDA generates we can easily see the purpose of this type of packet: it executes a command, puts its stdout and stderr in a file called /tmp/.hj237349, then reads that file, encrypts its contents, sends them back to the IP set using packet type 1, and finally removes the temporary file. Please note that the packet contains a lot of garbage data at the end which, unfortunately for the attacker, also contains unencrypted data, so that his packets still look suspicious.

Packet type 3

What we see here is something we will see again in other subs. It first compares a value with 0: if the value is not 0 it jumps back into the receive loop; if it is 0 it forks and stores the pid of the child in that variable. (We will see later that this pid is used by packet type 7 to stop forked child processes.) After forking it calls a sub with parameters from the received command. The sub takes the target (the spoofed source of the attack) either as an IP address from the first 4 bytes of the message or as a hostname from the last part of the message; a boolean indicates which of the two to use. The sub does a domain lookup on part of the message; if the lookup fails it sleeps for 10 minutes and then tries again. If it succeeds it starts spitting out a lot of packets. Ethereal can capture these packets and decodes them as DNS lookups for .com, .net etc. (a request that takes a packet of about 60 bytes to send and gets back a lot more). The source port is also configurable from the command. These kinds of attacks are documented in this CERT Incident Note and reveal the-binary's real purpose: DDoS attacks.

Packet type 4

In IDA we see that this packet is handled in a similar way to packet type 3. A lot of bytes from the message are put on the stack together with a pointer to the rest of the message, and then a sub is called which does a hostname lookup and starts creating packets. The difference is the type of attack: here it sends out fragmented UDP packets with a spoofed source address (this address is in the received message).

Packet type 5

In strace we see that when a packet of type 5 is received it opens a listening socket on port 23281 and waits for someone to connect to it. When we use netcat to connect to this port, we see it read our input, send back some garbage, exit and listen again. Let's have a look in IDA at what it is expecting us to send. In IDA we again see one of the hacker's brilliant encryption schemes at work (actually this is a scheme invented by Caesar). It takes the input, replaces every letter by the next letter in the alphabet and then checks if the first 6 characters equal TfOjG (so the string to send is SeNiF). When we send SeNiF to port 23281 it binds a root shell to the port, which we can then use over our netcat connection. If we send something else it sends 0xFF 0xFB 0x01 0x00.
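Recovering the password from the constant in the binary is a one-liner. The Python sketch below models "the next letter in the alphabet" as adding 1 to each byte, which is an assumption that happens to agree with the TfOjG/SeNiF pair; the names shift and password_ok are made up, and the check compares only the five visible characters of the constant.

```python
EXPECTED = b"TfOjG"     # constant the shifted input is compared against


def shift(data: bytes, delta: int) -> bytes:
    """Add delta to every byte, wrapping at 256."""
    return bytes((b + delta) % 256 for b in data)


def password_ok(user_input: bytes) -> bool:
    """Shift the input up by one and compare it with the constant."""
    return shift(user_input[:len(EXPECTED)], 1) == EXPECTED


if __name__ == "__main__":
    password = shift(EXPECTED, -1)
    print(password)                     # shifting the constant back by one
    print(password_ok(password + b"\n"))
```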

Packet type 6

This packet does the same thing as packet type 2, only it does not send the output back. So it is used to execute a single command. This could be used for example to install a new version remotely.

Packet type 7

This mode reads the pid stored in the internal variable set by the DoS and root shell modes and then calls the kill command, to stop the attacks.

Packet type 8

This packet is very similar to packet type 3; the only difference is that it passes slightly different parameters to the DNS flood subroutine. It seems to make packet type 3 redundant, as it can do everything packet type 3 can do.

Packet type 9

Also similar to packet type 3, only it starts a different type of flood, this time a SYN flood, as described in this CERT advisory.

Packet type A

Similar to 9, only passes slightly different parameters on the stack.

Packet type B

Again the same as 3.

Why some attack modes have multiple commands

One might wonder why, for example, the DNS DoS attack can be started using different commands. The answer appears to be that these modes only differ in the speed of the attack (mode 3 uses a fixed speed and mode 8 allows the speed to be set). The speed is expressed in units of 100 requests per second, so speed 6 means 600 requests per second.

Conclusion

The-binary is a tool that an attacker can use to take over your system and execute a few different DoS attacks. While some of the code in the-binary is well written, a lot of it is not. For example, the implementation of the decryption routine, which reads the string backwards and uses a lot of printf calls, is inefficient to say the least. In other words, it is a tool to be aware of, but there is nothing really new here. Tools like stacheldraht and tfn2k have been around for a pretty long time and contain almost identical functionality.

General reverse engineering tricks

Now that the-binary is analyzed, let's discuss some of the tricks I used when creating this analysis.

Saving your work in IDA evaluation

The evaluation version of IDA doesn't allow saving, but it does allow you to dump its database to an IDC file (using the produce command in the file menu). Using this method you can simply save all your comments etc. in these IDC files and load them again later. This is equivalent to saving, only a bit slower (but if you have a very fast machine it is not that annoying).

Altering the program to allow easier debugging

Often when analyzing a program you run into inconvenient things. For example, when I wanted to analyze some of the code for packet type 3 in the-binary using gdb, I could not stay attached to it because it forked. There is a simple way around this: alter the binary so that it does not fork at that point. Simply replacing the call to fork with nops allows easier debugging.
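I made this change by hand in biew, but the same patch can be scripted. The Python sketch below overwrites one 5-byte near call (opcode 0xE8 plus a 32-bit displacement) with nops; the function name nop_out_call is made up, and the file offset of the call is something you have to look up yourself in IDA or biew.

```python
NOP = 0x90
CALL_REL32 = 0xE8       # opcode of a near relative call on x86
CALL_LEN = 5            # opcode byte plus 4-byte displacement


def nop_out_call(image: bytes, offset: int) -> bytes:
    """Return a copy of image with the call at offset replaced by nops."""
    if image[offset] != CALL_REL32:
        raise ValueError("no near call at offset %#x" % offset)
    patched = bytearray(image)
    patched[offset:offset + CALL_LEN] = bytes([NOP]) * CALL_LEN
    return bytes(patched)


if __name__ == "__main__":
    # Toy byte string standing in for the-binary: push ebp; call ...; ret
    image = b"\x55\xe8\x01\x02\x03\x04\xc3"
    print(nop_out_call(image, 1).hex())
```

Keep in mind that after the patch eax is no longer set by fork, so which branch the code after the call takes depends on whatever eax held before.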

Learning more

Becoming a reverse engineer takes some steps:
1. Become a master of assembly, for example by reading The Art of Assembly.
2. Start with the basics, by doing some crackme's and reverse me's on the net.
3. Read a lot, for example on fravia's old pages, or search for more to read using fravia's new pages.
Experience is also very important, and you have a zillion programs to practice on. A very good way to practice is to alter existing binaries to do something new. For example, you can read this tutorial about adding (useless) support for encrypted mp3's to the winamp binary. If you practice things like this, you will be an expert reverse engineer in no time!