1. Identify and explain the purposes of the binary

The main purpose of the binary is to act as a Denial of Service (DoS) zombie. It provides various methods of performing a DoS attack against a victim machine.
The binary can obfuscate the destination of its transmissions in one of two ways: either a set of up to 10 IP addresses that will receive traffic is supplied, or one address is supplied and the other 9 are randomly generated. When the addresses are randomly generated, the one specified address is placed at a random location in the list.

2. Identify and explain the different features of the binary. What are its capabilities?

[Jim and Rajeev]

This section describes the packet formats used by "the-binary" as well as what it will perform. There are twelve different packet format types. Each type is distinguished by the command number. This command number is a one byte code that is found in the second byte of the packet's payload (following the IP header and a 2 byte pseudo NVP header). Command numbers appearing in the packet are decremented by one and this new value is used in a case statement to parse commands. For example, command 1 is represented as case 0 within "the-binary".

The general format of the packets includes an IP Protocol type of "11" (indicating NVP) and the bytes "02" and "00" immediately following the IP header. In addition, the packet length must be greater than two hundred bytes, or it will be discarded by "the-binary".

The following is a list of command packet payload formats. Where appropriate, we will also give the format of the reply packet. It is important to point out that all packets are encoded with a simple substitution cipher prior to transmission. What is represented here is the format of packets prior to encoding and following decoding. As each command must be at least 201 bytes long, padding of any kind may be used at the end to gain the appropriate length. The padding is not denoted in the command formats. All IP address bytes are in network byte order. All strings must be zero-terminated.
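
The acceptance checks described above can be summarized in a short Python sketch. It illustrates our reading of the packet handling and is not code taken from the binary: the name 'case_index' and its parameters are our own, we assume that the length check applies to the decoded payload that follows the two-byte header, and we assume that command values outside 1 to 12 match no case.

```python
NVP_PROTOCOL = 11             # IP protocol number carried by the control packets
INBOUND_HEADER = b"\x02\x00"  # two bytes that follow the IP header on inbound packets
MIN_LENGTH = 201              # assumption: the check applies to the decoded payload


def case_index(ip_protocol, header, decoded):
    """Return the case number (command - 1), or None if the packet would be ignored."""
    if ip_protocol != NVP_PROTOCOL:
        return None
    if header != INBOUND_HEADER:
        return None
    if len(decoded) < MIN_LENGTH:
        return None
    command = decoded[1]           # byte 0 is not used
    if not 1 <= command <= 12:     # assumption: other values match no case
        return None
    return command - 1


# Command 1 (status) padded with zeroes to the minimum length.
padded_status = bytes([0, 1]) + bytes(199)
print(case_index(NVP_PROTOCOL, INBOUND_HEADER, padded_status))
```

The padded status command is accepted and maps to case 0, while a payload one byte shorter is rejected.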

Command 1 (Case 0)

Description

A packet containing command 1 received by the-binary is interpreted as a status command. The response packet includes information about whether the-binary has activated a child process and the last command the-binary executed.

Command Received

         Byte 0           Byte 1           Byte 2          Byte 3
   -------------------------------------------------------------------
   |    Not Used     |       1       |   Not Used    |    Not Used   |   
   |                 |               |               |               |
   -------------------------------------------------------------------

Reply Sent

         Byte 0           Byte 1           Byte 2          Byte 3
   -------------------------------------------------------------------
   |        0        |       1       |       7       |     0 or 1    |   
   |                 |               |               |               |
   -------------------------------------------------------------------
   | Last Command    |  Not Used     |  Not Used     |  Not Used     |   
   |                 |               |               |               |
   -------------------------------------------------------------------

Command 2 (Case 1)

Description

A packet containing command 2 sets a global variable ("globalvar" for our purposes, "globalvar007" in 'decompile_final.c'), sets the local IP address (derived from the destination address of the packet), and also fills a buffer with ten IP addresses. The IP addresses that go into the buffer depend on the value of globalvar as follows:

globalvar == 0 : The buffer is filled with random IP addresses with one random slot left out. The first slot is filled with the first IP address in the packet payload.

globalvar == 2 : The buffer is filled with IP addresses taken from the packet payload with one random slot left out which never gets filled. We suspect that this might be a programming error.

globalvar == other values : This case is the same as when globalvar is zero except that the random slot left out is filled with the corresponding IP address from the packet payload. 
 
This global variable is also used in the deliver_nvp() function (See decompile_final.c) while making calls to send_nvp. If globalvar is zero, send_nvp is invoked only once with only one destination IP address. If not, it is invoked 10 times with different destination IP addresses.

Command Received

       Byte 0            Byte 1          Byte 2           Byte 3
   -------------------------------------------------------------------
   |    Not Used   |       2        |    Flag        |  IP Dest 1    |   
   |               |                |                |    Byte 1     |
   -------------------------------------------------------------------
   |   IP Dest 1   |   IP Dest 1    |   IP Dest 1    | Continue with | ...
   |     Byte 2    |     Byte 3     |     Byte 4     | nine more IPs |
   -------------------------------------------------------------------
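
As an illustration of the layout above, the following Python sketch splits a decoded command 2 payload into the flag byte and the ten addresses. The function name 'parse_command_2' is hypothetical, and we assume the ten addresses are packed back to back starting at byte 3, as the table suggests.

```python
import socket


def parse_command_2(decoded):
    """Split a decoded command 2 payload into its flag byte and ten addresses."""
    if decoded[1] != 2:
        raise ValueError("not a command 2 payload")
    flag = decoded[2]
    addresses = []
    for slot in range(10):
        start = 3 + 4 * slot       # assumption: addresses are contiguous from byte 3
        addresses.append(socket.inet_ntoa(decoded[start:start + 4]))
    return flag, addresses
```

Each address is read as four bytes in network byte order, so bytes 3 to 6 holding 10, 0, 0, 1 yield "10.0.0.1".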

Command 3 (Case 2)

Description

A packet containing command 3 is used to send arbitrary commands to a shell opened by the-binary. The reply packet(s) contain the output of the command. If the command does not result in any output, a value of 4 is returned in byte position 1 of the payload.

Command Received

         Byte 0           Byte 1           Bytes 2 to end   
   -------------------------------------------------------------------
   |    Not Used     |       3       |     Shell command string      | ...
   |                 |               |                               |
   -------------------------------------------------------------------

Reply Sent

         Byte 0           Byte 1           Bytes 2 to end 
   -------------------------------------------------------------------
   |    Not Used     |       3       | Shell command output string   | ...
   |                 |               |                               |
   -------------------------------------------------------------------
   Or
         Byte 0           Byte 1           Bytes 2 to end 
   -------------------------------------------------------------------
   |    Not Used     |       4       |  Not Used (No shell output)   |   
   |                 |               |                               |
   -------------------------------------------------------------------

Command 4 (Case 3)

Description

A packet containing command 4 forks a new process that proceeds to launch a denial of service (DoS) attack against a target using DNS reflectors while the parent process continues with the main loop. A UDP packet with destination port 53 (DNS) is sent out every 5 minutes. The source address in the attack packets will either be the IP address following the command, or the IP address associated with the host name string. The flag at byte position 8 controls which is used. Note that the victim's IP address will be the source address of the spoofed packet. The destination address for the DNS packet is picked randomly out of a list of 8000 hard-coded IP addresses. The data sent are DNS zone transfer queries that are also contained in the binary.

Command Received

        Byte 0            Byte 1          Byte 2           Byte 3
   -------------------------------------------------------------------
   |    Not Used   |       4        |    IP Src      |    IP Src     |   
   |               |                |    Byte 1      |    Byte 2     |
   -------------------------------------------------------------------
   |   IP Src      |   IP Src       |    UDP Src     |   UDP Src     |    
   |   Byte 3      |   Byte 4       |   Port High    |   Port Low    |
   -------------------------------------------------------------------
   | 0 (Use IP)    | Source host name string                         | ...
   |>0 (Use string)|                                                 |
   -------------------------------------------------------------------

Command 5 (Case 4)

Description

A packet containing command 5 is used to launch a UDP or ICMP packet flooding denial of service (DoS) attack against a target host. The target host's IP address follows the command, or the command can contain a string holding the host name. The flag at byte position 12 (counting from 0, as in the table below) indicates which target identifier should be used. The ICMP packet will always be an echo request. Given that the source IP address may be spoofed to a victim's address, this also enables a ping reflector attack.

Command Received

         Byte 0           Byte 1           Byte 2          Byte 3
   -------------------------------------------------------------------
   |    Not Used     |       5       |  0 (ICMP)     | UDP Dest Port |
   |                 |               | >0 (UDP)      |               |
   -------------------------------------------------------------------
   |    IP Dest      |   IP Dest     |   IP Dest     |   IP Dest     |
   |     Byte 1      |    Byte 2     |    Byte  3    |    Byte 4     |
   -------------------------------------------------------------------
   |    IP Src       |   IP Src      |   IP Src      |   IP Src      |
   |     Byte 1      |    Byte 2     |    Byte 3     |    Byte 4     |
   -------------------------------------------------------------------
   |  0 (Use Dest)   |   Destination host name string                | ...
   | >0 (Use string) |                                               |
   -------------------------------------------------------------------

Command 6 (Case 5)

Description

A packet containing command 6 forks a new process allowing the parent process to continue to the main loop. The child process opens a TCP socket on port 23281 and listens for connections. For every incoming connection, a new process is forked which looks for a string "Tf0jG" in the first 6 bytes of the received data. If it is not present, it sends back a string consisting of Hex characters "0xff 0xfb 0x01 0x00", terminates the connection and exits. If the string is present, it does the following:

  1. It duplicates the standard I/O/Error to point to the socket descriptor
  2. It sets PATH variable to "/sbin:/bin:/usr/sbin:/usr/bin:/usr/local/bin/:."
  3. It sets TERM variable to "linux"
  4. It unsets the HISTFILE environment variable to prevent future shell commands from appearing in the history

So in effect, command 6 opens a password-protected shell and allows remote command execution by redirecting the standard input, output and error descriptors to the socket.

Command Received

         Byte 0            Byte 1          
   -----------------------------------
   |   Not Used     |       6        |      
   |                |                |  
   -----------------------------------

Command 7 (Case 6)

Description

A packet containing command 7 is used to send a shell command for execution by the-binary. This is similar to command 3. However, the difference between this command and command 3 is that the output of the command is not returned to the sender.

Command Received

         Byte 0           Byte 1           Bytes 2 to end   
   -------------------------------------------------------------------
   |    Not Used     |       7       |     Shell command string      | ...
   |                 |               |                               |
   -------------------------------------------------------------------

Command 8 (Case 7)

Description

A packet containing command 8 kills the process stored in the global variable "active_process", if it exists. The "active_process" variable will contain the process ID of any of the DoS zombies or the process ID of the listening TCP port for the root shell.

Command Received

        Byte 0          Byte 1
   -----------------------------------
   |    Not Used   |       8         |      
   |               |                 |  
   -----------------------------------

Command 9 (Case 8)

Description

A packet containing command 9 is used to launch a DNS Reflector denial of service (DoS) attack. As in command 5, the target of the attack can either be identified by an IP address or a string representing the host name defined in the packet. A flag indicates whether to use a random source address or the source address specified in the command packet. Another parameter denotes the number of packets to be sent in a batch before sleeping for 5 minutes.

Command Received

         Byte 0           Byte 1           Byte 2          Byte 3
   -------------------------------------------------------------------
   |    Not Used     |       9       |   IP Src      |   IP Src      |
   |                 |               |    Byte 1     |    Byte 2     |
   -------------------------------------------------------------------
   |    IP Src       |   IP Src      | # of packets  | UDP Src Port  |
   |     Byte 3      |    Byte 4     |               |   High Byte   |
   -------------------------------------------------------------------
   |  UDP Src Port   | 0 (Use Src)   | Source host name string       | ...
   |    Low Byte     |>0 (Use string)|                               |
   -------------------------------------------------------------------

Command 10 (Case 9)

Description

A packet containing command 10 forks a new process that proceeds to launch a TCP SYN Flood attack while the parent process continues with the main loop. A TCP SYN packet is sent out every 5 minutes.

The ninth byte of the received packet's payload is a flag which determines the IP source address of the TCP packets. If the flag is zero, a random IP source address is used. Otherwise, the IP address in the payload (tenth to thirteenth bytes) is used.

The fourteenth byte of the payload is another flag which determines the IP destination address of the TCP packets. If this flag is zero, the IP address in the payload (third to sixth bytes) is used. If not, a gethostbyname() is done on the string consisting of the payload bytes from the fifteenth byte onwards and the resulting IP address is used as the destination IP address.

Bytes seven and eight of the payload are used as the TCP destination port while a random port between 1 and 4000 is used as the TCP source port.

Command Received

        Byte 0         Byte 1             Byte 2           Byte 3
   -------------------------------------------------------------------
   |     Not Used  |     10         |    IP Dest      |    IP Dest   |   
   |               |                |     Byte 1      |     Byte 2   |
   -------------------------------------------------------------------
   |    IP Dest    |    IP Dest     |    TCP Dest     |   TCP Dst    |  
   |     Byte 3    |     Byte 4     |    Port High    |   Port Low   |
   -------------------------------------------------------------------
   | 0 (Use random)|    IP Src      |    IP Src       |     IP Src   |
   |>0 (Use Src)   |    Byte 1      |    Byte 2       |     Byte 3   |
   -------------------------------------------------------------------
   |    IP Src     |  0 (Use Dest)  |  Destination host name string  | ...
   |    Byte 4     | >0 (Use string)|                                |
   -------------------------------------------------------------------
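
The field layout above can be unpacked in a few lines of Python. This is a sketch for analysts reading decoded packets, not code from the binary: the names 'HEADER' and 'parse_command_10' are our own, and the string "random" simply stands in for the randomly chosen source address.

```python
import socket
import struct

# byte 0 unused, command, IP dest, TCP dest port, source flag, IP src, dest flag
HEADER = struct.Struct(">xB4sHB4sB")   # 14 bytes; the host name string starts at byte 14


def parse_command_10(decoded):
    """Return the destination, destination port and source chosen by the payload."""
    command, ip_dest, dest_port, use_src, ip_src, use_name = HEADER.unpack_from(decoded)
    if command != 10:
        raise ValueError("not a command 10 payload")
    # The host name is zero-terminated; anything after the terminator is padding.
    host_name = decoded[HEADER.size:].split(b"\x00", 1)[0].decode("ascii", "replace")
    return {
        "destination": host_name if use_name else socket.inet_ntoa(ip_dest),
        "destination_port": dest_port,
        "source": socket.inet_ntoa(ip_src) if use_src else "random",
    }
```

The ">" prefix selects network byte order without alignment padding, so the two port bytes are read high byte first, matching the table.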

Command 11 (Case 10)

Description

A packet containing command 11 is used to launch a TCP denial of service (DoS) attack. Instead of only sending one packet and then sleeping for 5 minutes, the number of packets can now be specified in byte 13.

Command Received

        Byte 0         Byte 1             Byte 2           Byte 3
   -------------------------------------------------------------------
   |     Not Used  |     11         |    IP Dest      |    IP Dest   |
   |               |                |     Byte 1      |     Byte 2   |
   -------------------------------------------------------------------
   |    IP Dest    |    IP Dest     |    TCP Dest     |   TCP Dst    |  
   |     Byte 3    |     Byte 4     |    Port High    |   Port Low   |
   -------------------------------------------------------------------
   | 0 (Use random)|    IP Src      |    IP Src       |     IP Src   |
   |>0 (Use Src)   |    Byte 1      |    Byte 2       |     Byte 3   |
   -------------------------------------------------------------------
   |    IP Src     |  # of packets  |  0 (Use Dest)   | Destination  | ...
   |    Byte 4     |    to send     | >0 (Use string) | host name str|
   -------------------------------------------------------------------

Command 12 (Case 11)

Description

A packet containing command 12 also forks a new process that proceeds to launch a DoS attack against a target using DNS reflectors while the parent process continues with the main loop. One difference between this and command 4 is the manner in which the IP source and destination addresses of the UDP packets are chosen. Another difference is that the number of packets to be sent before going to sleep is given by byte 10 of the payload, whereas command 4 always sends one packet. Byte positions here count from 0, as in the table below.

If payload bytes 6 to 9 are not all zeroes, they are taken as the IP source address of the UDP packets. Otherwise, a random IP address is used. Byte 13 of the payload is a flag that indicates what destination IP address is used for the UDP packets. If the flag is zero, bytes 2 to 5 are used for the destination IP address. If not, a gethostbyname() is done on the string starting at byte 14 of the payload and the resulting IP address is used as the destination IP address.

If bytes 11 and 12 of the payload are not zeroes, they are used as the UDP source port. If not, a random port between 0 and 29999 is used. The UDP destination port is always 53 (DNS).

Command Received

         Byte 0          Byte 1             Byte 2         Byte 3
   -------------------------------------------------------------------
   |     Not Used   |      12        |      IP Dest   |    IP Dest   |   
   |                |                |      Byte 1    |    Byte 2    |
   -------------------------------------------------------------------
   |     IP Dest    |    IP Dest     |      IP Src    |    IP Src    |  
   |     Byte 3     |    Byte 4      |      Byte 1    |    Byte 2    |
   -------------------------------------------------------------------
   |     IP Src     |    IP Src      | Number of Sends|    UDP Src   |
   |     Byte 3     |    Byte 4      |  Before Sleep  |   Port High  |     
   -------------------------------------------------------------------
   |     UDP Src    | 0 (Use Dest)   |  Destination host name string | ...
   |    Port Low    |>0 (Use string) |                               |
   -------------------------------------------------------------------
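
The selection rules for command 12 can be sketched in Python as follows. The names 'HEADER' and 'plan_command_12' are hypothetical, the host name is returned unresolved rather than passed to gethostbyname(), and Python's 'random' module stands in for whatever generator the binary uses.

```python
import random
import socket
import struct

# byte 0 unused, command, IP dest, IP src, send count, UDP src port, dest flag
HEADER = struct.Struct(">xB4s4sBHB")   # 14 bytes; the host name string starts at byte 14


def plan_command_12(decoded, rng=random):
    """Apply the command 12 defaults and return the resulting attack parameters."""
    command, ip_dest, ip_src, count, src_port, use_name = HEADER.unpack_from(decoded)
    if command != 12:
        raise ValueError("not a command 12 payload")
    if ip_src == bytes(4):                 # bytes 6 to 9 all zero: random source address
        ip_src = bytes(rng.randrange(256) for _ in range(4))
    if src_port == 0:                      # bytes 11 and 12 both zero: random source port
        src_port = rng.randrange(30000)    # 0 to 29999
    if use_name:
        destination = decoded[HEADER.size:].split(b"\x00", 1)[0].decode("ascii", "replace")
    else:
        destination = socket.inet_ntoa(ip_dest)
    return {
        "destination": destination,
        "destination_port": 53,            # always DNS
        "source": socket.inet_ntoa(ip_src),
        "source_port": src_port,
        "sends_before_sleep": count,
    }
```

Passing a seeded 'random.Random' instance as 'rng' makes the random choices repeatable when comparing against captured traffic.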

3. The binary uses a network data encoding process. Identify the encoding process and develop a decoder for it

[Ben]

The transmitted packets are encoded using a variant of cipher block chaining in which the encryption algorithm is a monoalphabetic substitution cipher with a shift of 23. In this variant, each encrypted block is added to the previous encrypted block to produce the output. The blocks in this implementation are single bytes. A monoalphabetic substitution cipher simply substitutes one letter in an alphabet with another. In this case, each value is shifted up by 23 with a wraparound to the start, so a value of 1 becomes 24 and 255 becomes 22.

In equation form, a given byte b_i is encrypted to b_i' by the following:

    b_i' = [ { b_i + 23 } + b_(i-1)' ] mod 256

Decryption takes place by performing the opposite transformation (simply solve the above for b_i).

    b_i = [ b_i' - b_(i-1)' - 23 ] mod 256
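
The two equations translate directly into Python. This sketch is separate from our Perl decoder and makes one assumption the equations leave open: the "previous encrypted byte" for the first byte is taken to be 0. The function names 'encode' and 'decode' are our own.

```python
SHIFT = 23


def encode(plain):
    """b_i' = (b_i + 23 + b_(i-1)') mod 256, with b_(-1)' assumed to be 0."""
    out = bytearray()
    previous = 0
    for byte in plain:
        previous = (byte + SHIFT + previous) % 256
        out.append(previous)
    return bytes(out)


def decode(cipher):
    """b_i = (b_i' - b_(i-1)' - 23) mod 256, with b_(-1)' assumed to be 0."""
    out = bytearray()
    previous = 0
    for byte in cipher:
        out.append((byte - previous - SHIFT) % 256)
        previous = byte
    return bytes(out)
```

For example, the two plain bytes 0 and 1 encode to 23 and 47: the first byte is 0 + 23, and the second is 1 + 23 plus the previous encoded byte 23. Decoding reverses this for any input, since each plain byte depends only on two adjacent encoded bytes.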

Our Perl script 'decode-packets.pl' is designed to operate on the output from the snort log of these packets. The specific command used was

    snort -r ./snort.log | decode-packets.pl

The output of the decoder for the test data provided can be viewed in the file 'decoder_output.txt'.

Once the payloads of the packets are decoded, they are processed based on their contents. The first two bytes indicate whether the traffic is inbound (02 00) or outbound (03 00). As we have neither the controller binary nor a substantial amount of collected data, only a small amount of interpretation is performed on outbound packets. However, we are able to perform substantial interpretation of inbound packets. In this case, we break down each inbound packet based on the command requested.

4. Identify one method of detecting this network traffic using a method that is not just specific to this situation, but other ones as well.

The most distinguishing characteristic for the binary is its NVP traffic. There are control packets that are received, where the first two bytes of the IP payload are 02 and 00, respectively. Outgoing NVP packets have 03 and 00 as their first two bytes, and in this case there can be up to 10 packets with identical payloads emanating from the host.

Furthermore, if there is a high volume of either UDP DNS traffic, TCP SYNs, ICMP echo requests, or general UDP traffic coming from the host, one can suspect that some sort of DoS zombie is running and the host should be investigated.
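
As a rough illustration of the first check, the following Python sketch classifies a raw IPv4 packet by its protocol number and the two bytes that follow the IP header. The function name 'classify_nvp' and its return labels are our own. Reporting protocol 11 traffic with any other leading bytes as "other-nvp" is our suggestion for catching variants of the tool, not behavior derived from the binary.

```python
NVP_PROTOCOL = 11


def classify_nvp(packet):
    """Classify a raw IPv4 packet as 'inbound', 'outbound', 'other-nvp' or None."""
    if len(packet) < 20 or packet[0] >> 4 != 4:
        return None                         # not a complete IPv4 header
    header_len = (packet[0] & 0x0F) * 4     # IHL field counts 32-bit words
    if packet[9] != NVP_PROTOCOL:           # byte 9 of the IP header is the protocol
        return None
    marker = packet[header_len:header_len + 2]
    if marker == b"\x02\x00":
        return "inbound"                    # control packet sent to the binary
    if marker == b"\x03\x00":
        return "outbound"                   # reply sent by the binary
    return "other-nvp"
```

A sensor could count the packets labeled "outbound" per source host and raise an alert when up to 10 packets with identical payloads appear in a short interval.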

5. Identify and explain any techniques in the binary that protect it from being analyzed or reverse engineered.

The binary was statically linked to the standard C library. This prevented us from determining what library calls were being made or interposing our own library to analyze data being passed to the library calls. This also made the reverse engineering effort much harder, as the code for the library was included in the decompiler output. Furthermore, gdb was unable to follow the execution of forked children.

Also, the binary was stripped, which removes debugging information and the symbol table from the executable. Again, this made the reverse engineering process hard, as function and variable names often contain information regarding their purposes.

While the above two techniques were applied to the executable, we do not think that this was done in an effort to protect it from being analyzed. The static linking of the library was probably done to make it more portable and the stripping to reduce the size of the binary.

6. Identify two tools in the past that have demonstrated similar functionality

We looked at Dave Dittrich's DDoS Attacks / tools page and found the following about the Tribe Flood Network (TFN) tool:
"TFN is made up of client and daemon programs, which implement a
distributed network denial of service tool capable of waging ICMP
flood, SYN flood, UDP flood, and Smurf style attacks, as well as
providing an "on demand" root shell bound to a TCP port."
A successor to TFN would be stacheldraht, which added some encryption, automated update capability and some extra features.

Tools like 'knight.c' also seem to have similar capabilities, but from a quick glance at the source code it was not clear whether the IRC portion plays a dominant role.

Bonus 1. What kind of information can be derived about the person who developed this tool? For example, what is their skill level?

It seems that the code used for generating the binary is not original. The fact that the DNS query packets (see file 'dns_data.c') for the two-letter domains are malformed points to a quick copy, paste, change approach. Also, the fact that we see code like 'htons(rand())' supports that assumption.

The code looks somewhat clumsy at times. In certain instances, entire buffers are copied or adjusted by a few bytes in a loop byte for byte where a simple pointer adjustment would have done the trick. In the code for command 2 (case 1), where the buffer for the reply IP addresses is generated, the slot that was randomly skipped never gets filled when supplying the addresses instead of randomly generating them. This could be a programmer's error. The decoding routine first uses a temporary buffer and some loops to copy data and right after that an sprintf statement is executed that performs a similar thing (however, if there are any null characters in the buffer, the decoded packet might be truncated).

With the exception of the 'socket' call, there is no error checking whatsoever, even though it would be important at least for the 'recv' routine.

The function 'get_ip_addr' takes an address string and returns the IP address in network byte order. Not only does this duplicate the 'inet_addr' function that is also used, but the only invocation of it is with a string that was generated from an IP address in network byte order!

There are no strings or messages in the code that would reveal the origin of the coder. The fact that some of the DNS queries contain European TLDs (.de, .es, .gr, .ie) could hint at some European origin.

Bonus 2. What advancements in tools with similar purposes can we expect in the future?