1. Identify and explain the purposes of the binary
The main purpose of the binary is to act as a Denial of Service (DoS) zombie.
It provides various methods of performing a DoS attack against a victim
machine.
- the DoS methods are:
- TCP SYN flooding
- UDP flooding
- ICMP ping flooding or reflecting (making SMURF attacks possible)
- UDP DNS reflecting
- the binary can provide a root shell on TCP port 23281 (password protected
with password "TfOjG")
- the binary can execute arbitrary commands with the option of sending
the output of the command to one or more specified hosts
- running DoS clients and the root shell listening process can be terminated
- it is also possible to query the binary for the last command it executed
and for whether a child process is currently running
The binary can obfuscate the destination of its transmissions in one of two
ways: the controller either supplies a set of up to 10 IP addresses that will
all receive traffic, or supplies one address and has the other 9 generated
randomly. In the latter case, the specified address is placed at a random
position among the 10.
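The random-decoy variant can be illustrated with a short Python sketch. It
follows only the summary above, not the decompiled code; the function name
`build_destination_list` and its parameters are hypothetical.

```python
import ipaddress
import random


def build_destination_list(real_address: str, slots: int = 10, rng=None) -> list:
    """Return `slots` addresses: one real destination hidden among random decoys."""
    rng = rng or random.Random()
    # Nine random 32-bit values rendered as dotted-quad decoy addresses.
    decoys = [str(ipaddress.IPv4Address(rng.getrandbits(32)))
              for _ in range(slots - 1)]
    # The real destination goes into a randomly chosen slot.
    decoys.insert(rng.randrange(slots), real_address)
    return decoys
```

An observer of the outgoing traffic then sees ten destinations and cannot
tell from the addresses alone which one is the controller.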
2. Identify and explain the different features of the binary. What are
its capabilities?
[Jim and Rajeev]
This section describes the packet formats used by "the-binary" as well
as what it will perform. There are twelve different packet format types.
Each type is distinguished by the command number. This command number is
a one byte code that is found in the second byte of the packet's payload
(following the IP header and a 2 byte pseudo NVP header). Command numbers
appearing in the packet are decremented by one and this new value is used
in a case statement to parse commands. For example, command 1 is represented
as case 0 within "the-binary".
The general format of the packets includes an IP Protocol type of "11" (indicating
NVP) and the bytes "02" and "00" immediately following the IP header. In addition,
the packet length must be greater than two hundred bytes, or it will be discarded
by "the-binary".
The following is a list of command packet payload formats. Where appropriate,
we will also give the format of the reply packet. It is important to point
out that all packets are encoded with a simple substitution cipher prior to
transmission. What is represented here is the format of packets prior to
encoding and following decoding. As each command must be at least 201 bytes
long, padding of any kind may be used at the end to gain the appropriate length.
The padding is not denoted in the command formats. All IP address bytes are
in network byte order. All strings must be zero-terminated.
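The acceptance checks and the command-to-case mapping described above can be
summarized in a short Python sketch. It is an illustration of the text, not
code taken from "the-binary": the names `case_index` and `pad_command` are
hypothetical, and applying the length limit to the decoded payload (rather
than to the whole IP packet) is an assumption.

```python
NVP_PROTOCOL = 11               # IP protocol number stated in the text
INBOUND_HEADER = b"\x02\x00"    # two bytes that follow the IP header
MIN_LENGTH = 201                # "greater than two hundred bytes"


def pad_command(payload: bytes, length: int = MIN_LENGTH) -> bytes:
    """Pad a command payload with zero bytes up to the minimum length."""
    return payload.ljust(length, b"\x00")


def case_index(ip_protocol: int, header: bytes, payload: bytes):
    """Return the case number for a decoded payload, or None if discarded."""
    if ip_protocol != NVP_PROTOCOL or header != INBOUND_HEADER:
        return None
    if len(payload) < MIN_LENGTH:   # assumption: limit applies to the payload
        return None
    command = payload[1]            # command number is the second payload byte
    if not 1 <= command <= 12:
        return None
    return command - 1              # command 1 is handled as case 0
```

For example, `case_index(11, b"\x02\x00", pad_command(bytes([0, 1])))`
evaluates to 0, the case that handles the status command.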
Command 1 (Case 0)
Description
A packet containing command 1 received by the-binary is interpreted as a
status command. The response packet includes information about whether the-binary
has activated a child process and the last command the-binary executed.
Command Received
Byte 0 Byte 1 Byte 2 Byte 3
-------------------------------------------------------------------
| Not Used | 1 | Not Used | Not Used |
| | | | |
-------------------------------------------------------------------
Reply Sent
Byte 0 Byte 1 Byte 2 Byte 3
-------------------------------------------------------------------
| 0 | 1 | 7 | 0 or 1 |
| | | | |
-------------------------------------------------------------------
| Last Command | Not Used | Not Used | Not Used |
| | | | |
-------------------------------------------------------------------
Command 2 (Case 1)
Description
A packet containing command 2 sets a global variable ("globalvar" for our
purposes, "globalvar007" in 'decompile_final.c'), sets the local IP address
(derived from the destination address of the packet), and also fills a buffer
with ten IP addresses. The IP addresses that go into the buffer depend on
the value of globalvar as follows:
globalvar == 0 : The buffer is filled with random IP addresses with one
random slot left out. The first slot is filled with the first IP address
in the packet payload.
globalvar == 2 : The buffer is filled with IP addresses taken from the packet
payload with one random slot left out which never gets filled. We suspect
that this might be a programming error.
globalvar == other values : This case is the same as when globalvar is zero
except that the random slot left out is filled with the corresponding IP address
from the packet payload.
This global variable is also used in the deliver_nvp() function (See decompile_final.c)
while making calls to send_nvp. If globalvar is zero, send_nvp is invoked
only once with only one destination IP address. If not, it is invoked 10 times
with different destination IP addresses.
Command Received
Byte 0 Byte 1 Byte 2 Byte 3
-------------------------------------------------------------------
| Not Used | 2 | Flag | IP Dest 1 |
| | | | Byte 1 |
-------------------------------------------------------------------
| IP Dest 1 | IP Dest 1 | IP Dest 1 | Continue with | ...
| Byte 2 | Byte 3 | Byte 4 | nine more IPs |
-------------------------------------------------------------------
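As an illustration of this layout, the following Python sketch extracts the
flag and the ten addresses from a decoded command 2 payload. The function
name `parse_command_2` is hypothetical, and the sketch does not reproduce how
the-binary later fills its own buffer.

```python
import ipaddress


def parse_command_2(payload: bytes):
    """Split a decoded command 2 payload into its flag and ten IP addresses."""
    if len(payload) < 43 or payload[1] != 2:
        raise ValueError("not a command 2 payload")
    flag = payload[2]
    # Ten consecutive 4-byte addresses in network byte order, from byte 3 on.
    addresses = [str(ipaddress.IPv4Address(payload[3 + 4 * i:7 + 4 * i]))
                 for i in range(10)]
    return flag, addresses
```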
Command 3 (Case 2)
Description
A packet containing command 3 is used to send arbitrary commands to a shell
opened by the-binary. The reply packet(s) contain the output of the command.
If the command does not result in any output, a value of 4 is returned in
byte position 1 of the payload.
Command Received
Byte 0 Byte 1 Bytes 2 to end
-------------------------------------------------------------------
| Not Used | 3 | Shell command string | ...
| | | |
-------------------------------------------------------------------
Reply Sent
Byte 0 Byte 1 Bytes 2 to end
-------------------------------------------------------------------
| Not Used | 3 | Shell command output string | ...
| | | |
-------------------------------------------------------------------
Or
Byte 0 Byte 1 Bytes 2 to end
-------------------------------------------------------------------
| Not Used | 4 | Not Used (No shell output) |
| | | |
-------------------------------------------------------------------
Command 4 (Case 3)
Description
A packet containing command 4 forks a new process that proceeds to launch
a denial of service (DoS) attack against a target using DNS reflectors while
the parent process continues with the main loop. A UDP packet with dst port
53 (DNS) is sent out every 5 minutes. The source address in the attack packets
will either be the IP address following the command, or the IP address associated
with the host name string. The flag at byte position 8 controls which is used.
Note that the victim's IP address will be the source address of the spoofed
packet. The destination address for the DNS packet is picked randomly out
of a list of 8000 hard-coded IP addresses. The data sent are DNS zone transfer
queries that are also contained in the binary.
Command Received
Byte 0 Byte 1 Byte 2 Byte 3
-------------------------------------------------------------------
| Not Used | 4 | IP Src | IP Src |
| | | Byte 1 | Byte 2 |
-------------------------------------------------------------------
| IP Src | IP Src | UDP Src | UDP Src |
| Byte 3 | Byte 4 | Port High | Port Low |
-------------------------------------------------------------------
| 0 (Use IP) | Source host name string | ...
|>0 (Use string)| |
-------------------------------------------------------------------
Command 5 (Case 4)
Description
A packet containing command 5 is used to launch a UDP or ICMP packet flooding
denial of service (DoS) attack against a target host. The target is identified
either by the destination IP address carried in the command or by a host name
string contained in the command. The flag at byte position 12 indicates which
target identifier should be used. The ICMP packet will always be an echo request.
Given that the source IP address may be spoofed to a victim's address, this
also enables a ping reflector attack.
Command Received
Byte 0 Byte 1 Byte 2 Byte 3
-------------------------------------------------------------------
| Not Used | 5 | 0 (ICMP) | UDP Dest Port |
| | | >0 (UDP) | |
-------------------------------------------------------------------
| IP Dest | IP Dest | IP Dest | IP Dest |
| Byte 1 | Byte 2 | Byte 3 | Byte 4 |
-------------------------------------------------------------------
| IP Src | IP Src | IP Src | IP Src |
| Byte 1 | Byte 2 | Byte 3 | Byte 4 |
-------------------------------------------------------------------
| 0 (Use Dest) | Destination host name string | ...
| >0 (Use string) | |
-------------------------------------------------------------------
Command 6 (Case 5)
Description
A packet containing command 6 forks a new process allowing the parent process
to continue to the main loop. The child process opens a TCP socket on port
23281 and listens for connections. For every incoming connection, a new process
is forked which looks for a string "TfOjG" in the first 6 bytes of the received
data. If it is not present, it sends back a string consisting of Hex characters
"0xff 0xfb 0x01 0x00", terminates the connection and exits. If the string
is present, it does the following:
- It duplicates the socket descriptor onto standard input, output and error
- It sets PATH variable to "/sbin:/bin:/usr/sbin:/usr/bin:/usr/local/bin/:."
- It sets TERM variable to "linux"
- It unsets the HISTFILE environment variable to prevent future shell
commands from appearing in the history
So in effect, command 6 opens a password-protected shell and allows remote
command execution by redirecting the standard input, output and error
descriptors to the socket.
Command Received
Byte 0 Byte 1
-----------------------------------
| Not Used | 6 |
| | |
-----------------------------------
Command 7 (Case 6)
Description
A packet containing command 7 is used to send a shell command for execution
by the-binary. This is similar to command 3. However, the difference between
this command and command 3 is that the output of the command is not returned
to the sender.
Command Received
Byte 0 Byte 1 Bytes 2 to end
-------------------------------------------------------------------
| Not Used | 7 | Shell command string | ...
| | | |
-------------------------------------------------------------------
Command 8 (Case 7)
Description
A packet containing command 8 kills the process stored in the global variable
"active_process", if it exists. The "active_process" variable will contain
the process ID of any of the DoS zombies or the process ID of the listening
TCP port for the root shell.
Command Received
Byte 0 Byte 1
-----------------------------------
| Not Used | 8 |
| | |
-----------------------------------
Command 9 (Case 8)
Description
A packet containing command 9 is used to launch a DNS Reflector denial of
service (DoS) attack. As in command 5, the target of the attack can either
be identified by an IP address or a string representing the host name defined
in the packet. A flag indicates whether to use a random source address or
the source address specified in the command packet. Another parameter denotes
the number of packets to be sent in a batch before sleeping for 5 minutes.
Command Received
Byte 0 Byte 1 Byte 2 Byte 3
-------------------------------------------------------------------
| Not Used | 9 | IP Src | IP Src |
| | | Byte 1 | Byte 2 |
-------------------------------------------------------------------
| IP Src | IP Src | # of packets | UDP Src Port |
| Byte 3 | Byte 4 | | High Byte |
-------------------------------------------------------------------
| UDP Src Port | 0 (Use Src) | Source host name string | ...
| Low Byte |>0 (Use string)| |
-------------------------------------------------------------------
Command 10 (Case 9)
Description
A packet containing command 10 forks a new process that proceeds to launch
a TCP SYN Flood attack while the parent process continues with the main loop.
A TCP SYN packet is sent out every 5 minutes.
The ninth byte of the received packet's payload is a flag which determines
the IP source address of the TCP packets. If the flag is zero, a random IP
source address is used. Otherwise, the IP address in the payload (tenth to
thirteenth bytes) is used.
The fourteenth byte of the payload is another flag which determines the
IP destination address of the TCP packets. If this flag is zero, the IP address
in the payload (third to sixth bytes) is used. If not, a gethostbyname() is
done on the string consisting of the payload bytes from the fifteenth byte
onwards and the resulting IP address is used as the destination IP address.
Bytes seven and eight of the payload are used as the TCP destination port
while a random port between 1 and 4000 is used as the TCP source port.
Command Received
Byte 0 Byte 1 Byte 2 Byte 3
-------------------------------------------------------------------
| Not Used | 10 | IP Dest | IP Dest |
| | | Byte 1 | Byte 2 |
-------------------------------------------------------------------
| IP Dest | IP Dest | TCP Dest | TCP Dst |
| Byte 3 | Byte 4 | Port High | Port Low |
-------------------------------------------------------------------
| 0 (Use random)| IP Src | IP Src | IP Src |
|>0 (Use Src) | Byte 1 | Byte 2 | Byte 3 |
-------------------------------------------------------------------
| IP Src | 0 (Use Dest) | Destination host name string | ...
| Byte 4 |>0 (Use string)| |
-------------------------------------------------------------------
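The layout above can be read back with a few lines of Python. This is a
sketch for analysts working with decoded payloads, not code from the-binary;
the function name `parse_command_10` and the dictionary keys are hypothetical.

```python
import ipaddress
import struct


def parse_command_10(payload: bytes) -> dict:
    """Extract the SYN flood parameters from a decoded command 10 payload."""
    if len(payload) < 15 or payload[1] != 10:
        raise ValueError("not a command 10 payload")
    # Bytes 6 and 7 hold the TCP destination port, high byte first.
    (dest_port,) = struct.unpack("!H", payload[6:8])
    # The host name is a zero-terminated string starting at byte 14.
    host_name = payload[14:].split(b"\x00", 1)[0].decode("ascii", "replace")
    return {
        "dest_ip": str(ipaddress.IPv4Address(payload[2:6])),
        "dest_port": dest_port,
        "random_source": payload[8] == 0,     # flag 0: random source address
        "source_ip": str(ipaddress.IPv4Address(payload[9:13])),
        "resolve_name": payload[13] != 0,     # flag >0: use the host name
        "host_name": host_name,
    }
```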
Command 11 (Case 10)
Description
A packet containing command 11 is used to launch a TCP denial of service
(DoS) attack similar to that of command 10. Instead of sending only one packet
and then sleeping for 5 minutes, the number of packets to send can be specified
in byte 13.
Command Received
Byte 0 Byte 1 Byte 2 Byte 3
-------------------------------------------------------------------
| Not Used | 11 | IP Dest | IP Dest |
| | | Byte 1 | Byte 2 |
-------------------------------------------------------------------
| IP Dest | IP Dest | TCP Dest | TCP Dst |
| Byte 3 | Byte 4 | Port High | Port Low |
-------------------------------------------------------------------
| 0 (Use random)| IP Src | IP Src | IP Src |
|>0 (Use Src) | Byte 1 | Byte 2 | Byte 3 |
-------------------------------------------------------------------
| IP Src | # of packets | 0 (Use Dest) | Destination | ...
| Byte 4 | to send |>0 (Use string) | host name str|
-------------------------------------------------------------------
Command 12 (Case 11)
Description
A packet containing command 12 also forks a new process that proceeds to
launch a DoS attack against a target using DNS reflectors while the parent
process continues with the main loop. One difference between this and command
4 is the manner in which the IP source and destination addresses of the UDP
packets are chosen. Another difference is that in this case, the number of
packets to be sent before going to sleep is given by byte 10 of the payload,
whereas command 4 always sends 1. Byte positions here count from 0, as in the
layout below.
If payload bytes 6 to 9 are not zeroes, they are taken as the IP source
address of the UDP packets. If not, a random IP address is used. Byte 13
of the payload is a flag that indicates what destination IP address is
used for the UDP packets. If the flag is zero, bytes 2 to 5 are used for the
destination IP address. If not, a gethostbyname() is done on the string
starting at byte 14 of the payload and the resulting IP
address is used as the destination IP address.
If bytes 11 and 12 of the payload are not zeroes, they are used as the UDP
source port. If not, a random port between 0 and 29999 is used. The UDP destination
port is always 53 (DNS).
Command Received
Byte 0 Byte 1 Byte 2 Byte 3
-------------------------------------------------------------------
| Not Used | 12 | IP Dest | IP Dest |
| | | Byte 1 | Byte 2 |
-------------------------------------------------------------------
| IP Dest | IP Dest | IP Src | IP Src |
| Byte 3 | Byte 4 | Byte 1 | Byte 2 |
-------------------------------------------------------------------
| IP Src | IP Src | Number of Sends| UDP Src |
| Byte 3 | Byte 4 | Before Sleep | Port High |
-------------------------------------------------------------------
| UDP Src | 0 (Use Dest) | Destination host name string | ...
| Port Low |>0 (Use string) | |
-------------------------------------------------------------------
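The "zero means random" rules for command 12 can be sketched in Python as
follows. The function name `command_12_parameters` is hypothetical, and two
points are assumptions rather than facts from the text: "not zeroes" is read
as "not all zero" for both the source address and the source port.

```python
import ipaddress
import random
import struct


def command_12_parameters(payload: bytes, rng=None) -> dict:
    """Resolve the effective attack parameters of a decoded command 12 payload."""
    if len(payload) < 15 or payload[1] != 12:
        raise ValueError("not a command 12 payload")
    rng = rng or random.Random()
    source = payload[6:10]
    if source == b"\x00\x00\x00\x00":          # assumption: all-zero means random
        source = rng.getrandbits(32).to_bytes(4, "big")
    (source_port,) = struct.unpack("!H", payload[11:13])
    if source_port == 0:                        # assumption: zero means random
        source_port = rng.randrange(30000)      # random port between 0 and 29999
    host_name = payload[14:].split(b"\x00", 1)[0].decode("ascii", "replace")
    return {
        "dest_ip": str(ipaddress.IPv4Address(payload[2:6])),
        "source_ip": str(ipaddress.IPv4Address(source)),
        "sends_before_sleep": payload[10],
        "source_port": source_port,
        "dest_port": 53,                        # always DNS
        "resolve_name": payload[13] != 0,
        "host_name": host_name,
    }
```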
3. The binary uses a network data encoding process. Identify the encoding
process and develop a decoder for it
[Ben]
The transmitted packets are encoded using a variant of cipher block chaining
in which the encryption algorithm is a monoalphabetic substitution cipher with
a shift of 23. In this form of chaining, each encrypted block is added to the
previous encrypted block to produce the output. The block size in this
implementation is a single byte. A monoalphabetic substitution cipher simply
substitutes one letter in an alphabet with another. In this case, each value
is shifted up by 23 with a wraparound to the start, so a value of 1 becomes
24 and 255 becomes 22.
In equation form, a given byte b_i is encrypted to b_i' by the following:
b_i' = [ { b_i + 23 } + b_(i-1)' ] mod 256
Decryption takes place by performing the opposite transformation (simply
solve the above for b_i).
b_i = [ b_i' - b_(i-1)' - 23 ] mod 256
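The two equations translate directly into a few lines of code. The following
Python sketch is an illustration only and is not the 'decode-packets.pl'
script; the function names are hypothetical, and taking 0 as the "previous
encoded byte" for the first byte is an assumption, since the equations do not
define b_0'.

```python
SHIFT = 23


def encode(plain: bytes, previous: int = 0) -> bytes:
    """Apply b_i' = (b_i + 23 + b_(i-1)') mod 256 to every byte."""
    out = bytearray()
    for b in plain:
        previous = (b + SHIFT + previous) % 256
        out.append(previous)
    return bytes(out)


def decode(encoded: bytes, previous: int = 0) -> bytes:
    """Apply b_i = (b_i' - b_(i-1)' - 23) mod 256 to every byte."""
    out = bytearray()
    for c in encoded:
        out.append((c - previous - SHIFT) % 256)
        previous = c          # chaining uses the encoded byte, not the plain one
    return bytes(out)
```

With the assumed starting value, `encode(bytes([1]))` yields the single byte
24 and `decode(encode(data))` returns `data` unchanged for any input.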
Our perl script 'decode-packets.pl' is designed to operate on the output
from the snort log of these packets. The specific command used was
snort -r ./snort.log | decode-packets.pl
The output of the decoder for the test data provided can be viewed in the
file 'decoder_output.txt'.
Once the payload of a packet is decoded, it is processed by the script based
on its contents. The first two bytes indicate whether the traffic
is inbound (02 00) or outbound (03 00). As we have neither the controller
binary nor a substantial amount of collected data, only a small amount of
interpretation is performed on outbound packets. However, we are able to perform
substantial interpretation of inbound packets. In this case, we break
down each inbound packet based on the command requested.
4. Identify one method of detecting this network traffic using a method
that is not just specific to this situation, but other ones as well.
The most distinguishing characteristic for the binary is its NVP traffic.
There are control packets that are received, where the first two bytes of
the IP payload are 02 and 00, respectively. Outgoing NVP packets have 03
and 00 as their first two bytes, and in this case there can be up to 10 packets
with identical payloads emanating from the host.
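A check of this kind can be sketched in Python for raw IPv4 packets, for
example ones read from a capture file by some other tool. The function name
`classify_nvp` and its return labels are hypothetical; the sketch only tests
the protocol number and the two marker bytes named above.

```python
NVP_PROTOCOL = 11


def classify_nvp(ip_packet: bytes):
    """Label a raw IPv4 packet as NVP control, reply, other NVP, or None."""
    if len(ip_packet) < 20 or ip_packet[0] >> 4 != 4:
        return None                           # not an IPv4 packet
    header_length = (ip_packet[0] & 0x0F) * 4 # IHL field counts 32-bit words
    if header_length < 20 or len(ip_packet) < header_length + 2:
        return None
    if ip_packet[9] != NVP_PROTOCOL:          # protocol field of the IP header
        return None
    marker = ip_packet[header_length:header_length + 2]
    if marker == b"\x02\x00":
        return "control"                      # packet sent to the binary
    if marker == b"\x03\x00":
        return "reply"                        # packet sent by the binary
    return "other-nvp"
```

Because NVP traffic is rare on most networks, even the "other-nvp" result may
be worth logging.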
Furthermore, if there is a high volume of either UDP DNS traffic, TCP SYNs,
ICMP echo requests, or general UDP traffic coming from the host, one can suspect
that some sort of DoS zombie is running and the host should be investigated.
5. Identify and explain any techniques in the binary that protect it
from being analyzed or reverse engineered.
The binary was statically linked to the standard C library. This prevented
us from determining what library calls were being made or interposing our
own library to analyze data being passed to the library calls. This also
made the reverse engineering effort much harder, as the code for the library
was included in the decompiler output. Furthermore, gdb was unable to follow
the execution of forked children.
Also, the binary was stripped, which removes debugging information and
the symbol table from the executable. Again, this made the reverse engineering
process harder, as function and variable names often contain information regarding
their purposes.
While the above two techniques were applied to the executable, we do not
think that this was done in an effort to protect it from being analyzed. The
static linking of the library was probably done to make it more portable and
the stripping to reduce the size of the binary.
6. Identify two tools in the past that have demonstrated similar functionality
We looked at Dave Dittrich's DDoS Attacks / tools page and found the
following about the Tribe Flood Network (TFN) tool:
"TFN is made up of client and daemon programs, which implement a
distributed network denial of service tool capable of waging ICMP
flood, SYN flood, UDP flood, and Smurf style attacks, as well as
providing an "on demand" root shell bound to a TCP port."
A successor to TFN would be stacheldraht, which added some encryption,
automated update capability and some extra features.
Tools like 'knight.c' also seem to have similar capabilities, but from just
glancing at the source code it was not clear whether the IRC portion plays
a dominant role there.
Bonus 1. What kind of information can be derived about the person who
developed this tool? For example, what is their skill level?
It seems that the code used for generating the binary is not original. The
fact that the DNS query packets (see file 'dns_data.c') for the two-letter
domains are malformed points to a quick copy, paste, change approach. Also,
the fact that we see code like 'htons(rand())' supports that assumption.
The code looks somewhat clumsy at times. In certain instances, entire buffers
are copied or adjusted by a few bytes in a loop byte for byte where a simple
pointer adjustment would have done the trick. In the code for command 2 (case
1), where the buffer for the reply IP addresses is generated, the slot that
was randomly skipped never gets filled when supplying the addresses instead
of randomly generating them. This could be a programmer's error. The decoding
routine first uses a temporary buffer and some loops to copy data, and right
after that executes an sprintf statement that performs a similar copy
(although, if the buffer contains any null characters, the sprintf may
truncate the decoded packet).
With the exception of the 'socket' call, there is no error checking whatsoever,
even though it would matter at least for the 'recv' routine as well.
The function 'get_ip_addr' takes an address string and returns the IP address
in network byte order. Not only does this duplicate the 'inet_addr' function
that is also used, but the only invocation of it is with a string that was
generated from an IP address in network byte order!
There are no strings or messages in the code that would reveal the origin
of the coder. The fact that some of the DNS queries contain European TLDs
(.de, .es, .gr, .ie) could hint at some European origin.
Bonus 2. What advancements in tools with similar purposes can we expect in
the future?
- better encryption
- fewer mistakes in coding
- more rootkit-like behavior in terms of hiding itself on the system
- more multi-purpose capabilities
- better ease of use for the person who controls it