Answers: The HoneyNet Project Reverse Challenge 2002


sean.burford@adelaide.edu.au

29/May/2002

1. Identify and explain the purpose of the binary.

The-binary is a back door program designed to be installed on compromised Linux machines. It acts as a network server, providing access for the cracker to the compromised system.

Its purpose is to provide the cracker with anonymity, the ability to execute shell commands as root on the compromised system, and the ability to launch a variety of Denial of Service (DOS) attacks against third parties.

Presumably the client would be capable of controlling many copies of the-binary on different compromised hosts in a coordinated fashion, providing Distributed Denial of Service (DDOS) capabilities.

2. Identify and explain the different features of the binary. What are its capabilities?

The-binary provides remote shell access to a compromised machine, and also acts as a launching pad for various Denial of Service attacks.

The 12 commands supported by the-binary are:
1 status
2 configure
3 execute command under csh and send back result
4 DNS bounce flood 1
5 UDP/ICMP fragment flood
6 open csh bindshell
7 execute command under csh with no response
8 stop flood or shell command
9 DNS bounce flood 2
10 SYN flood 1
11 SYN flood 2
12 DNS bounce flood 3

The-binary uses IP protocol 11 for communication. This evades simple searches for evidence of a compromise, such as TCP and UDP network scans by nmap, and basic interpretation of netstat output. A raw socket listening for protocol 11 is shown in the output of "netstat -an" as:

raw        0      0 0.0.0.0:11              0.0.0.0:*               7

An nmap IP protocol scan (-sO) reports it as:

11         open        nvp-ii

The-binary uses encoding on network control and response packets. This slows down analysis of the protocol used by the-binary through packet intercepts, as similar command arguments can result in largely different packets and there is no ASCII cleartext of shell commands or IP addresses present in the control or response packets.

The ability to send responses to a list of machines, coupled with the ability to accept commands sent from a spoofed source address, means that communications to and from the-binary can be performed in a way that makes it difficult to trace them back to the cracker. As the-binary accepts commands sent to the broadcast address, can execute commands without sending replies, and can spoof the source address of all flood packets, the compromised machine does not have to be identified in any IP network packets. These facilities can be used by a skilled cracker to hide his identity, and that of the compromised machine, from network sniffers.

3. The binary uses a network data encoding process. Identify the encoding process and develop a decoder for it.

Each network packet sent between the-binary and the client that controls it consists of an IP header, a two byte command and a variable length argument area. The argument area contains parameters for the command or response, and is encoded using the cipher detailed below.
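The layout described above can be sketched in C as follows. This is only an illustration, assuming IPv4 and a raw buffer that begins at the IP header. The names split_packet and struct ctl_packet are hypothetical and do not come from the-binary. The two command bytes are copied out without interpreting their byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BACKDOOR_IP_PROTO 11  /* IP protocol number used by the-binary */

/* Hypothetical view of one control packet; field names are illustrative. */
struct ctl_packet {
        uint8_t        command[2];  /* two byte command, byte order not interpreted */
        const uint8_t *args;        /* start of the (still encoded) argument area */
        size_t         args_len;    /* length of the argument area in bytes */
};

/*
 * Split a raw IPv4 packet into command and argument area.
 * Returns 0 on success, -1 if the buffer is not a plausible protocol 11 packet.
 */
int split_packet(const uint8_t *pkt, size_t len, struct ctl_packet *out)
{
        size_t ihl;

        assert(pkt != NULL && out != NULL);

        if (len < 20 || (pkt[0] >> 4) != 4)
                return -1;                      /* too short, or not IPv4 */

        ihl = (size_t)(pkt[0] & 0x0F) * 4;      /* IP header length in bytes */
        if (ihl < 20 || len < ihl + 2)
                return -1;                      /* bad header length, or no room for a command */

        if (pkt[9] != BACKDOOR_IP_PROTO)
                return -1;                      /* byte 9 of the IPv4 header is the protocol */

        out->command[0] = pkt[ihl];
        out->command[1] = pkt[ihl + 1];
        out->args       = pkt + ihl + 2;
        out->args_len   = len - ihl - 2;
        return 0;
}
```

The argument area returned here is still encoded and has to be passed through the decoder before it can be read.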

Encoding of the data area is done one byte at a time, starting at the lowest byte of the data area (closest to the header). 0x17 is added to the first cleartext byte to create the first byte of encoded data. Each subsequent encoded byte is the sum of the cleartext byte to be encoded, the previous encoded byte and 0x17. If any encoded value exceeds 255, it is bitwise ANDed with 255 to produce a value between 0 and 255.
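The encoding step can be expressed as a short C sketch. This is my own restatement of the description above, not code lifted from the-binary, and the function name encode and its in-place interface are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Encode a buffer in place, left to right:
 *   enc[0] = ( clear[0] + 23 ) & 0xFF
 *   enc[x] = ( clear[x] + enc[x-1] + 23 ) & 0xFF
 */
void encode(unsigned char *data, size_t length)
{
        unsigned char prev = 0;   /* previous encoded byte; 0 before the first byte */
        size_t pos;

        assert(data != NULL || length == 0);

        for (pos = 0; pos < length; pos++) {
                data[pos] = (unsigned char)((data[pos] + prev + 0x17) & 0xFF);
                prev = data[pos];
        }
}
```

As a worked example, the ASCII string "ls" (0x6C 0x73) encodes to 0x83 0x0D: 0x6C + 0x17 = 0x83, and 0x73 + 0x83 + 0x17 = 0x10D, which is 0x0D after the AND with 255.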

The data area can be decoded by taking each encoded byte, starting at the end of the data area and working back towards the header, and subtracting from it the sum of the preceding encoded byte and 0x17. The first byte of the data area is decoded last, by subtracting 0x17 from it. If any result is below 0, it is bitwise ANDed with 255 (i.e. it wraps modulo 256) to produce a value between 0 and 255.

The decoding process can be expressed as the C function:

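/* dst[x] = ( src[x] - src[x-1] - 23 ) & 0xFF, right to left */
/* length is the number of bytes in data; decoding is done in place */
void decode(unsigned char *data, unsigned int length)
{
        unsigned int pos;

        if (length == 0)
                return;

        /* start at the last valid index, length-1, not at length */
        for(pos=length-1;pos>0;pos--)
                data[pos] -= data[pos-1] + 0x17;

        data[0] -= 0x17;
}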

4. Identify one method of detecting this network traffic using a method that is not just specific to the situation, but to other ones as well.

As the-binary is controlled by packets using an unusual IP protocol, I decided that this was a good way to detect the network traffic related to this program and others like it. Since Snort (an Open Source network intrusion detection system) has an easy to use rule system for configuring its alerts, I decided to prototype this with Snort.

Snort is configured using rule files. The rule files list rules that can specify which packets and data flows to match, and what to tell the Snort user about those packets.

At first glance, I could not find any way to specify how to monitor for particular IP protocols in the Snort documentation. A bit of research uncovered the ip_proto keyword. I created and tested the rule:

alert ip $EXTERNAL_NET any <> $HOME_NET 0 (msg:"Traffic on unusual IP protocol"; ip_proto: !6; ip_proto: !17; ip_proto: !1; classtype:misc-activity; rev:1;)

This rule detected the traffic. I tried adding more protocols to the exclusion list, but Snort started detecting all traffic so I left it at that. The greater than (>) operator does not seem to work with ip_proto, and it does not seem to be able to handle lists. In a production environment you would need to ignore 5 or so major protocols (TCP, UDP, ICMP, IGRP, various routing protocols such as OSPF) to cut down on false positives. Protocol numbers are assigned by IANA and are available at http://www.iana.org/assignments/protocol-numbers
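Since the rule above could not take a list of protocols, the same check can be sketched outside Snort as a small C function working from an allow-list. This is an illustrative sketch rather than a replacement for an IDS. The function name is_unusual_proto and the contents of the allow-list are my assumptions, and it assumes a buffer that starts at the IPv4 header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Protocols treated as normal; this list is an assumption and should be tuned per site. */
static const uint8_t expected_protos[] = {
        1,   /* ICMP */
        6,   /* TCP  */
        9,   /* IGP, used by IGRP */
        17,  /* UDP  */
        89   /* OSPF */
};

/* Returns 1 if the IPv4 packet carries a protocol outside the allow-list, else 0. */
int is_unusual_proto(const uint8_t *pkt, size_t len)
{
        size_t i;

        assert(pkt != NULL);

        if (len < 20 || (pkt[0] >> 4) != 4)
                return 0;               /* not a complete IPv4 header; ignore it */

        for (i = 0; i < sizeof expected_protos / sizeof expected_protos[0]; i++)
                if (pkt[9] == expected_protos[i])
                        return 0;       /* byte 9 of the header is the protocol field */

        return 1;
}
```

With this list, a packet using protocol 11 is flagged while TCP, UDP, ICMP, IGRP and OSPF traffic is ignored.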

5. Identify and explain any techniques in the binary that protect it from being analysed or reverse engineered.

The reverse engineering of the-binary was made more difficult because it was statically linked and stripped, used fork(2), and contained some redundant code.

The-binary was statically linked, i.e. it was self-contained, holding both program code and library functions. This made it difficult to differentiate library functions from original code, slowing the analysis. It also made it impossible to load alternate libraries to replace certain library calls (such as fork) during execution tracing. This may also have prevented gdb's follow-fork-mode from tracing children during execution.

The-binary had its symbol table stripped. This table normally labels each function with a name, and would have made identifying library calls very easy. I identified the library that the-binary was linked against with strings(1), and used syscalls and the libc source to identify library functions. Michal Zalewski has produced a tool named dress(1), which reconstructs the ELF symbol table of a stripped binary from function signatures. Had I known of dress earlier I would have used it to reproduce the symbol table.

Multiple fork(2) calls are used within the-binary. There are two when the-binary is first executed, and a further fork whenever a child process is required for running shell commands or generating DOS packets. Tracing through fork(2) calls is difficult, as they spawn an additional process. gdb provides follow-fork-mode for tracing through forks and choosing to follow the parent or the child; however, this did not work (probably because the-binary was statically linked). After starting the-binary, I used gdb's attach command to attach to the running process for execution tracing and debugging.

The-binary contains redundant code in some functions, where the same operation is performed twice. For example, in decode() the next byte of the decoded string is prepended using a for loop, and then an sprintf(3) is used to do exactly the same thing with the original strings. This may have been done to confuse people doing debugging, or may have been code accidentally left in during development.

6. Identify two tools in the past that have demonstrated similar functionality.

The Trinity v3 Distributed Denial of Service tool, as documented at http://www.iss.net/security_center/alerts/advise59.php, is another DDOS tool for Linux that provides a variety of flood types. Trinity v3 is controlled over IRC, so the control channel differs from that of the-binary. It was also installed with a separate backdoor, whilst the-binary contains an integrated backdoor.

Trinity v3 provides the following flood commands: udpflood, fragmentflood, synflood, rstflood, randomflagsflood, ackflood, establishflood, nullflood

Some more DDOS tools are documented at the following URLs:
http://staff.washington.edu/dittrich/misc/tfn.analysis
http://staff.washington.edu/dittrich/misc/trinoo.analysis
http://staff.washington.edu/dittrich/misc/stacheldraht.analysis

The B0CK backdoor by Vecna, as documented by SANS at http://rr.sans.org/covertchannels/covert_shells.php, provides sophisticated covert channel control of a backdoor. Commands to B0CK are encoded into the source IP of IGMP packets, with the address for responses configured using a command packet. This is similar to the way the-binary accepts packets with a forged source address, and responds to a set of addresses configured using the configure command.

The above SANS paper also contains references to some other backdoors.

Bonus 1. What kind of information can be derived about the person who developed this tool. For example, what is their skill level?

Difficult to tell which parts were customised and which are original.
3 ways provided to execute shell commands
several types of DOS available
likes csh
lots of copy/pasting of code (or inline includes)

Knowledge of networking
DNS SOA used to magnify traffic.

May be a university student, as indicated by the DNS queries for .com, .org, .net, .edu and .usc.edu. USC.edu is out of place as it is not a top level domain.

Person who deployed the tool:

# Linked with gcc 2.7.2.l.2 (.comment)
#  against The Linux C library 5.3.12 (.rodata 0x20E7C)
#  using yplib.c,v 2.6 1994/05/27 14:34:43 swen Exp (.rodata 0x21004)

Bonus 2. What advancements in tools with similar purposes can we expect in the future?

Piggyback communications over covert channels
e.g. the data field of ICMP messages (bounced off other hosts) or IP header fields, possibly with support for bouncing back the response

Better hiding from host OS
This can already be achieved using a separate rootkit to hide from ps and netstat

Better protection from reverse engineering
Encrypted binaries, and the shortcomings of Unix reversing tools, are discussed at length in Phrack 58 article 5 http://www.phrack.org/show.php?p=58&a=5
An ELF armouring program written by Scut from Teso is available at http://www.team-teso.net/releases.php

More capabilities
sniffer
distributed control (passing on commands to other servers)
network scanning and reporting of summarised results
auto rooter for distribution