The first step in the analysis was to establish safe containment and perform an initial review of the binary's capabilities. An existing VMware 'guest' virtual machine was copied / archived to retain the ability to compare persistent data against a known starting point. The guest VM was initially set up with no network connectivity at all, and the-binary was run from a root command line. Diagnostics follow:
ps output:
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 10527 0.0 0.0 244 72 ? S 09:28 0:00 [mingetty]

netstat -an output:
Proto Recv-Q Send-Q Local Address Foreign Address State
raw 0 0 0.0.0.0:11 0.0.0.0:* 7

lsof output:
malcode 31923 root cwd DIR 8,2 1024 2 /
malcode 31923 root rtd DIR 8,2 1024 2 /
malcode 31923 root txt REG 8,2 205108 30258 /root/malcode
malcode 31923 root 0u raw 1673 00000000:000B->00000000:0000 st=07
grep 31932 root 1w REG 8,2 0 30259 /root/malcode.notes
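The raw-socket entry in the lsof output and the netstat line describe the same endpoint: the hexadecimal pair 00000000:000B corresponds to 0.0.0.0:11. A minimal sketch of that conversion follows. The function name `parse_raw_endpoint` is hypothetical, and the address is assumed to be printed most significant byte first (the all-zero address here reads the same either way).

```python
def parse_raw_endpoint(field):
    """Convert an lsof-style 'ADDRHEX:NUMHEX' field to (dotted address, number)."""
    addr_hex, num_hex = field.split(":")
    addr = int(addr_hex, 16)
    # Assumption: the 32-bit address is written most significant byte first.
    dotted = ".".join(str((addr >> shift) & 0xFF) for shift in (24, 16, 8, 0))
    return dotted, int(num_hex, 16)


# The endpoint pair exactly as it appears in the lsof output above.
local, remote = "00000000:000B->00000000:0000".split("->")
print(parse_raw_endpoint(local))   # ('0.0.0.0', 11)
print(parse_raw_endpoint(remote))  # ('0.0.0.0', 0)
```

For a raw socket the number after the colon is the IP protocol number, not a TCP or UDP port, which is why netstat reports it as 0.0.0.0:11.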
I started the analysis by using simple tools to inspect the binary, then delved progressively deeper into its behavior and design. The 'strings' command was used to look for the basic elements and to inspect for behavior keys (see bin.strings.txt). The following strings immediately stood out:
malcode / trojan keys:
/tmp/.hj237349
/bin/csh -f -c "%s" 1> %s 2>&1
TfOjG
*nazgul*
/bin/csh -f -c "%s"

generic code keys:
@(#) The Linux C library 5.3.12
nospoof
RESOLV_SPOOF_CHECK
gethostbyaddr: %s != %u.%u.%u.%u, possible spoof attempt
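For readers without the 'strings' utility at hand, the same kind of scan can be approximated in a few lines. This is only a rough sketch of what 'strings' does (runs of printable ASCII of a minimum length), not a replacement for it. The function name `extract_strings` and the sample byte blob are made up for illustration; only the three marker strings inside the blob come from the list above.

```python
import re


def extract_strings(data, min_len=4):
    """Return runs of printable ASCII at least min_len bytes long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]


# Hypothetical blob: non-printable filler around three of the strings above.
sample = (
    b"\x00\x01/bin/csh -f -c \"%s\" 1> %s 2>&1\x00\x7f\x02"
    b"TfOjG\x00ab\x00*nazgul*\x00"
)

for text in extract_strings(sample):
    print(text)
```

Against the real file, the input would be the bytes read from the binary in 'rb' mode.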
Because the test system was running a minimal distribution which did not include the C shell, I built and installed tcsh from source and ran the code again to establish whether its use of the shell was intended to further compromise the operating system. This seemed not to be the case. I ran MD5 checksums of the system before and after each early test, checking that 'md5sum' and some other critical binaries were not touched / modified / accessed by the binary. From this point on I mostly narrowed the focus of the analysis to the binary's operation and network traffic.
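The before-and-after checksum comparison can be sketched as follows. This is an illustration of the idea and not the procedure actually used, which relied on 'md5sum' itself. The helper names `md5_of`, `snapshot` and `changed` are hypothetical, and the list of files to watch is left to the caller.

```python
import hashlib
import os


def md5_of(path):
    """MD5 hex digest of one file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def snapshot(paths):
    """Map each existing regular file in paths to its MD5 digest."""
    return {p: md5_of(p) for p in paths if os.path.isfile(p)}


def changed(before, after):
    """Paths whose digest differs, or which appeared or vanished."""
    every_path = before.keys() | after.keys()
    return sorted(p for p in every_path if before.get(p) != after.get(p))
```

A run would take one snapshot of the chosen files before starting the binary, a second one afterwards, and pass both to `changed`. Note that a checksum only reveals modification; it cannot show that a file was merely read.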
After establishing that the analysis would require network connectivity, I adjusted the VMware settings first to 'host-only' network access and eventually to a virtual 'NAT' network interface, allowing the binary to see the Internet through a controlled interface. For most of the testing the host OS was configured using iptables to disallow traffic on the binary's control channel, protocol 11.
I spent some time working with the disassembled code and obtained a feel for how the binary operated. At this point I set aside the project until the challenge released the network data for testing, expecting that it would be easier to use valid network traffic to analyse the operational behavior than to undertake a full-scale code analysis.
The release of the second set of binary data coincided with the release of 'fenris', a tracing and debugging tool which included a useful set of notes on the operation of the binary.
I replicated Michal Zalewski's results, but chose to rely mostly on 'strace', which was able to trace the fork()-ed child processes, report all of the system calls and, critically, dump the data read from and written to files and the communication sockets.
Using strace I was able to obtain a fairly thorough picture of how the code operates. Working on the sandboxed host a typical command would be:
strace -ff -o cmd_02 -e read=0,1,2,3,4 -e write=0,1,2,3,4 ./the-binary
I also forced the binary to core-dump at numerous points along the way, using 'xdump' and diff to generate comparison information on the state of the code in different execution runs.
It is likely that this malware was developed as an addition to the 'nc' (netcat) source. The strings extracted match many of the strings which can be extracted from a current 'nc' binary, including 'warn', 'nowarn' and 'spoof'. It seems unlikely that the malware was attached to a copy of 'nc' as part of the hostile injection, as the significant strings are buried in the midst of the code.
sendip -p ipv4 -is 192.168.1.1 -ip 11 -ii 4095 -f
tcpdump -s 1500 -w addrs.dat -i vmnet8 'host 192.168.1.13 and port not 22' &
This records the network traffic between the binary and its controlling station, as well as the result data it reports. I knew, both from the string data extracted from the binary and from the 'strace' data, that the code uses /bin/csh to execute the command injected over its control channel. It was therefore a simple matter to remove /bin/csh, which was only a symlink to /usr/bin/tcsh, and replace it with the following script:
#!/bin/sh
# Invoked as: /bin/csh -f -c "<command>", so $3 carries the command string.
echo $$ $* >> /root/.log
exec $3 $4 $5
This allows the plaintext to be extracted with 'tail -f /root/.log'. Using this approach to watch the binary's own decode output I created the following new data sets (hex-dump):
Original: decodes to "rpcinfo -p 127.0.0.1"
00000000 02001731 ba41bb3b c03dc3fa 3ec5fc44 ...1.A.;.=..>..D
00000010 8ddb2067 acf33880 97aec5dc f30a2138 .. g..8.......!8
00000020 4f667d94 abc2d9f0 071e354c 637a91a8 Of}.......5Lcz..
Trial: decodes to: "aaaaaaaaaaaaaaaaaaaaaa"
00000000 02001731 a9219911 890179f1 69e159d1 ...1.!....y.i.Y.
00000010 49c139b1 29a11991 09819861 78787878 I.9.)......axxxx
Attack: decodes to: "cat /etc/shadow"
00000000 02001731 ab23aee5 2ba732ac f27cfb73 ...1.#..+.2..|.s
00000010 ee740239 acf33880 97787878 78787878 .t.9..8..xxxxxxx
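The three dumps above are enough to recover the plaintext without running the binary. In all of them, each plaintext byte equals the encoded byte minus the preceding encoded byte minus the value at offset 02 (0x17), modulo 256, starting at offset 04. That relation is inferred here from the dumps alone and has not been checked against the disassembly, so treat the following as a sketch; the function name `decode_payload` is hypothetical.

```python
def decode_payload(packet):
    """Decode bytes from offset 4 onward using the running-difference rule."""
    diff = packet[2]  # 0x17 in every dump shown above
    plain = bytearray()
    for i in range(4, len(packet)):
        plain.append((packet[i] - packet[i - 1] - diff) & 0xFF)
    return bytes(plain)


# The "Original" dump, copied from the hex listing above.
ORIGINAL = bytes.fromhex(
    "02001731ba41bb3bc03dc3fa3ec5fc44"
    "8ddb2067acf3388097aec5dcf30a2138"
    "4f667d94abc2d9f0071e354c637a91a8"
)

# The command is NUL-terminated; everything after it is padding.
command = decode_payload(ORIGINAL).split(b"\x00", 1)[0]
print(command.decode("ascii"))  # rpcinfo -p 127.0.0.1
```

Under this rule a run of NUL padding shows up as encoded bytes that each increase by 0x17, which is visible in the tail of the "Original" dump (97, ae, c5, dc, ...).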
The choice of IP numbers for this reporting is, as near as I can tell, randomly selected at activation and does not change within an instance. While I find this odd (and presumably not very useful), sending exactly identical 'wakeup' and 'command' sequences resulted in different outgoing IP connections in every instance, while within an instance the same half dozen targets are repeatedly accessed. I can only conclude that the targeting is either set by a separate command, which would require a more complete disassembly of the binary to confirm, or is random.
The binary accepts and transmits data in the following form:
byte | value(s) | usage |
---|---|---|
00 | 02 or 03 | command |
01 | 00 | n/a-fill? |
02 | 00-ff | stop-difference |
03 | ?? | unknown? |
04-stop | encoded | payload/ command |
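Read in the other direction, the layout in the table suggests how a test packet such as the ones in the hex dumps earlier in this report could be assembled. The sketch below is built on assumptions: the additive rule (each encoded byte is the plaintext byte plus the previous encoded byte plus the byte at offset 02, modulo 256) is inferred from those dumps, the value 0x31 at offset 03 is simply copied from them because its meaning is unknown, and the 48-byte length matches only the excerpt shown. The name `build_packet` and its parameters are hypothetical.

```python
def build_packet(command, total_len=48, kind=0x02, diff=0x17, seed=0x31):
    """Assemble header bytes 00-03, then encode the NUL-padded command."""
    packet = bytearray([kind, 0x00, diff, seed])
    padded = command.ljust(total_len - 4, b"\x00")
    prev = seed  # assumption: byte 03 acts as the first "previous" byte
    for byte in padded:
        prev = (byte + prev + diff) & 0xFF
        packet.append(prev)
    return bytes(packet)


packet = build_packet(b"rpcinfo -p 127.0.0.1")
print(packet.hex())
```

With these defaults the output reproduces the 48 bytes of the "Original" dump, which is some evidence for the inferred rule, although it says nothing about what byte 03 is for.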
The test data provided by the challenge includes control packets sent to the binary. These seem to consist of two 'wakeup' commands followed by a shell command. The shell command will be run any time it is requested; however, the network replies are only initiated if the 'wakeups' have been sent beforehand.
root:$1$ON4/ZLzZ$nOYPr6JXSXMB2FYKZXYXl0:11765:0:99999:7:::
daemon:x:11521:0:99999:7:::
bin:x:11521:0:99999:7:::
sys:x:11521:0:99999:7:::
We can see that not all of the data from the shell command is returned (and which part is sent seems to be a moving target); however, the net effect is that the data gets out.
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
raw 0 0 0.0.0.0:255 0.0.0.0:* 7 10943/[mingetty]

This has typically occurred when sending random or invalid data to the sandboxed system while probing its operation. I tested that socket at one point by using 'sendip' to deliver data and watching the code's behavior with tcpdump. Essentially the binary does not seem to ever read any data from this socket. I presume it is either an unanalysed feature or a program bug.
raw 65056 0 0.0.0.0:255 0.0.0.0:* 7 10943/[mingetty]
Additionally, there is the open question of why the binary's output packets, which carry an encoded report of the output of the attacker's shell command, also include that output in plaintext. I also found that a part of the command packet (about 100 bytes) is duplicated in the response packet after the encoded reply section.
Copyright © 2002 FW Systems LLC, All Rights Reserved