Honeynet Project Scan of the Month Challenge 22

Author

Will Dyke

Overview

This page describes the general methodologies I used to determine exactly what happened in the attack presented in the Snort log file.

I used a number of different methods in the process, including log analysis and reverse engineering techniques such as disassembly and debugging.

Tools used

Windows 2000 Professional

Suse 8.0

Used to run foo (after first disconnecting it from the rest of the network!)

SPARCstation LX, Solaris 2.6

Used to communicate with foo, and analyse network traffic.

Attacker behaviour

To look at the behaviour on IP protocol 11, I initially attempted to get decoder.c, provided on the challenge page, to work. However, decoder.c requires libpcap, which I had trouble installing under Cygwin. Instead, I used the decoder Perl script written by Dion Mendel for the Reverse Challenge project. The script expects to find an executable called tcpdump; under Cygwin this can be arranged by creating a symbolic link to windump.

$ bin/decoder snort-0718\@1401.log > decode.log

Of all the commands executed, the fourth from last is the most interesting. Breaking that command line down into its separate fragments, we get:

To understand what the attacker is trying to achieve, it is therefore necessary to understand what foo/ttserve (henceforth referred to as foo) does.

Analysing foo

To obtain foo, I used Ethereal's Follow TCP Stream tool on packet 73, which provided the entire HTTP communication between the honeypot machine and the webserver.

The request headers are provided here.

GET /foo HTTP/1.0
Host: 216.242.103.2:8882
Accept: text/html, text/plain, audio/mod, image/*, video/*, video/mpeg, application/pgp, application/pgp, application/pdf, message/partial, message/external-body, application/postscript, x-be2, application/andrew-inset, text/richtext, text/enriched
Accept: x-sun-attachment, audio-file, postscript-file, default, mail-file, sun-deskset-message, application/x-metamail-patch, text/sgml, */*;q=0.01
Accept-Encoding: gzip, compress
Accept-Language: en
User-Agent: Lynx/2.8.3dev.18 libwww-FM/2.14

I then saved the result, opened it in a text editor, snipped the request and response headers, and stored what remained as foo for analysis.
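The header snipping can also be done programmatically. The following is a minimal sketch, not the procedure actually used here: it assumes the saved stream keeps the original CRLF line endings, and the function name skip_http_headers is hypothetical. Because the saved stream holds the request headers followed by the response headers, the function is meant to be called twice, the second time starting from the offset returned by the first.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return the offset of the first byte after the next
 * blank line (CR LF CR LF) found at or after `from`, or -1 if there is none.
 * memcmp is used rather than strstr because the body is binary data. */
long skip_http_headers(const unsigned char *buf, size_t len, size_t from)
{
    size_t i;

    assert(buf != NULL);
    for (i = from; i + 4 <= len; i++) {
        if (memcmp(buf + i, "\r\n\r\n", 4) == 0)
            return (long)(i + 4);
    }
    return -1;
}
```

Calling it once from offset 0 skips the request headers; calling it again from the returned offset skips the response headers, and everything from the second result onward would be written out as foo.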

Essentially, I followed the procedure described by Dion Mendel, the winner of the Reverse Challenge. I describe the technicalities of the reverse engineering separately. The results of the reverse engineering can be seen in an attempt to recreate the C code.

The first thing that foo does is to set its name to '(nfsiod)', to disguise itself in process listings. The sequence of function calls

fork(); setsid(); setuid(); seteuid(); fork(); chdir("/");

is a clear indication that it then becomes a daemon.
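For readers unfamiliar with the pattern, the call sequence above matches the usual double-fork way of detaching a process. The following is a minimal sketch of that pattern only, not the decompiled code: the name daemonize_sketch and the report_fd parameter are hypothetical, the setuid()/seteuid() calls and the process renaming are deliberately left out, and the detached process merely writes its working directory to the supplied descriptor so that the effect can be observed.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: detach a grandchild process that reports its working
 * directory on report_fd. Returns 0 in the caller on success, -1 on error. */
int daemonize_sketch(int report_fd)
{
    char cwd[256];
    int status;
    pid_t pid;

    assert(report_fd >= 0);

    pid = fork();                     /* first fork: the caller keeps running */
    if (pid < 0)
        return -1;
    if (pid > 0) {
        if (waitpid(pid, &status, 0) < 0)   /* reap the intermediate child */
            return -1;
        return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
    }

    if (setsid() < 0)                 /* new session, no controlling terminal */
        _exit(1);

    pid = fork();                     /* second fork: no longer session leader */
    if (pid < 0)
        _exit(1);
    if (pid > 0)
        _exit(0);                     /* intermediate child exits */

    if (chdir("/") < 0)               /* avoid pinning a mounted filesystem */
        _exit(1);

    if (getcwd(cwd, sizeof cwd - 1) != NULL) {
        size_t n = strlen(cwd);
        cwd[n] = '\n';
        if (write(report_fd, cwd, n + 1) < 0)
            _exit(1);
    }
    _exit(0);
}
```

The grandchild is re-parented to init once the intermediate child exits, which is why such a process shows up detached from any terminal in a process listing.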

Going back to the packet log, in packet 529 a UDP communication is made from the honeypot server to "11.11.11.11" (spoofed, naturally - the actual address is 216.242.103.2) on port 53413 with data "GU\n". This is presumably foo "phoning home", saying "I'm ready", as it is immediately followed by the response "DU9207100". These are the only non-DNS UDP messages in the captured session.

In the part of the code referring to DU (lines 080485ce onward), it is clear that on receiving a DU, an iosscanf("%lu", &var) is performed - this obviously reads in the remainder of the command (the 9207100 part).

This command results in the next pattern of activity in the packet logs, which is a sequence of requests to web.icq.com, essentially getting the profile pages of all users with user ids in the range 9207100 - 9207199.

From these two pieces of evidence, the instruction DU9207100 appears to be a command to fetch the pages of 100 user ids, starting from 9207100.
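The inferred meaning of the command can be written down as a short sketch. This is an illustration of the interpretation above, not a reconstruction of foo's code: the names parse_du, du_range and DU_BATCH are hypothetical, and the batch size of 100 is an assumption taken from the 9207100 - 9207199 range seen in the packet log.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Assumption: batch size inferred from the range observed in the log. */
#define DU_BATCH 100

/* Parse a "DU<number>" message. Returns 0 and stores the starting id in
 * *start on success, -1 if the message is not a well-formed DU command. */
int parse_du(const char *msg, unsigned long *start)
{
    assert(msg != NULL && start != NULL);
    if (strncmp(msg, "DU", 2) != 0)
        return -1;
    if (sscanf(msg + 2, "%lu", start) != 1)   /* same conversion as in foo */
        return -1;
    return 0;
}

/* Fill ids[] with up to DU_BATCH consecutive ids beginning at start.
 * Returns the number of ids written (never more than cap). */
size_t du_range(unsigned long start, unsigned long *ids, size_t cap)
{
    size_t n = 0;

    assert(ids != NULL);
    while (n < DU_BATCH && n < cap) {
        ids[n] = start + n;
        n++;
    }
    return n;
}
```

With the message from the log, parse_du yields 9207100 and du_range produces the ids 9207100 through 9207199, which matches the sequence of requests to web.icq.com.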

While it is difficult to determine from the raw code what the attacker is trying to achieve, I had two thoughts - a distributed denial of service, or email address harvesting. While investigating further, I recalled I'd seen a large block of code, with lots of compares to small constants (starting around 080487d4). Not wishing to go through every compare by hand, I wrote some C to substitute constants in the range 0x20-0x7f with their ASCII equivalents. The results are below:

$ bin/chartest dump4 > chartest.res

The lines from 0804885c to 080488d4 (cmpb against m, a, i, l, t, o and : in sequence) strongly suggest to me that this is an email address harvester.
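The constant substitution described above can be sketched as follows. This is not the chartest source, only a minimal illustration under stated assumptions: the disassembly is taken to use AT&T-style immediates of the form $0xNN, the function name substitute_constants is hypothetical, and the upper bound is 0x7e rather than 0x7f because 0x7f (DEL) has no printable form.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copy `in` to `out`, replacing every "$0xNN" immediate
 * whose value is a printable ASCII code (0x20-0x7e) with "$'c'".
 * Returns the number of immediates replaced. */
int substitute_constants(const char *in, char *out, size_t outsz)
{
    int replaced = 0;
    size_t o = 0;

    assert(in != NULL && out != NULL && outsz > 0);

    while (*in != '\0' && o + 1 < outsz) {
        if (strncmp(in, "$0x", 3) == 0) {
            char *end;
            unsigned long v = strtoul(in + 3, &end, 16);

            /* Only small constants are characters; addresses are left alone. */
            if (end != in + 3 && v >= 0x20 && v <= 0x7e && o + 5 < outsz) {
                o += (size_t)snprintf(out + o, outsz - o, "$'%c'", (int)v);
                in = end;
                replaced++;
                continue;
            }
        }
        out[o++] = *in++;
    }
    out[o] = '\0';
    return replaced;
}
```

Applied line by line to a disassembly listing, a compare such as cmpb $0x6d becomes cmpb $'m', which makes a run of character comparisons readable at a glance.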

Active analysis of foo

Once I had exhausted my hand decompilation abilities, I moved on to active reverse engineering by running foo. As by now I understood what all the system calls did, I was reasonably happy to run foo, with two precautions. I used hexedit to change the UDP connection address to one of my local machines (a Sun SPARCstation LX), and I disconnected my hub from the external network. This allowed me to sniff the local network without any accidental external leaks.

I tracked the behaviour of foo by using strace, a program that's standard on Linux boxes (and probably some others too). This allowed me to check for things I'd missed in the decompilation - one of which was the port allocation in the sendto call in open_udp.

At the other end, I created a connection handler to receive UDP datagrams and reply with user-defined responses.
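A handler of this kind needs very little code. The following is a minimal sketch for a current POSIX system, not the handler actually used on the Solaris machine: the names udp_listen and udp_answer_once are hypothetical, and the reply string is whatever the analyst chooses to send back.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: open a UDP socket bound to addr:port.
 * A port of 0 lets the kernel pick one. Returns the descriptor or -1. */
int udp_listen(const char *addr, unsigned short port)
{
    struct sockaddr_in sa;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd < 0)
        return -1;
    memset(&sa, 0, sizeof sa);
    sa.sin_family = AF_INET;
    sa.sin_port = htons(port);
    if (inet_pton(AF_INET, addr, &sa.sin_addr) != 1 ||
        bind(fd, (struct sockaddr *)&sa, sizeof sa) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical helper: receive one datagram into req (NUL-terminated) and
 * send `reply` back to its sender. Returns bytes received, or -1 on error. */
ssize_t udp_answer_once(int fd, char *req, size_t reqsz, const char *reply)
{
    struct sockaddr_in peer;
    socklen_t plen = sizeof peer;
    ssize_t n;

    assert(req != NULL && reqsz > 0 && reply != NULL);
    n = recvfrom(fd, req, reqsz - 1, 0, (struct sockaddr *)&peer, &plen);
    if (n < 0)
        return -1;
    req[n] = '\0';
    if (sendto(fd, reply, strlen(reply), 0,
               (struct sockaddr *)&peer, plen) < 0)
        return -1;
    return n;
}
```

Bound to port 53413 on the isolated machine and called in a loop, such a handler could answer a "GU\n" datagram with a reply like "DU9207100" while the request is logged for inspection.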

I was then able to use strace of foo at one end, and snoop the network while running the handler at the other end.

Unfortunately, I was able to do little with the handler program other than confirm that on receipt of a DU request, successive attempts to contact web.icq.com (which was unreachable, as I'd disconnected the network) were made.

Acknowledgements

Without the Reverse Challenge, it would have been significantly harder for me to have even started reverse engineering foo. In particular, the analysis of the-binary by Dion Mendel formed the basis of the reverse engineering here, and the tools he provided were a valuable asset.